Agent Skills in Claude Code
+ Google Veo 3.1 can generate 1-minute-long videos
Today’s top AI Highlights:
& so much more!
Read time: 3 mins
AI Tutorial
Imagine uploading a photo of your outdated kitchen and instantly getting a photorealistic rendering of what it could look like after renovation, complete with budget breakdowns, timelines, and contractor recommendations. That's exactly what we're building today.
In this tutorial, you'll create a sophisticated multi-agent home renovation planner using Google's Agent Development Kit (ADK) and Gemini 2.5 Flash Image (aka Nano Banana).
It analyzes photos of your current space, understands your style preferences from inspiration images, and generates stunning visualizations of your renovated room while keeping your budget in mind.
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
You can either build agents in code and lose non-technical collaboration, or use no-code builders and sacrifice developer control.
This AI agent builder lets you keep both.
Inkeep lets you write agents in code, push them to a visual builder, let non-technical teams edit them there, then pull those changes back into your TypeScript files without breaking anything. The framework includes everything you need to deploy agents: a TypeScript SDK for building, a React UI library for chat interfaces, support for MCP and A2A, and templates for common use cases like customer support and documentation assistants.
You can deploy through Vercel or Docker, and agents expose standard endpoints compatible with the Vercel AI SDK.
Key Highlights:
True Bidirectional Sync - The CLI uses a shared representation layer to keep the TypeScript code and the visual builder in sync, letting developers and non-technical teams work on the same workflows without manual coordination or version conflicts.
Multi-Agent Coordination - Build systems where multiple agents work together, each with specific prompts and tools, and use sub-agents for direct user transfers that avoid routing through a central coordinator.
Multiple Integration Options - Expose agents through an MCP endpoint for use in Cursor/Claude/ChatGPT, use the Vercel AI SDK-compatible API for custom UIs, or communicate with other agent systems through the A2A protocol.
Full Observability Stack - Debug with comprehensive OTEL logs and a traces UI that shows every step of agent execution, plus context fetchers for dynamic data retrieval and artifacts for structured outputs.
The Simplest Way To Create and Launch AI Agents
Imagine if ChatGPT, Zapier, and Webflow all had a baby. That's Lindy.
With Lindy, you can build AI agents and apps in minutes simply by describing what you want in plain English.
From inbound lead qualification to AI-powered customer support and full-blown apps, Lindy has hundreds of agents that are ready to work for you 24/7/365.
Stop doing repetitive tasks manually. Let Lindy automate workflows, save time, and grow your business.
You've been setting Claude's personality with claude.md and custom instructions.
Now you can give it specialized capabilities with Agent Skills, a new filesystem-based framework that lets you package your domain expertise into discoverable folders that Claude loads automatically when needed.
Think of Skills as onboarding guides for your agent: organized directories containing instructions, executable scripts, and reference materials that Claude can navigate progressively, loading only what's relevant to each specific task.
It works across Claude.ai, Claude Code, and the Claude API. There are pre-built Skills already available for document work (PowerPoint, Excel, Word, PDF) and you have full support for creating custom Skills. Instead of rebuilding the same capabilities for each use case, you can now compose specialized agents by capturing procedural knowledge once and letting Claude discover and apply it contextually.
Key Highlights:
Progressive Context Loading - Skills use a three-tier disclosure model: Claude first sees only metadata (~100 tokens), then loads the core instructions when a Skill is triggered (under 5k tokens), and finally accesses bundled resources or executes scripts only as needed. This keeps context usage lean while making the amount of content a Skill can bundle effectively unlimited.
Executable Code Integration - Skills can include Python scripts that Claude runs directly in its VM environment without loading the code into context, so operations like form validation or data extraction happen deterministically with only the output consuming tokens.
Availability - Pre-built Skills work immediately across Claude.ai and the API, while custom Skills can be created in Claude Code, uploaded via API for workspace-wide access, or added through claude.ai settings for individual use.
Structured Like Documentation - Each Skill requires a SKILL.md file with YAML frontmatter containing name and description for discovery, with the body holding procedural instructions and optional references to additional bundled files for context that's only needed in specific scenarios.
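To make the SKILL.md layout and the first two loading tiers concrete, here is a minimal Python sketch. It is not Claude's actual loader: the skill name, description, body, and the helper names `parse_skill`, `discovery_entry`, and `load_instructions` are all hypothetical, and the frontmatter parser handles only flat `key: value` lines rather than full YAML.

```python
from dataclasses import dataclass

# Hypothetical Skill file. Only the overall layout (YAML frontmatter with
# name and description, followed by an instruction body) comes from the text;
# the contents are invented for illustration.
EXAMPLE_SKILL_MD = """---
name: pdf-form-filler
description: Fill and validate PDF forms. Use when the user mentions PDF forms.
---
# PDF form filler

1. Run scripts/validate_form.py on the input file.
2. Read reference/fields.md only if a field type is unknown.
"""


@dataclass
class Skill:
    name: str
    description: str
    body: str


def parse_skill(text: str) -> Skill:
    """Split a SKILL.md string into frontmatter metadata and instruction body.

    Simplified on purpose: supports only flat `key: value` frontmatter lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter fence")
    try:
        end = lines.index("---", 1)  # closing fence of the frontmatter
    except ValueError:
        raise ValueError("frontmatter is missing its closing '---'") from None

    meta = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed frontmatter line: {line!r}")
        meta[key.strip()] = value.strip()

    for required in ("name", "description"):
        if not meta.get(required):
            raise ValueError(f"frontmatter is missing '{required}'")

    body = "\n".join(lines[end + 1:]).strip()
    return Skill(meta["name"], meta["description"], body)


def discovery_entry(skill: Skill) -> str:
    """Tier 1: the short metadata string that is always in context."""
    return f"{skill.name}: {skill.description}"


def load_instructions(skill: Skill) -> str:
    """Tier 2: the full instruction body, read only once the skill triggers."""
    return skill.body


if __name__ == "__main__":
    skill = parse_skill(EXAMPLE_SKILL_MD)
    print(discovery_entry(skill))    # cheap, always-visible summary
    print(load_instructions(skill))  # larger, loaded on demand
```

The third tier (bundled files and scripts such as the hypothetical `scripts/validate_form.py`) is deliberately left out: in the design described above, those are touched only when the instructions point to them.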
Quick Bites
Google Veo 3.1 can generate 1-minute-long videos
Three months and 275 million videos later, Google has dropped Veo 3.1, which builds upon its previous version with richer audio, more narrative control, and enhanced realism that captures true-to-life textures.
And this is a real flex - Veo 3.1 can extend clips to full minute-long shots, and create seamless transitions between any two frames. Here are the new capabilities the model brings:
Richer native audio generation with improved dialogue quality, sound effects synchronization, and better understanding of cinematic audio styles
Reference images support (up to 3) to maintain character, object, or style consistency across multiple video generations
Scene extension and frame transitions let you build minute-long sequences by extending clips or bridging two specific frames with smooth, natural transitions
Available now via Gemini API, Vertex AI, Google AI Studio, Google’s video editing platform Flow, and the Gemini app at the same pricing as Veo 3.
Can you believe we have a new SOTA in video gen every month now?
Claude Haiku 4.5 matches Sonnet 4 at 1/3 cost and 2x speed
Claude Haiku 4.5 is now available, delivering what Anthropic claims is near-frontier performance at one-third the cost. The small model runs 2x faster than Sonnet 4.5 while matching Sonnet 4 on coding benchmarks, even surpassing it on computer use tasks. The real play here is orchestration: Sonnet 4.5 creates multi-step plans while Haiku 4.5 agents execute subtasks in parallel, making agentic workflows more economically viable. It's now the default for free users and available via API at $1/$5 per million input/output tokens.
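To see what the $1/$5 pricing means for the parallel-subtask pattern described above, here is a rough cost sketch in Python. Only the two prices come from the announcement; the function names and the token counts in the example are assumptions chosen for illustration, and the planner model's own cost is not included.

```python
# Haiku 4.5 prices quoted above, in USD per million tokens.
HAIKU_INPUT_PER_MTOK = 1.00
HAIKU_OUTPUT_PER_MTOK = 5.00


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = HAIKU_INPUT_PER_MTOK,
                 output_price: float = HAIKU_OUTPUT_PER_MTOK) -> float:
    """Return the USD cost of one request at per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def fanout_cost(num_subtasks: int, input_tokens_each: int,
                output_tokens_each: int) -> float:
    """Return the USD cost of running identical subtasks on Haiku 4.5.

    Assumes every subtask uses the same number of tokens, which real
    workloads will not; treat the result as a ballpark figure.
    """
    return request_cost(num_subtasks * input_tokens_each,
                        num_subtasks * output_tokens_each)


if __name__ == "__main__":
    # Hypothetical workload: 10 subtasks, 20,000 input and 2,000 output tokens each.
    total = fanout_cost(10, 20_000, 2_000)
    print(f"Estimated Haiku 4.5 cost: ${total:.2f}")
```

With those hypothetical numbers, each subtask costs $0.02 for input plus $0.01 for output, so the ten-subtask fan-out comes to $0.30.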
Apple M5 delivers 4x the peak GPU compute
Apple just released the M5 chip, which puts a Neural Accelerator inside each of its 10 GPU cores, an architectural shift that quadruples AI compute over the M4. The real story is the 153GB/s unified memory bandwidth enabling 32GB configurations, which means running 70B+ parameter models locally is now possible on a MacBook Pro. Shipping now in the 14-inch MacBook Pro, iPad Pro, and Vision Pro.
Andrew Ng’s free course on voice AI agents with Google’s ADK
DeepLearning.AI's latest course teaches you to build real-time voice agents using Google's open-source Agent Development Kit, covering everything from basic speech-to-speech interactions to multi-agent orchestration with planners and researchers. The 75-minute course walks through practical implementations, including a podcast generator that researches topics, scripts conversations, and produces multi-speaker audio with Gemini TTS. It’s a great hands-on course to strengthen your foundation and build agents you can take to production.
Google’s full-stack platform for always-on AI on edge devices
Google Research has released Coral NPU, an open-source full-stack hardware architecture designed to run AI models on edge devices like wearables and IoT gadgets without draining batteries. The platform tackles the classic edge AI trilemma (performance limitations, fragmented tooling, and privacy concerns) through a RISC-V-based design that prioritizes ML operations over traditional CPU logic. Full specs and developer tools are live on GitHub.
Manus AI is now 4x faster and 15% better across tasks
Manus AI just dropped v1.5, built on a re-architected engine that makes everything faster and more reliable: Manus 1.5 is now 4x faster overall and 15% better across tasks. Another major enhancement in this release is full-stack web application development. Manus can create full-stack applications with persistent backends, databases, user authentication, and embedded AI capabilities from simple prompts. Manus 1.5 is available to subscribers, and a Lite version of the agent is available to all users now.
Tools of the Trade
Dexter - Open source autonomous financial research agent that thinks, plans, and learns as it works. It takes complex financial questions and turns them into clear, step-by-step research plans, runs those tasks using live market data, checks its own work, and refines the results until it has a confident, data-backed answer.
Wispbit - Linter for AI coding agents. It automatically extracts coding standards from your existing codebase and PR comments, then enforces them during code review or via CLI, using a multi-agent system. Their early users are seeing 80%+ resolution rates.
Scriber Pro - A $3.99 macOS app that transcribes audio and video files locally using AI, processing a 4.5-hour video in ~3.5 minutes. It supports common media formats and exports to multiple document and subtitle formats including SRT, VTT, PDF, DOCX, and Markdown.
Claude Code Plugins - A plugins marketplace containing 63 specialized plugins that package 85 AI agents, 15 workflow orchestrators, and 44 development tools. Each plugin loads only its specific resources (agents, commands) into context when installed, avoiding token waste. Covers everything from backend development and security scanning to ML pipelines and documentation generation.
Awesome LLM Apps - A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, Google, and open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
(Now accepting GitHub sponsorships)
Hot Takes
Text messaging with AI is the next form factor to hit 1 billion users, I have 100% conviction on this.
The fact that every AI company isn’t doing this with urgency is absurd.
kind of crazy that each LLM company has a AI router which basically classifies you based on how smart you are. you only get PhD level intelligence if you have a PhD level brain. why waste compute on a retard
~ kache
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉