Claude AI Agent in Google Chrome
PLUS: Grok Code debuts for free, Google Gemini image goes bananas
Today’s top AI Highlights:
& so much more!
Read time: 3 mins
AI Tutorial
Building targeted B2B outreach campaigns is one of the most time-consuming aspects of sales and marketing. The challenge isn't just finding companies; it's discovering the right decision-makers, researching genuine insights, and crafting personalized messages that actually get responses.
In this tutorial, we'll build a multi-agent AI email outreach system using OpenAI GPT-5, Agno for orchestrating agents, and Exa AI for intelligent web search. This system automates the entire outreach pipeline - from company discovery to personalized email generation - delivering professional, research-backed outreach emails in minutes instead of hours.
Our multi-agent system conducts real research on each company using website content and Reddit discussions and ensures every email feels genuinely personalized.
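As a rough illustration of the pipeline described above (company discovery, contact finding, research, email writing), here is a minimal sketch in plain Python. It does not use the Agno, Exa, or OpenAI APIs; every function name, data class, and returned value below is a hypothetical stand-in, and each stage is stubbed where the real system would call a search tool or a model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Company:
    name: str
    website: str


@dataclass
class Contact:
    name: str
    title: str
    company: Company


@dataclass
class Research:
    website_insights: List[str] = field(default_factory=list)
    reddit_insights: List[str] = field(default_factory=list)


def discover_companies(query: str) -> List[Company]:
    # Hypothetical stub: a real agent would run a web search here.
    return [Company(name="Example Corp", website="https://example.com")]


def find_contacts(company: Company) -> List[Contact]:
    # Hypothetical stub: a real agent would look up decision-makers.
    return [Contact(name="Jane Doe", title="Head of Marketing", company=company)]


def research_company(company: Company) -> Research:
    # Hypothetical stub: a real agent would read the website and Reddit threads.
    return Research(
        website_insights=[f"{company.name} recently launched a new product line"],
        reddit_insights=["Users praise the onboarding experience"],
    )


def write_email(contact: Contact, research: Research) -> str:
    # Hypothetical stub: a real agent would prompt a language model with the research.
    hook = research.website_insights[0]
    return (
        f"Hi {contact.name},\n\n"
        f"I noticed that {hook}. "
        f"As {contact.title} at {contact.company.name}, you might find this relevant.\n"
    )


def run_pipeline(query: str) -> List[str]:
    # Chain the four stages: discover -> contacts -> research -> email.
    emails = []
    for company in discover_companies(query):
        research = research_company(company)
        for contact in find_contacts(company):
            emails.append(write_email(contact, research))
    return emails


if __name__ == "__main__":
    for email in run_pipeline("B2B SaaS companies in fintech"):
        print(email)
```

Keeping each stage behind its own function makes it straightforward to swap a stub for a real agent later without touching the rest of the chain.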
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
You give Claude Code a simple task and walk away, only to return 20 minutes later to find it asking "Can I edit this file?".
Here’s another one: Your Claude Code workflow is throwing a vague prompt at the wall, letting it break something, then spending more time fixing it than if you'd just coded it yourself.
A developer decided to fix this and built Async, an open-source tool that combines AI coding with task management and code review. Async integrates Claude Code + Linear + GitHub PRs into one opinionated workflow.
The tool forces planning upfront by researching your codebase and asking clarifying questions before touching any code. It runs entirely in isolated cloud environments, breaking work into reviewable subtasks with focused diffs that make code review manageable instead of overwhelming. It is built specifically for experienced developers who need AI that works on mature codebases, not just greenfield demos.
Key Highlights:
No permission theater - Handles planning, execution, and code changes without stopping to ask if it can breathe, designed for developers who want results, not conversations.
Background execution - Runs tasks in isolated cloud jobs while you work on other features, supporting real parallel development without the context switching nightmare.
Stack diff reviews - Built-in code review with focused, subtask-specific diffs where you can iterate on changes without leaving the app or losing your mind.
GitHub issue automation - Imports open issues as tasks with simple tracking that doesn't require learning another PM tool or dealing with enterprise bloat.
Google's mystery child just got a proper introduction after weeks of blowing minds in stealth mode. The cryptic "Nano Banana" that dominated AI forums and topped LMArena charts is now officially out as Gemini 2.5 Flash Image.
Gemini 2.5 Flash Image brings state-of-the-art image generation and editing capabilities to everyone, specifically keeping characters consistent throughout multiple edits.
Upload a photo of yourself, your pet, or any subject, and the model keeps them looking identical across different scenarios, backgrounds, and transformations. Blend multiple images into a single image, make targeted edits from simple prompts - the use cases are endless.
The model tops the LMArena leaderboard with an Elo score of 1362, beating the second-best model, from FLUX, by a staggering 171 points.
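To put that 171-point gap in perspective, here is a small worked calculation. It assumes the leaderboard score behaves like a standard Elo rating with the usual logistic formula and a scale of 400; that is an assumption about how to read the number, not a statement of how LMArena computes it.

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected head-to-head win rate for the higher-rated side (standard Elo formula)."""
    return 1.0 / (1.0 + 10 ** (-rating_gap / scale))


top_score = 1362          # score quoted for Gemini 2.5 Flash Image
gap = 171                 # lead over the runner-up, as quoted
runner_up = top_score - gap

win_rate = elo_expected_score(gap)

print(f"Implied runner-up score: {runner_up}")
print(f"Expected preference rate for the leader: {win_rate:.1%}")
```

Under that assumption, the implied runner-up score is 1191 and the leader would be preferred in roughly 73% of head-to-head comparisons.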
Key Highlights:
Character Consistency - Maintains identical subject appearance across multiple edits and transformations, solving the biggest pain point in AI image generation.
Multi-Image - Seamlessly blends multiple input images and applies textures or elements from one image to completely different subjects.
World Knowledge - Leverages Gemini's broader understanding to create contextually accurate images, making it useful beyond just aesthetic generation.
Availability - The model is available for free to all users in the Gemini app. It is also available via the Gemini API and Google AI Studio for developers, and Vertex AI for enterprise. Google has also partnered with OpenRouter.ai and fal.ai to bring the model to their platforms.
Anthropic just dropped Claude for Chrome, a browser extension that puts Claude directly in your Chrome sidebar, where it can see your screen and take actions across websites.
Much like Perplexity Comet, this extension enables Claude to work directly in a side panel while you browse, seeing what you see and taking actions when you ask, maintaining full context across all browser activities. It doesn’t just chat with you - it can manage calendars, schedule meetings, draft email responses, handle routine expense reports, and navigate websites within your browser, using your accounts, on your behalf.
The extension is being rolled out with controlled testing where trusted users can instruct Claude to take actions on their behalf within the browser. It is currently in a pilot with 1,000 Max plan users, and you can join the waitlist too.
Key Highlights:
Sidebar-based - Claude appears in Chrome's side panel where it stays visible while you browse, able to read, click, and navigate websites alongside your regular browsing.
Transparency - This one’s where Anthropic shone this time: the team didn’t shy away from admitting that these browser AI agents are prone to prompt injection attacks, and in internal testing, the attack success rate was over 23%.
Granular access controls - To control these attacks, the extension uses a permission system where users can approve individual actions or grant ongoing access to specific websites, with built-in blocks for sensitive categories.
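The permission model described above (approve individual actions, grant ongoing access to specific sites, block sensitive categories outright) can be pictured with a tiny decision function. This is only a conceptual sketch, not Anthropic's implementation; the class, the category label, and the example hostnames are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set
from urllib.parse import urlparse

# Hypothetical category label; the source does not list the blocked categories.
BLOCKED_CATEGORIES = {"financial"}


@dataclass
class PermissionPolicy:
    # Sites the user has granted ongoing access to.
    always_allowed_sites: Set[str] = field(default_factory=set)
    # Hypothetical mapping of hostname -> category label.
    site_categories: Dict[str, str] = field(default_factory=dict)

    def decide(self, url: str) -> str:
        """Return 'block', 'allow', or 'ask' for an action on the given URL."""
        host = urlparse(url).hostname or ""
        # Built-in blocks take priority over anything the user has granted.
        if self.site_categories.get(host) in BLOCKED_CATEGORIES:
            return "block"
        # Ongoing access: no prompt needed for this site.
        if host in self.always_allowed_sites:
            return "allow"
        # Default: ask the user to approve this individual action.
        return "ask"


policy = PermissionPolicy(
    always_allowed_sites={"calendar.example.com"},
    site_categories={"bank.example.com": "financial"},
)

print(policy.decide("https://calendar.example.com/week"))
print(policy.decide("https://bank.example.com/transfer"))
print(policy.decide("https://news.example.com/article"))
```

Checking the block list before the allow list is a deliberate choice in this sketch: a user grant should not be able to override a built-in block.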
If you’d like to take part in the pilot, you can join the Claude for Chrome research preview waitlist at claude.ai/chrome. Once you have access, you can install the extension from the Chrome Web Store and authenticate with your Claude credentials.
Quick Bites
Grok Code debuts on Cursor, GitHub Copilot, and Opencode
xAI quietly launched Grok Code Fast 1, a blazing-fast reasoning model optimized for coding that was initially deployed under the codename "Sonic". There’s still no official announcement from xAI, but the model is available on GitHub Copilot, Cursor, Opencode, and OpenRouter for free until early September. We have been testing the model too, and it’s actually fast and good! (Demo coming soon on X) It runs at an impressive 92 tokens per second and shows visible reasoning traces that help you steer its outputs. Here are some more model details:
Context window: 256K tokens, max output: 10K tokens
Supports function calling, structured outputs, and handles large codebases effectively
Works best with detailed prompts and excels at refactoring multi-threaded systems
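For a feel of what those numbers mean in practice, here is a quick back-of-the-envelope calculation. It assumes "256K" means 256,000 tokens, that output tokens count against the same context window, and that the quoted 92 tokens per second holds steady for a full response; none of these is confirmed by the source.

```python
CONTEXT_WINDOW = 256_000    # assumed reading of "256K tokens"
MAX_OUTPUT = 10_000         # assumed reading of "10K tokens"
TOKENS_PER_SECOND = 92      # throughput quoted above


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Tokens left for the prompt if the full output budget is reserved."""
    return CONTEXT_WINDOW - reserved_output


def generation_seconds(output_tokens: int) -> float:
    """Time to stream a response at the quoted throughput."""
    return output_tokens / TOKENS_PER_SECOND


print(f"Room for prompt and code context: {max_prompt_tokens():,} tokens")
print(f"Time for a maximum-length response: {generation_seconds(MAX_OUTPUT):.0f} s")
```

Under these assumptions, that leaves about 246,000 tokens for the prompt, and a maximum-length response takes a little under two minutes.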
Open-source Model to Generate Videos from an Image and Audio
Alibaba Wan just dropped Wan2.2-S2V-14B, their latest video generation model that transforms single images and audio inputs into cinematic-quality videos with synchronized lip movements, character consistency, and natural expressions. The 14B parameter model excels at generating film-quality dialogue scenes, singing performances, and character animations, outperforming existing models and methods across key metrics. Available under Apache 2.0 license with full inference code and model weights on Hugging Face and Modelscope.
Microsoft’s open-source TTS to generate 90-minute-long audio
Microsoft has open-sourced VibeVoice, a TTS model that handles what most speech synthesis systems struggle with: generating up to 90 minutes of multi-speaker conversational audio with natural turn-taking and speaker consistency. Available in 1.5B and 7B parameter variants, it uses a hybrid approach combining LLM dialogue understanding with diffusion-based acoustic generation. Perfect timing for anyone building podcast generators or long-form audio apps.
Tools of the Trade
ContextForge MCP Gateway: A feature-rich gateway, proxy, and MCP Registry that federates MCP and REST services - unifying discovery, auth, rate-limiting, observability, virtual servers, multi-transport protocols, and an optional Admin UI into one clean endpoint for your AI clients.
Sync: Syncs lip movements of any character in any video to match any audio, whether you’re working on movies, podcasts, games, or animations, with just an API. It is extremely natural, preserving the character’s expressions and unique features accurately without any training or fine-tuning.
Markdown UI: Transforms standard Markdown text into interactive user interfaces with clickable elements, forms, and submission capabilities. Copy-paste from and into an LLM prompt and use it instantly. Works in React, Svelte, and Vue, wherever Markdown is used.
Smooth: AI browser agent API that’s 5x faster and 7x cheaper than Browser Use. It gets there by restructuring how webpage context is fed to LLMs and by using code-based actions instead of traditional tool calling. Run a task in just 4 lines of code, making it easy to integrate into your workflow.
Awesome LLM Apps: A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, Google, and open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
(Accepting GitHub sponsorships now)
Hot Takes
I miss o3
It was a great model, and excellent as a default chatbot. GPT-5 just seems too impartial and unopinionated. o3 had opinions and was unafraid of speaking its mind ~ Aarush Sah
why did they call it waymo and not google drive ~ tweet davidson
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉