Google's Gemini 3 and Agentic AI IDE
Today’s top AI Highlights:
- Manus Browser Operator, Gemini Agent to execute tasks autonomously
- & so much more!
Read time: 3 mins
AI Tutorial
Google's free 5-day AI Agents Intensive course ends today!
This course by Google's ML researchers covers everything from agent foundations to production-ready systems.
And they released 5 whitepapers (~300 pages total) that are staying up for free.
Here's what the whitepapers cover:
The 5 levels of agents: from pure reasoning to self-evolving systems
Core components: Model (the brain), Tools (the hands), Orchestration (the nervous system)
Moving from prototype to enterprise-grade systems
How tools actually work and why descriptions matter more than code
The Model Context Protocol architecture explained
Security risks nobody talks about: dynamic capability injection, tool shadowing, confused deputy problems
The difference between prompt engineering vs context engineering
Session management: conversation history + working memory
Memory as an active curation system, not just "save the conversation"
The four pillars: Effectiveness, Efficiency, Robustness, Safety
Process evaluation: judging reasoning, not just outputs
Building agents that learn from production failures
Evaluation gates, circuit breakers, and evolution loops
Turning demos into production systems
Real-time monitoring and continuous evaluation
The best part?
All whitepapers are 100% free and packed with zero fluff.

Latest Developments
The latest frontier model from Google just landed, and we are all getting access to capabilities that weren't possible six months ago. Gemini 3 comes with advanced reasoning, agentic capabilities, and the best multimodal understanding to help you learn, build, and plan anything.
Gemini 3 scores higher than previous models on every major benchmark, but what matters for developers is how it handles real work - autonomous coding, complex multimodal reasoning, and agentic workflows. We have been extensively testing the model ourselves and can confidently say that it’s the best model to date for multimodal understanding and vibe coding. Here’s a thread on how we experimented with it. See the results for yourself!
It's available now in preview via the Gemini API at $2 per million input tokens and $12 per million output tokens for prompts under 200k tokens, with free access (rate-limited) in Google AI Studio, Vertex AI, Gemini app, Gemini CLI, and other products in the Google ecosystem.
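At those rates, a quick back-of-the-envelope cost check is easy to do. A minimal sketch (the per-token prices come from the announcement; the example token counts are made up):

```python
# Gemini 3 Pro preview pricing for prompts under 200k tokens (per the announcement)
INPUT_PER_M = 2.00    # USD per million input tokens
OUTPUT_PER_M = 12.00  # USD per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one API call at the published preview rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a 50k-token prompt that returns a 2k-token answer
print(round(request_cost(50_000, 2_000), 4))  # -> 0.124
```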
Key Highlights:
LMArena Leaderboard - Tops the LMArena Leaderboard with a breakthrough score of 1501 Elo, the highest ranking achieved by any model. Also leads WebDev Arena with 1487 Elo for web development tasks.
SOTA Reasoning, Math, and Science - Gemini 3 outperforms all frontier models on the toughest benchmarks, including Humanity’s Last Exam (45.8% with tools), ARC-AGI-2, GPQA, AIME 2025, and more.
Multimodal Understanding - The model is the best-in-class for understanding complex documents beyond basic OCR, videos with high-frame-rate and long-context recall, and spatial reasoning for screen understanding, trajectory prediction, and embodied AI tasks.
Vibe Coding, Agentic Tool Use, and Long-term Planning - We have officially entered the era of vibe coding, and Gemini 3 unlocks its true potential. It achieves the highest scores on Terminal-Bench 2.0, SWE-Bench Verified for real-world software engineering, τ²-bench for agentic tool use, and a $5,478.16 profit on Vending-Bench 2 for long-horizon planning tasks.
Gemini 3 Developer Guide - The team has written a new Developer Guide, including all new API features, migration strategies, and technical details for building with Gemini 3 Pro preview. Learn how to use the thinking_level parameter for controlling reasoning depth, media_resolution control for images, PDFs, and video, Structured Outputs, built-in tools, and more.
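As a rough sketch of what a request using these knobs might look like: the parameter names (thinking_level, media_resolution) come from the Developer Guide, but their exact placement in the request schema below is an assumption - check the guide for the authoritative shape.

```python
import json

# Hypothetical generateContent request body for the Gemini API.
# Field placement inside generationConfig is an assumption for illustration.
body = {
    "contents": [{"parts": [{"text": "Summarize this PDF in three bullets."}]}],
    "generationConfig": {
        "thinkingConfig": {"thinkingLevel": "high"},  # reasoning depth
        "mediaResolution": "MEDIA_RESOLUTION_HIGH",   # image/PDF/video detail
        "responseMimeType": "application/json",       # structured outputs
    },
}
print(json.dumps(body, indent=2))
```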
You can see this model come to life in Build in AI Studio, the Gemini app, Vertex AI, Gemini CLI, Android Studio, and other coding products like Cursor, GitHub, JetBrains, Manus, Cline, and more.
And it’s not over yet! Pushing the intelligence even further is Gemini 3 Deep Think, which improves on Gemini 3 Pro’s already impressive scores on HLE (41.0% without tool use) and GPQA Diamond (93.8%). It also achieves an unprecedented 45.1% on ARC-AGI-2 (with code execution, ARC Prize Verified). Ultra users can expect access to Deep Think soon.
WhatsApp Business Calls, Now in Synthflow
Billions of customers already use WhatsApp to reach businesses they trust. But here’s the gap: 65% still prefer voice for urgent issues, while 40% of calls go unanswered — costing $100–$200 in lost revenue each time. That’s trust and revenue walking out the door.
With Synthflow, Voice AI Agents can now answer WhatsApp calls directly, combining support, booking, routing, and follow-ups in one conversation.
It’s not just answering calls — it’s protecting revenue and trust where your customers already are.
One channel, zero missed calls.
Google has just launched its new AI IDE, Antigravity, that feels like mission control for multiple AI agents. Powered by Gemini 3, it's built agent-first and fundamentally rethinks how developers should work with AI agents to build software: you are the main architect, orchestrating and observing multiple agents working across different workspaces simultaneously.
The text editor part comes with the standard yet powerful auto-complete and inline instructions via Tab and Command.
The platform treats agents as first-class citizens rather than embedded assistants, enabling you to assign tasks and interact asynchronously while agents handle multi-step workflows that include writing code, launching localhost, and testing features in the browser.
Antigravity is available now in public preview at no charge, with support for macOS, Linux, and Windows, plus access to Gemini 3 Pro, Claude Sonnet 4.5, and OpenAI's GPT-OSS models.
Key Highlights:
Agent Manager Surface - This is a purpose-built interface for spawning and managing multiple agents across workspaces in parallel. It has an inbox system for progress notifications and a chat interface to start a conversation and explore ideas instantly. It supports seamless handoffs between the Manager and Editor views for both synchronous and asynchronous workflows.
Autonomous Browser - Agents autonomously use the browser to test features, interact with dashboards, perform source control actions, and validate UI changes.
Task-Level Artifacts - Agents communicate through tangible deliverables like implementation plans, task lists, walkthroughs, screenshots, and browser recordings instead of raw tool calls, making it significantly easier to validate agent work and provide feedback without stopping execution.
Built-in Knowledge Management - The platform maintains a knowledge base that agents both retrieve from and contribute to, enabling them to learn from past work including code snippets, architectural patterns, and successful task completion strategies across your projects.
Set Up and Get Started - The setup is pretty quick. You can either import your settings from VS Code or start afresh. Log in with your Google account and you’re good to go! The UI will be very familiar to VS Code users, so there’s practically no learning curve.
Quick Bites
Dynamic visuals and generative UI in AI Mode with Gemini 3
Google's rolling out Gemini 3 in Search from day one, starting with AI Mode for Pro and Ultra subscribers. The model's advanced reasoning powers a revamped query fan-out system that surfaces more relevant content. This one’s a big kicker: generative UI that creates dynamic bespoke layouts with interactive elements. Think real-time coded simulations and tools tailored to your query, whether you're exploring gravitational physics or comparing loan scenarios - all embedded directly in search responses.
Gemini Agent for executing tasks autonomously across Workspace
With the new model, the Gemini app is getting a big agentic upgrade. It now comes with an experimental Gemini Agent that handles multi-step tasks across your Workspace, like parsing flight details from Gmail to comparison-shop rental cars under your budget, then queuing the booking for approval. The team has also integrated Google’s 50-billion-product Shopping Graph directly into chat for comparisons and pricing. And lastly, a little treat from Google: a free year of Google AI Pro for U.S. college students. Go grab the offer today!
Turn any browser into an autonomous agentic AI browser
Manus just launched Browser Operator, a Chrome extension that lets their AI agent work directly in your local browser instead of a cloud sandbox. The key advantage: it can now interact with services you're already logged into - your CRM, premium research tools, authenticated platforms - without triggering login walls or CAPTCHAs since everything runs through your existing sessions and local IP. You authorize each task, watch it execute in a dedicated tab, and can kill it anytime by closing that tab. Rolling out in beta to paid users on Chrome and Edge.
Small character-level diffusion model you can run locally
If you've wanted to understand text diffusion without dealing with billion-parameter models, this implementation gets you there in minutes. Nathan Barry released a 10.7M parameter diffusion model for text generation that's small enough to run locally and comes with pre-trained weights. It's trained on Tiny Shakespeare and includes visualization scripts that show the denoising process step-by-step. Great project to explore how these models work!
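To give a flavor of that denoising process: here is a toy sketch of absorbing-state text diffusion, not Nathan Barry's implementation - where a real model would predict each character from context, this toy just samples from the corpus's character distribution.

```python
import random

def toy_char_diffusion(corpus: str, length: int = 40,
                       steps: int = 8, seed: int = 0) -> str:
    """Character-level absorbing-state diffusion in miniature:
    start from an all-MASK sequence and unmask a fraction of
    positions at each denoising step."""
    rng = random.Random(seed)
    MASK = None
    seq = [MASK] * length
    for step in range(steps):
        masked = [i for i, c in enumerate(seq) if c is MASK]
        # unmask roughly 1/(remaining steps) of the still-masked positions
        k = max(1, len(masked) // (steps - step))
        for i in rng.sample(masked, min(k, len(masked))):
            seq[i] = rng.choice(corpus)  # a real model samples from p(char | context)
    # final step: fill anything still masked
    return "".join(c if c is not MASK else rng.choice(corpus) for c in seq)

print(toy_char_diffusion("to be or not to be"))
```

The released project replaces the frequency "model" here with a trained 10.7M-parameter network, which is what makes the denoised output coherent Shakespeare instead of character soup.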
The first fully open, end-to-end recipe for Deep Research agent
AI2 just dropped Deep Research Tulu, the first fully open recipe for training long-form research agents, complete with an 8B model that actually works. The training uses "Reinforcement Learning with Evolving Rubrics" (RLER), where the reward function adapts as the model learns instead of relying on static evaluation criteria. DR Tulu-8B beats larger open models like WebThinker-32B on benchmarks like ScholarQA-CSv2 (86.7 vs 32.9) and matches proprietary systems like OpenAI Deep Research while costing about $0.00008 per query versus their $1.80.
Tools of the Trade
Git Worktree Runner by CodeRabbit - An open-source CLI wrapper around git worktrees that reduces verbose commands to simple ones, like gtr new feature, and auto-handles the annoying parts (copying .env files, running npm install, opening your editor). It's built for parallel development: review PRs while working on features, or run multiple AI coding agents on different branches simultaneously. Works with all your AI coding agents.
Epub2md - Small tool that converts epub books into folders with each chapter as a Markdown file. Makes it easy for CLI agents and LLMs to reference books on demand.
Continuous Claude - Wraps Claude Code in a loop that creates branches, opens PRs, waits for CI/CD checks and reviews, then merges. It repeats this cycle to handle multi-step tasks like test coverage expansion or refactoring, maintaining state across iterations instead of losing progress after each run.
Awesome LLM Apps - A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, Google, and open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
(Now accepting GitHub sponsorships)
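For context on the Git Worktree Runner entry above, here's a sketch of the raw worktree steps that a wrapper like gtr condenses into one command - plain git driven from Python; gtr's actual flags and behavior may differ.

```python
import subprocess
import tempfile

def run(*args, cwd=None):
    """Run a git command and return its stdout."""
    return subprocess.run(args, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout.strip()

# Set up a throwaway repo to demonstrate on
repo = tempfile.mkdtemp()
run("git", "init", "-q", cwd=repo)
run("git", "-c", "user.email=ci@example.com", "-c", "user.name=ci",
    "commit", "-q", "--allow-empty", "-m", "init", cwd=repo)

# The verbose step a wrapper like gtr hides: a sibling checkout on a new branch
wt = repo + "-feature"
run("git", "worktree", "add", "-q", "-b", "feature", wt, cwd=repo)
# (gtr would also copy .env files, run npm install, and open your editor here)
print(run("git", "branch", "--show-current", cwd=wt))  # -> feature
```

Each worktree is an independent checkout sharing one object store, which is why several agents can work on different branches of the same repo at once.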
Hot Takes
There was never any evidence that pre-training had hit a wall and there continues to be no such evidence
People are crying that Google forked Windsurf?
They paid 2.4 billion for the Windsurf IP, so what is the surprise here?
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉




