
Google's Asynchronous AI Coding Agent

PLUS: Opensource AI Agents SDK by AWS, Web browsing agent by Google

Today’s top AI Highlights:

  1. Build and run AI agents in a few lines of code with AWS’s opensource SDK

  2. Google releases asynchronous AI coding agent Jules for free

  3. Google’s web browsing agent Project Mariner now in API

  4. AI Mode is Google’s loudest reply to the Perplexity buzz

  5. New Gemini 2.5 Flash, Deep Think, and Gemini Diffusion

& so much more!

Read time: 3 mins

AI Tutorial

Building tools that truly understand your documents is hard. Most RAG implementations just retrieve similar text chunks without actually reasoning about them, leading to shallow responses. The real solution lies in creating a system that can process documents, search the web when needed, and deliver thoughtful analysis. Moreover, running the pipeline locally would reduce latency and ensure privacy and control over sensitive data.

In this tutorial, we'll build a powerful Local RAG Reasoning Agent that runs entirely on your own machine, with web search fallback when document knowledge is insufficient. You'll be able to choose between multiple state-of-the-art opensource models like Qwen 3, Gemma 3, and DeepSeek R1 to power your system.

This hybrid setup combines document processing, vector search, and web search capabilities to deliver thoughtful, context-aware responses without cloud dependencies.
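
To make the flow concrete, here is a minimal sketch of the hybrid retrieval logic. The library choices (chromadb for vector search, duckduckgo_search for the web fallback, and the ollama client for local inference) are our assumptions for illustration and may differ from the tutorial's exact stack.

```python
# Minimal sketch: answer from local docs first, fall back to web search.
# Assumes a running Ollama server with a model like qwen3 pulled, and
# pip install chromadb duckduckgo-search ollama.
import chromadb
import ollama
from duckduckgo_search import DDGS

client = chromadb.Client()
docs = client.create_collection("docs")
docs.add(ids=["doc-1"], documents=["<your document text here>"])

def answer(question: str) -> str:
    hits = docs.query(query_texts=[question], n_results=3)
    context = "\n".join(hits["documents"][0])
    if not context.strip():  # document knowledge insufficient: go to the web
        results = DDGS().text(question, max_results=3)
        context = "\n".join(r["body"] for r in results)
    reply = ollama.chat(model="qwen3", messages=[{
        "role": "user",
        "content": f"Answer using this context.\n\nContext:\n{context}\n\nQuestion: {question}",
    }])
    return reply["message"]["content"]

print(answer("What does the document say about latency?"))
```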

We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills, subscribe now and be the first to access our latest tutorials.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads) to support us!

Latest Developments

AWS has released Strands Agents, an opensource SDK that simplifies building AI agents with just a few lines of code. Unlike other frameworks that require you to define each part of a workflow for your agents, Strands relies on the capabilities of state-of-the-art models to plan, chain thoughts, call tools, and reflect.

Strands plans the agent’s next steps and executes tools using the advanced reasoning capabilities of models. For more complex agent use cases, you can customize your agent’s behavior, like specifying how tools are selected or how context is managed. With Strands, you can simply define a prompt and a list of tools in code to build an agent, then test it locally and deploy it to the cloud.
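
In code, that really is just a few lines. Here's a minimal sketch based on the SDK's launch examples (package and API names as published at release; double-check against the current docs):

```python
# Minimal Strands agent, per the launch examples.
# Assumes: pip install strands-agents strands-agents-tools
from strands import Agent
from strands_tools import calculator  # one of the pre-built tools

# The model plans, calls tools, and reflects; no orchestration code needed.
agent = Agent(tools=[calculator])
agent("What is 2 to the power of 12, divided by 16?")
```

At launch, the default model provider was Amazon Bedrock; passing a model argument to Agent() lets you swap providers.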

Key Highlights:

  1. Simple Architecture - Strands takes a lightweight, model-driven approach where you define three components: a model that handles reasoning and planning, tools that execute actions, and a prompt that specifies the task. This streamlined design eliminates complex orchestration code while still handling sophisticated agent tasks.

  2. Broad model and tool support - Work with models from Amazon Bedrock, Anthropic's Claude, Meta's Llama, Ollama for local development, or other providers through LiteLLM. Use 1000+ MCP servers as tools, 20+ pre-built tools, or easily convert any Python function into a tool with a simple decorator (see the sketch after this list).

  3. Advanced patterns - Implement sophisticated multi-agent systems using workflow, graph, and swarm tools. The Retrieval tool can perform semantic search for documents or find relevant tools from large collections, and the Thinking tool enables deep analytical processing when needed.

  4. Observability - Built-in observability using OpenTelemetry helps monitor agent performance in production with metrics, distributed tracing, and complete agent trajectories.

  5. Deployment options - Deploy agents as monoliths or microservices, run them locally or in the cloud, and choose architectures where tools execute in the same environment or in isolated backends. The SDK includes reference implementations for AWS Lambda, Fargate, and EC2 deployments.
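
And converting an ordinary Python function into a tool, as mentioned in highlight 2, is a one-liner. A hedged sketch in the same style as the launch examples (the function here is our own illustrative one):

```python
# The @tool decorator turns a plain function into an agent tool; its
# docstring and type hints become the tool's description and schema.
from strands import Agent, tool

@tool
def word_count(text: str) -> int:
    """Count the number of words in a piece of text."""
    return len(text.split())

agent = Agent(tools=[word_count])
agent("How many words are in 'agents all the way down'?")
```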

Google has released Jules, its autonomous coding agent, in public beta worldwide, with no waitlist and free of cost (while in beta). Jules runs inside a cloud VM and handles multiple requests in parallel, letting you focus on other work while it writes tests, builds features, and fixes bugs.

Using Gemini 2.5 Pro, Jules clones your entire codebase to understand the full context of your project before making changes or suggestions. It integrates directly with your GitHub workflow, creating branches and helping you prepare PRs once work is complete.

A side note: We loved the quirky landing page of Jules!

Key Highlights:

  1. Works with real codebases - Jules handles your actual GitHub repositories, not simplified sandboxes. It clones your code to a secure VM, installs dependencies, runs tests, and makes changes while understanding the full context of your project.

  2. Asynchronous workflow - Submit your task with a detailed prompt and move on to other work. Jules operates in parallel, handling multiple requests simultaneously in its cloud environment, then presents a complete plan with reasoning and code diffs when finished.

  3. GitHub integration - Jules imports your repositories, creates branches, shows diffs of proposed changes, and helps you create PRs directly within your existing GitHub workflow.

  4. Visibility and control - Jules shows its plan and reasoning before executing changes, provides real-time activity logs, and allows you to modify the approach before, during, and after execution through interactive feedback.

  5. Free during beta - Jules is currently free to use during the beta period with default limits of 3 concurrent tasks, 5 total tasks per day, and 5 audio codecasts per day.

Google previewed Project Mariner last December as an experimental AI agent that browses websites and completes tasks on your behalf using natural language commands, competing directly with Anthropic Computer Use, OpenAI Operator, and Amazon Nova Act.

Project Mariner now runs on cloud-based virtual machines instead of your local browser, allowing you to delegate up to 10 tasks simultaneously while keeping your computer free for other work. These agents can handle diverse operations, including research, bookings, purchases, and data entry, and can learn and replicate workflows over time. Google has made Project Mariner available to Google AI Ultra subscribers in the US for $250/month (yes, a new subscription tier!), and has also released a Computer Use API through the Gemini API and Vertex AI so developers can integrate these capabilities into their applications.

Key Highlights:

  1. Cloud-Based Multitasking - Project Mariner now operates on virtual machines rather than in your local browser, enabling you to delegate up to ten simultaneous tasks like research, bookings, and purchases while keeping your computer free for other work.

  2. Multimodal Understanding - The agent observes and interprets web elements, including text, images, code, and forms to build a comprehensive understanding of websites, then plans actions based on your natural language instructions.

  3. Developer API Access - The Computer Use API gives developers direct access to Mariner's capabilities through the Gemini API (currently for Trusted Testers) and Vertex AI, allowing integration of web browsing functionality into custom applications.

  4. Workflow Learning - After completing a task once, Mariner can replicate the same process in future sessions with minimal direction, making repetitive web tasks more efficient over time as the agent learns your common workflows.

Quick Bites

Google Labs has launched Stitch, a tool to vibe-code UI designs and frontend code. It's powered by Gemini 2.5 Pro, which excels at frontend code generation and tops the Web Dev Arena. You can describe your app in plain English or upload wireframes or sketches, and Stitch will generate a matching interface. You can paste designs into Figma, explore multiple layout variants, and export clean code straight from the interface. It's available for free with generous rate limits.

Perplexity was hyped as a Google Search competitor, but with the latest AI Mode, Google just reminded everyone who's still the boss. What started as AI Overviews is now expanding into a full-blown AI-first search experience, packed with advanced reasoning, agentic capabilities, and personalized responses.

Google is rolling out a new AI Mode powered by Gemini 2.5, capable of multimodal queries, deeper web searches, live video sharing for Search, agentic web browsing via Project Mariner, data visualization, and personalized recommendations.

Google has rolled out new updates across the Gemini 2.5 model family, bringing performance boosts to 2.5 Flash, a powerful new reasoning mode called Deep Think, major improvements to the Live API, and a new experimental model called Gemini Diffusion.

  1. A new version of Gemini 2.5 Flash comes with improved reasoning, multimodality, code, and long context while getting even more efficient, using 20-30% fewer tokens. Now available for preview in Google AI Studio, Vertex AI, and the Gemini app.

  2. The new Gemini 2.5 Flash model also adds audio-visual features via the Live API: proactive video, where the model can detect and remember key events; proactive audio, where the model chooses not to respond to irrelevant audio signals; and affective dialog, where the model detects the user's tone and responds accordingly (see the sketch after this list).

  3. Deep Think is a new reasoning mode for Gemini 2.5 Pro that lets the model run multiple solution paths in parallel before choosing a final answer. Gemini 2.5 Pro with Deep Think ranks highest on LiveCodeBench and beats OpenAI’s o3 on MMMU. For now, Deep Think is only available to select testers.

  4. Gemini Diffusion is a new state-of-the-art diffusion text model that is very fast: it generates at 5x the speed of Google's fastest model, Gemini 2.0 Flash, while matching its coding performance. Sign up for the waitlist to get access.
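
For item 2 above, here's a hedged sketch of switching those Live API behaviors on with the google-genai Python SDK. The config fields (enable_affective_dialog, proactivity) and the preview model ID follow Google's docs at launch and may change, so treat them as assumptions:

```python
# Sketch: enabling affective dialog and proactive audio in the Live API.
# Field names and model ID are from the preview docs and may change.
import asyncio
from google import genai
from google.genai import types

client = genai.Client(http_options={"api_version": "v1alpha"})

config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    enable_affective_dialog=True,           # adapt replies to the user's tone
    proactivity={"proactive_audio": True},  # stay quiet on irrelevant audio
)

async def main():
    async with client.aio.live.connect(
        model="gemini-2.5-flash-preview-native-audio-dialog",
        config=config,
    ) as session:
        ...  # stream microphone audio in, play model audio out

asyncio.run(main())
```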

A bunch of other interesting products were also released. You can check out our thread to learn more.

Hot Takes

  1. Your surroundings will shape what you think is good product design vs. what actually is usable.

    When you’re building software in San Francisco:

    Surrounded by sober Ivy League engineers who go to sleep at 8pm and have a perfectly balanced diet (i.e., human computers)

    What America actually is:

    • 23% of Americans have some form of mental illness

    • 20% of Americans have more than 6 drinks per day

    It’s actually surprising that anyone gets through any signup process at all. Try to do the email confirmation flow after 6 drinks. ~
    Nikita Bier


  2. our kids will think we were crazy for using Google for 20 years

    "so you typed a question, got 100000 blue links gamed by SEO agencies, opened 11 tabs, skimmed each site, pieced together an answer, and repeated this process 20 times a day?" they'll think we were digital cavemen. ~
    GREG ISENBERG

That’s all for today! See you tomorrow with more such AI-filled content.

Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!

Unwind AI - X | LinkedIn | Threads

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
