OpenAI's New Agentic Browser
+ AI Agents with MCPs in 5 lines of code
Today’s top AI Highlights:
& so much more!
Read time: 3 mins
AI Tutorial
Imagine uploading a photo of your outdated kitchen and instantly getting a photorealistic rendering of what it could look like after renovation, complete with budget breakdowns, timelines, and contractor recommendations. That's exactly what we're building today.
In this tutorial, you'll create a sophisticated multi-agent home renovation planner using Google's Agent Development Kit (ADK) and Gemini 2.5 Flash Image (aka Nano Banana).
It analyzes photos of your current space, understands your style preferences from inspiration images, and generates stunning visualizations of your renovated room while keeping your budget in mind.
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
OpenAI just launched Atlas, a Chromium-based browser with ChatGPT baked directly into every tab, sidebar, and search bar you touch.
Perplexity's Comet focuses on automating your search workflows and synthesizing search results, while Atlas aims for search that is personal to you. It draws on your ChatGPT account and what ChatGPT has learned about you so far, and uses that context alongside your queries.
The browser tabs open with a familiar ChatGPT-style interface where you can simply ask questions or put in your search query. It summarizes the search results for you along with citations, or you can explore on your own in the next tab, just like any other browser.
Atlas includes an "Ask ChatGPT" button on every page that opens a companion sidebar, eliminating the constant copy-pasting between tabs and apps. Atlas also introduces browser memories, an optional feature that logs the sites you visit and uses that context to personalize responses over time.
Available now on macOS (Windows, iOS, and Android coming soon), the browser is free for all users, though Agent Mode, the real workhorse feature, is currently limited to Plus, Pro, and Business subscribers.
Key Highlights:
Agent Mode for Task Automation - ChatGPT can autonomously book reservations, order groceries from recipes, compile research into team briefs, and fill out forms while you watch (or walk away). Like other computer-use AI agents, though, it is painstakingly slow, according to early user feedback.
Browser Memory System - Atlas remembers sites you've visited and can retrieve them on command with prompts like "find that travel doc I looked at last week." You control visibility per site via an address bar toggle, and clearing your history wipes all associated memories.
Contextual Sidebar Integration - The ChatGPT sidebar stays open across your browsing session and automatically sees what's on your screen. Ask it to summarize pages, compare products, or answer questions about content without switching windows or providing additional context.
Guardrails for Agent Mode - Atlas includes specific safeguards for sensitive actions: it can't access your file system, won't save agent activity to browsing history, and allows you to run agent mode in logged-out mode where it won't use existing cookies or access your accounts without explicit approval.
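To make the memory controls above concrete, here is a toy Python sketch of how per-site visibility and history clearing could interact. This is not how Atlas is implemented; the `BrowserMemory` class, its method names, and the keyword-based recall are all hypothetical stand-ins for illustration.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class BrowserMemory:
    """Hypothetical model of browser memories; not Atlas's actual design."""

    hidden_sites: set = field(default_factory=set)  # sites the user toggled off
    memories: list = field(default_factory=list)    # (site, url, note) tuples

    def set_visibility(self, site: str, visible: bool) -> None:
        # Stands in for the address bar toggle: hidden sites are never recorded.
        if visible:
            self.hidden_sites.discard(site)
        else:
            self.hidden_sites.add(site)

    def visit(self, url: str, note: str) -> None:
        site = urlparse(url).netloc
        if site not in self.hidden_sites:
            self.memories.append((site, url, note))

    def recall(self, keyword: str) -> list:
        # Naive keyword match; the real retrieval method is not described.
        return [url for _, url, note in self.memories
                if keyword.lower() in note.lower()]

    def clear_history(self) -> None:
        # Clearing history wipes every associated memory.
        self.memories.clear()


mem = BrowserMemory()
mem.set_visibility("bank.example", False)
mem.visit("https://docs.example/travel", "Travel doc for the Lisbon trip")
mem.visit("https://bank.example/account", "Account statement")
print(mem.recall("travel"))  # ['https://docs.example/travel']
```

The point of the sketch is the ordering of checks: visibility is enforced at record time, so a hidden site leaves nothing behind to recall later.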
Turn AI Into Your Income Stream
The AI economy is booming, and smart entrepreneurs are already profiting. Subscribe to Mindstream and get instant access to 200+ proven strategies to monetize AI tools like ChatGPT, Midjourney, and more. From content creation to automation services, discover actionable ways to build your AI-powered income. No coding required, just practical strategies that work.
Five lines of code. That's all it takes to build a production-ready AI agent with Dedalus Labs' new open-source SDK and MCP gateway. The team has unified what used to be a fragmented mess of model APIs, tool integrations, and deployment configs into a single drop-in API endpoint. You pick your model (GPT-5, Claude Opus 4.1, Gemini 2.5 Flash, Qwen-Max, or any other), grab tools from their hosted MCP marketplace, mix in your local Python functions if needed, and deploy.
The SDK handles vendor-agnostic model handoffs, routing between local and cloud-hosted tools, and streaming across any provider. Their cloud infrastructure lets you spin up an MCP server in three clicks, so sharing tools with the community doesn't mean wrestling with Dockerfiles and YAML configs anymore.
Key Highlights:
Five-Line Agent Deployment - Build and deploy complex multi-tool agents in five lines of code. The SDK abstracts away model APIs, tool orchestration, and streaming logic into a single unified interface.
Non-Linear Agent Execution - Agents can branch, retry, pause for context, and make independent decisions about when to stop. The SDK supports asynchronous execution and handles state management for non-deterministic workflows.
Marketplace Monetization - List your agent or MCP server in their marketplace and earn revenue every time it's called. Creators keep an 80% revenue share, with instant payouts.
Open-Source Templates - Get TypeScript and Python templates for building MCP servers. The templates include error handling, logging, type safety, and streamable HTTP transport out of the box.
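As a rough illustration of what "routing between local and cloud-hosted tools" means, here is a minimal Python sketch. It deliberately does not use the Dedalus SDK: `ToolRouter`, `fake_hosted_call`, and the tool names are hypothetical, and the hosted call is a stub rather than a real MCP request.

```python
from typing import Any, Callable, Dict, Set


def add(a: int, b: int) -> int:
    """A local Python function exposed as a tool."""
    return a + b


def fake_hosted_call(name: str, args: Dict[str, Any]) -> str:
    # Stub standing in for a network call to a hosted tool server.
    return f"hosted:{name}:{sorted(args.items())}"


class ToolRouter:
    """Hypothetical router: local functions first, hosted tools second."""

    def __init__(self, remote_call: Callable[[str, Dict[str, Any]], Any]):
        self.local: Dict[str, Callable[..., Any]] = {}
        self.remote: Set[str] = set()
        self._remote_call = remote_call

    def register_local(self, fn: Callable[..., Any]) -> None:
        self.local[fn.__name__] = fn

    def register_remote(self, name: str) -> None:
        self.remote.add(name)

    def call(self, name: str, **kwargs: Any) -> Any:
        if name in self.local:
            return self.local[name](**kwargs)        # run in-process
        if name in self.remote:
            return self._remote_call(name, kwargs)   # forward to the hosted tool
        raise KeyError(f"unknown tool: {name}")


router = ToolRouter(fake_hosted_call)
router.register_local(add)
router.register_remote("web_search")

print(router.call("add", a=2, b=3))            # 5
print(router.call("web_search", query="mcp"))  # hosted:web_search:[('query', 'mcp')]
```

The agent only ever sees one `call` interface, which is the idea behind mixing local functions with marketplace tools; a real SDK would also handle schemas, auth, and streaming.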
Quick Bites
Vibe code full-stack Shopify stores within Lovable
Lovable just launched a Shopify integration that lets you build a fully functional online store through conversation, no manual setup required. The AI handles everything from checkout infrastructure to product listings and descriptions, then hands you a publishable Shopify store ready to go live. It's live now with a 30-day free Shopify trial for new users.
New vibe coding experience in Google AI Studio, 100% free
Google AI Studio now has a new vibe coding experience in its Build tab. It’s a browser-based environment where you can build Gemini-powered apps entirely for free. Give it a prompt, select features you want to add to it, or just hit "Feeling Lucky" to get some inspiration, and it generates full React/TypeScript projects that you can edit via chat and fork. Apps currently run in your browser in a sandboxed iframe. There’s no server-side component. To run the app, you can consider using Google Cloud Run. It’s still in very early stages and will keep growing.
#1 Computer-use AI agent for desktop, web, and mobile
A computer-use AI agent just swept all four major computer-use benchmarks (OSWorld, WebArena, WebVoyager, and AndroidWorld), beating human baselines on desktop and mobile. H Company released Surfer 2 with an interesting architecture choice: it splits planning from execution, using an orchestrator that coordinates specialized sub-agents and reviews progress toward the task before determining the next action. The company admits that running it costs a fortune, though it says its next model is already in the works, specifically to cut inference costs and make this viable at scale. The agent is available to try on the company's playground.
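The planning/execution split can be sketched in a few lines of Python. This is a simplified illustration of the general orchestrator pattern, not Surfer 2's actual architecture; `orchestrate`, the `Step` type, and the lambda sub-agents are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (sub-agent name, instruction)


def orchestrate(
    plan: List[Step],
    agents: Dict[str, Callable[[str], str]],
    max_turns: int = 10,
) -> List[str]:
    """Dispatch each planned step to a specialized sub-agent, one per turn."""
    log: List[str] = []
    for _ in range(max_turns):
        # Review progress against the plan before choosing the next action.
        if len(log) >= len(plan):
            break
        agent_name, instruction = plan[len(log)]
        log.append(agents[agent_name](instruction))
    return log


agents = {
    "web": lambda task: f"web did: {task}",
    "desktop": lambda task: f"desktop did: {task}",
}
plan = [("web", "open the pricing page"), ("desktop", "save the screenshot")]
print(orchestrate(plan, agents))
# ['web did: open the pricing page', 'desktop did: save the screenshot']
```

In a real agent the plan would be produced and revised by a model and the review step would inspect actual results, but the loop shape (review, pick a step, delegate, record) is the same.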
Tools of the Trade
Open Agent Builder - Open-source, n8n-style visual workflow builder for AI agents, with drag-and-drop nodes and native Firecrawl integration. Build AI agents with MCP servers and multiple LLMs (Claude, GPT-5, Groq), and deploy them as an API.
cto.new - A completely free AI coding agent that can handle coding, planning, task management, code review, and deployment, and integrates with tools like GitHub, Linear, Jira, and Slack. You get unlimited access to multiple frontier AI models (GPT-5 Codex, Claude Sonnet 4.5, Gemini Pro). We don’t know what the catch is, but you are probably trading in your usage data.
WebMCP - Allows websites to function as MCP servers, exposing their tools and resources to MCP clients. There is no API key sharing, and you can use any model you want. It comes as a widget that a website owner can add to their site to expose tools, giving client-side LLMs what they need to provide a great UX for the user or agent.
Awesome LLM Apps - A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, Google, and open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
(Now accepting GitHub sponsorships)
Hot Takes
google basically open sourced its own obsolescence twice in some sense… first by open sourcing chromium, then by publishing “attention is all you need.” both were meant as flexes of dominance, but ended up being open source trojan horses for competitors.
now every google killer runs on google dna. it’s like the empire standardized the roads & forgot anyone else could march on them too.
~ signüll
Prompting and evals will be everything
Prompting is agency
Evals is taste
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉