You Aren't the User, AI Agents Are
PLUS: Opensource alternative to Perplexity Comet, Mistral’s new open coding model
Today’s top AI Highlights:
AI agents are the new internet users. And they call functions.
Opensource alternative to Perplexity Comet
Mistral’s new open coding models
v0 releases text-to-app API
Claude Code with special commands and 9 cognitive personas
& so much more!
Read time: 3 mins
AI Tutorial
Business consulting has always required deep market knowledge, strategic thinking, and the ability to synthesize complex information into actionable recommendations. Today's fast-paced business environment demands even more - real-time insights, data-driven strategies, and rapid response to market changes.
In this tutorial, we'll create a powerful AI business consultant using Google's Agent Development Kit (ADK) combined with Perplexity AI for real-time web research. This consultant will conduct market analysis, assess risks, and generate strategic recommendations backed by current data, all through a clean, interactive web interface.
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
Browser automation is fundamentally broken. AI agents waste 10-20 seconds per action just trying to figure out where to click on your screen. Every "simple" task becomes an expensive guessing game of screenshot analysis, DOM parsing, and hoping the UI hasn't shifted by a pixel.
The future of browser automation isn't teaching AI to click buttons. It's giving it direct access to the APIs behind those buttons.
MCP-B embeds MCP servers directly into websites, giving AI agents direct access to site functions instead of forcing them to play a visual treasure hunt. Rather than burning tokens on questions like "where's the submit button?", AI can simply call submitForm() and get instant results. It uses your existing login sessions - no OAuth flows or API key juggling.
For developers, it's 50 lines of code to make any website AI-ready. For users, it's a single Chrome extension that instantly unlocks automation across all MCP-B enabled sites - no configuration, no API keys, just pure productivity.
Key Highlights:
Direct function calls - AI executes tasks through clean APIs in milliseconds instead of spending 10-20 seconds analyzing screenshots and guessing element locations.
Session-based auth - Uses existing browser cookies and login state, completely bypassing OAuth 2.1 flows and API key management that plague today’s workflows using MCP servers.
Context-aware tool exposure - Websites dynamically expose only the functions relevant to the current page and the user's permissions, so the AI isn't overwhelmed with irrelevant tools and security boundaries stay intact.
Minimal integration - Developers add MCP capabilities with basic npm install and tool definitions, while users gain instant cross-site automation through one browser extension.
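To make the idea concrete, here is a minimal sketch of what "direct function calls" and "context-aware tool exposure" could look like from the website's side. This is not the MCP-B API: SiteToolRegistry, SiteTool, PageContext, and the two example tools are hypothetical names invented for illustration, and a real integration would go through the library's own tool definitions.

```typescript
// Hypothetical sketch: none of these names come from the MCP-B library.
type ToolHandler = (args: Record<string, string>) => string;

interface SiteTool {
  name: string;
  description: string;
  pages: string[];        // pages on which the tool is exposed
  requiresLogin: boolean; // hide the tool unless the user is logged in
  handler: ToolHandler;
}

interface PageContext {
  path: string;
  loggedIn: boolean; // stands in for the browser's existing session
}

class SiteToolRegistry {
  private tools: SiteTool[] = [];

  register(tool: SiteTool): void {
    this.tools.push(tool);
  }

  // List only the tools relevant to the current page and session.
  listTools(ctx: PageContext): string[] {
    return this.tools
      .filter((t) => t.pages.indexOf(ctx.path) !== -1)
      .filter((t) => ctx.loggedIn || !t.requiresLogin)
      .map((t) => t.name);
  }

  // The agent calls a function by name instead of hunting for a button.
  call(name: string, args: Record<string, string>, ctx: PageContext): string {
    if (this.listTools(ctx).indexOf(name) === -1) {
      throw new Error(`Tool not available in this context: ${name}`);
    }
    const tool = this.tools.filter((t) => t.name === name)[0];
    return tool.handler(args);
  }
}

const registry = new SiteToolRegistry();

registry.register({
  name: "submitForm",
  description: "Submit the contact form",
  pages: ["/contact"],
  requiresLogin: true,
  handler: (args) => `submitted:${args.email}`,
});

registry.register({
  name: "searchProducts",
  description: "Search the product catalog",
  pages: ["/shop"],
  requiresLogin: false,
  handler: (args) => `results for ${args.query}`,
});

// A logged-in agent on /contact sees submitForm and calls it directly.
const contactPage: PageContext = { path: "/contact", loggedIn: true };
const result = registry.call("submitForm", { email: "a@example.com" }, contactPage);
console.log(result);
```

The point of the design is that the agent never sees a tool it cannot use: the same listTools filter drives both what gets advertised and what call will accept, so a logged-out visitor on /contact is offered nothing at all.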
Every major AI product now has an opensource alternative, and browsers are no exception.
While Perplexity Comet locks you into a $200/month subscription and sends your data to their servers, BrowserOS offers agentic browsing that keeps everything local and costs nothing.
This startup has built a Chromium-based browser where AI agents can automate workflows like ordering from your Amazon history, managing tabs, and interacting with web apps - all running locally on your machine through Ollama or your own API keys. Unlike Comet's server-side assistant, BrowserOS lets you actually watch the AI agent click around and automate tasks right in your browser.
Key Highlights:
Local AI Agents - The AI agents run entirely on your machine, not on remote servers. You can watch them click around and perform tasks in real-time, giving you complete visibility into what's happening with your data and automating workflows without sending anything to the cloud.
True Privacy - Unlike Comet's requirement for extensive Google Account permissions, BrowserOS keeps your browsing history, emails, and documents completely local. Built-in Ollama support lets you run powerful local LLMs, and there's no search or ad company collecting your data for profit.
Open Source and Extensible - Built as a Chromium fork under AGPL-3.0 license, BrowserOS feels exactly like Chrome and works with all your existing extensions. No invite systems, no waitlists, no lock-in - download it today and start using it immediately with the familiar interface you already know.
No Subscription Model - While Perplexity Comet costs $200/month for Max subscribers, BrowserOS is completely free with BYOK (bring your own keys) support. The team isn't a search or ad company, so there are no weird incentives to monetize your personal data or browsing habits.
Quick Bites
Mistral just dropped two new coding models that surpass much pricier alternatives like Gemini 2.5 Pro and GPT-4.1. Devstral Small 1.1 hits 53.6% on SWE-Bench Verified, while Devstral Medium scores 61.6%, beating both Gemini 2.5 Pro and GPT-4.1 at 25% of their cost. Both models excel with agentic scaffolds and support multiple prompting formats, making them particularly attractive for autonomous coding workflows. Devstral Small 1.1 is opensourced under the Apache 2.0 license.
v0 released their Platform API in public beta that lets you programmatically generate, parse, and deploy web apps from text prompts. The API gives you direct access to v0's chat management, automatic code parsing, and browser-based Next.js execution environment. Whether you're building IDE extensions or internal dev tools, you can now pipe natural language directly into running applications.
Microsoft released Phi-4-mini-flash-reasoning, an open 3.8B parameter reasoning model that thinks faster without thinking less. It is optimized for advanced math reasoning with much higher throughput and lower latency than its peers. It supports a 64K token context length and is fine-tuned on high-quality synthetic data to deliver reliable, logic-intensive performance. Great for on-device AI in cases like math tutoring or edge-based logic agents.
Reka AI dropped an AI research agent that does multi-hop reasoning across web and private documents for $0.025 per query - no token counting gymnastics required. Reka Research iterates through up to a dozen sources and outputs structured JSON that your downstream code can rely on. It delivers best-in-class performance, surpassing GPT-4o, Claude 4 Sonnet, and Gemini on SimpleQA and Research-Eval.
This company has been building some impressive models but remains under the radar of the mainstream audience.
Genspark continues to ship relentlessly! They have released Genspark AI Pods, a fully-agentic tool that can generate engaging, high-quality podcasts from any topic, webpage, YouTube video, or document with just one simple prompt. One prompt can handle content analysis, research, audio production, and host generation. Do check out the demos — is it just us or are the voices actually very similar to Google NotebookLM’s hosts?
Claude Code GitHub Actions are now available for Pro and Max Plan users. Just mention @claude in any PR or issue, and Claude will be able to analyze your code, create pull requests, implement features, and fix bugs - all while following your project’s standards. Along with this, the team has also released a new native installer for Claude Code using the Bun runtime that is faster to set up and independent of npm or node.
Tools of the Trade
SuperClaude: Extends Claude Code with 19 specialized commands and 9 cognitive personas, along with git-based checkpoints, automated documentation, and optimized token usage. Runs 100% locally.
Grok 4 Fire Enrich: An opensource contact enrichment engine. Turn a simple list of emails into a rich dataset with company profiles, funding data, tech stacks, and more. Powered by Firecrawl, a multi-agent AI system, and Grok 4 for intelligent agent execution.
CamelAI: Embed an AI data analyst in your SaaS that connects directly to your database, answering your users’ ad-hoc questions with interactive visualizations and actionable insights, all without requiring backend development work.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, MCP, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
Hot Takes
By the way, you can basically make the "Grok heavy" version of any model by having multiple agents running tools in parallel, then checking notes together and deciding which one is the best answer.
I may release an open source project for that. ~
Pietro Schirano
things that people in silicon valley like talking about
- tbpn
- cluely
- high agency
- generative
- pre-money valuation
- AI
- ESOP
- context windows
- multimodal models
- design engineers
- components
- primitives
- "tahoe" ~
Greg Isenberg
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉