Connect AI Agents to The Entire Web
PLUS: Agent Squad Framework by AWS, NVIDIA's open code reasoning model beats o3-mini
Today’s top AI Highlights:
AWS’s multi-agent framework with routing, memory, RAG, and tool support
Scrape and query websites like a database using plain English
NVIDIA open-sources code reasoning models that outperform o3-mini
MCP boilerplate for vibe coders
Vibe code fully-functional custom AI agents and tools
& so much more!
Read time: 3 mins
AI Tutorial
While working with web data, we keep facing the challenge of extracting structured information from dynamic, modern websites. Traditional scraping methods often break when they encounter JavaScript-heavy interfaces, login requirements, and interactive elements, leading to brittle solutions that require constant maintenance.
In this tutorial, we're building an AI Startup Insight Agent application that uses Firecrawl's FIRE-1 agent for robust web extraction. FIRE-1 is an AI agent that can autonomously perform browser actions - clicking buttons, filling forms, navigating pagination, and interacting with dynamic content - while understanding the semantic context of what it's extracting.
We'll combine this with OpenAI's GPT-4o to create a complete pipeline from data extraction to analysis in a clean Streamlit interface. We’ll use the Agno framework to build our AI startup insight agent.
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
AWS’s Agent Squad is an open-source, lightweight framework for building and orchestrating multiple AI agents in one application. It handles routing queries to the right agent, maintaining chat context per user and agent, and supporting both streaming and non-streaming responses. It works out of the box with Amazon Bedrock, Lex bots, Lambda functions, and OpenAI models, and can be deployed anywhere, from AWS Lambda to your local dev setup.
It’s written in both Python and TypeScript, and ships with clear APIs to plug in your own logic or replace parts of the stack.
Key Highlights:
Agent Coordination - An AI-powered classifier analyzes each user query and routes it to the most suitable agent based on the query content and conversation history. When a user asks "What's the weather like?" or follows up with "How about tomorrow?", the system knows exactly which agent should handle it.
SupervisorAgent - The new SupervisorAgent implements an "agent-as-tools" architecture where a lead agent coordinates a team of specialized agents working in parallel. This enables sophisticated workflows where complex tasks are broken down and distributed to the right specialist agents.
Mix-and-Match Agents - Build systems that combine diverse agent types including Bedrock LLMs, OpenAI models, Amazon Lex bots, Lambda functions, and custom agents. Each agent maintains its own conversation state while the orchestrator handles the complexity of routing and context management across all of them.
Storage and Tooling - Use in-memory, DynamoDB, or SQL for storing conversations. Tools follow a unified format with support for Claude, Bedrock, and OpenAI-style tool calling.
Deployment - Run your multi-agent system anywhere, from AWS Lambda to your local environment. The framework handles both streaming and non-streaming responses, manages concurrent conversations, and includes built-in tools for analyzing agent performance and optimizing configurations.
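The core routing idea (classify the query, pick an agent, keep context per user and agent) can be sketched in a few lines of pure Python. This is NOT Agent Squad's actual API; every class and method name below is hypothetical, and the keyword scorer stands in for the framework's AI-powered classifier:

```python
# Illustrative sketch of intent-based routing with per-user, per-agent memory.
# These names are hypothetical, not Agent Squad's real API.
from collections import defaultdict

class KeywordAgent:
    """Stand-in for a specialized agent (Bedrock, OpenAI, Lex, ...)."""
    def __init__(self, name, keywords):
        self.name = name
        self.keywords = keywords  # terms this agent claims to handle

    def handle(self, query, history):
        return f"[{self.name}] answering {query!r} ({len(history)} prior turns)"

class Orchestrator:
    def __init__(self, agents):
        self.agents = agents
        self.history = defaultdict(list)  # (user_id, agent_name) -> past turns

    def classify(self, query):
        # Stand-in for the AI classifier: naive keyword-overlap scoring.
        q = query.lower()
        return max(self.agents, key=lambda a: sum(k in q for k in a.keywords))

    def route(self, user_id, query):
        agent = self.classify(query)
        key = (user_id, agent.name)
        reply = agent.handle(query, self.history[key])
        self.history[key].append((query, reply))  # context stays per user+agent
        return reply

orchestrator = Orchestrator([
    KeywordAgent("weather", ["weather", "rain", "forecast"]),
    KeywordAgent("travel", ["flight", "hotel", "trip"]),
])
print(orchestrator.route("u1", "What's the weather like?"))
print(orchestrator.route("u1", "Find me a flight to Oslo"))
```

The real framework replaces the keyword scorer with an LLM classifier and the in-memory dict with DynamoDB or SQL storage, but the data flow is the same.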
AgentQL is a new framework built to connect LLMs and AI agents directly with live websites using natural language. Instead of wrestling with traditional scraping methods like XPath or CSS selectors, you simply describe what you're looking for and AgentQL handles the heavy lifting.
AgentQL's semantic understanding of web pages allows your queries to keep working even as sites evolve, letting you write code once and deploy it widely across similar sites without constant maintenance. You can use AgentQL to automate workflows, power agent actions, or scrape structured data from multiple sources at scale.
Key Highlights:
Natural Language Queries - Write intuitive queries using plain English to describe what you're looking for instead of wrestling with XPath or CSS selectors. This makes your code more readable and maintainable while allowing non-technical team members to understand and contribute.
Self-Healing Selectors - AgentQL uses AI to build a semantic understanding of web page contexts, finding elements based on their meaning rather than DOM position. This makes your queries resilient to site updates, A/B tests, and even complete redesigns.
Structured Data - Define the exact shape of your output data within your query, eliminating post-processing steps. For example, a single query like
{ products[] { name price(integer) description }}
extracts precisely formatted product information from any e-commerce site.
SDKs and Integrations - Seamlessly incorporate AgentQL into your tech stack with Python and JavaScript SDKs, plus ready-made integrations with popular frameworks like LangChain, Zapier, and Anthropic's Model Context Protocol.
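Because the query declares the output schema inline, the result comes back as structured data that mirrors the query shape, with no post-processing step. A small sketch of consuming such a result; the payload values are made up for illustration, and the filtering helper is our own, not part of AgentQL:

```python
# Hypothetical result for the query { products[] { name price(integer) description }}.
# The dict shape mirrors the query; the values here are invented for illustration.
result = {
    "products": [
        {"name": "Desk Lamp", "price": 39, "description": "LED, dimmable"},
        {"name": "Monitor Arm", "price": 89, "description": "Single, gas spring"},
    ]
}

def cheapest(products, max_price):
    """Filter and sort products straight off the structured payload."""
    return sorted(
        (p for p in products if p["price"] <= max_price),
        key=lambda p: p["price"],
    )

for p in cheapest(result["products"], max_price=50):
    print(f'{p["name"]}: ${p["price"]}')
```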
Quick Bites
Cognition AI has introduced Kevin-32B, a 32B-parameter model built for writing CUDA kernels. Built on QwQ-32B and trained with multi-turn reinforcement learning, it refines its own code using runtime feedback and error traces, improving performance step by step rather than guessing once. On the KernelBench benchmark, Kevin outperformed both OpenAI's o3 and o4-mini, solving 89% of tasks and achieving higher speedups on the hardest tasks. Available to download from Hugging Face.
Anthropic has added web search to its API, allowing Claude to fetch and analyze live information from across the internet. You can now build Claude-powered apps and agents that reference current data, like market trends, legal updates, or developer docs, without managing any search backend.
Available for Claude 3.7 Sonnet, upgraded 3.5 Sonnet, and 3.5 Haiku models, priced at $10 per 1,000 searches plus token costs. All responses come with source citations, and developers can control domain access and search behavior using simple settings.
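At $10 per 1,000 searches, the search fee scales linearly on top of token spend. A quick back-of-envelope estimator (the search rate is from the announcement; any token cost you pass in is your own estimate, not Anthropic's rates):

```python
# Back-of-envelope cost estimator for Claude's web search API.
# $10 per 1,000 searches, per the announcement; token costs are billed separately.
SEARCH_PRICE_PER_1K = 10.00

def monthly_search_cost(searches, token_cost=0.0):
    """Search fees plus whatever token spend you estimate separately."""
    return searches / 1000 * SEARCH_PRICE_PER_1K + token_cost

print(f"10k searches/month: ${monthly_search_cost(10_000):.2f} plus tokens")
```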
Google has launched image generation and editing with Gemini 2.0 Flash in preview, now available via Google AI Studio and Vertex AI. With simple prompts, you can generate new images, edit specific parts, or co-draw in real time, all using the gemini-2.0-flash-preview-image-generation model. The update brings sharper visuals, better text rendering, and fewer filter blocks.
NVIDIA has open-sourced a new family of code reasoning models called Open Code Reasoning (OCR), available in 32B, 14B, and 7B sizes under the Apache 2.0 license. The 32B model outperforms OpenAI’s o3-mini and o1-low on LiveCodeBench for code generation and debugging. The models are trained on NVIDIA’s custom OCR dataset and are supported out of the box by major inference frameworks, including vLLM, llama.cpp, Hugging Face Transformers, and TGI.
Tools of the Trade
MCP Boilerplate: It's a starter kit designed to help you quickly create, deploy, and monetize your own remote MCP server. It includes features like Cloudflare deployment, user authentication (Google/GitHub), and Stripe payment integration.
Invent by Relevance AI: Vibe code custom AI agents and tools by simply describing what you want in plain English, no coding needed. It auto-generates the agent logic and integrates with your existing apps, like email and calendar. Your agent can be up and running in minutes.
CoGenAI: AI inference platform built for agent workflows, offering unlimited access to models for text, code, speech, tool use, and more. Instead of usage-based billing, it runs on flat pricing with optional compute contribution to earn credits.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, MCP, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
Hot Takes
o3 is officially better than both my $1,500/hour outside counsel AND my $300,000/year vp of finance ~
emily is in sf
let me be 10000% clear. i have never met an engineer who has used windsurf. wtf is going on ~
Dylan Steinme

hey o3, what's the weather tomorrow
o3: reads 246 papers on the physics of weather, derives tomorrow's weather from first principles, responds with an 8x5 table of possible weathers for tomorrow with probabilities and what I should wear for each scenario ~
Nabeel S. Qureshi
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉