You've Been Using MCP Wrong
Today’s top AI Highlights:
Cursor's free 1-hour course on AI fundamentals
Open-source RAG-Anything framework
& so much more!
Read time: 3 mins
AI Tutorial
Learn OpenAI Agents SDK from zero to production-ready!
We have created a comprehensive crash course that takes you through 11 hands-on tutorials covering everything from basic agent creation to advanced multi-agent workflows using OpenAI Agents SDK.
What you'll learn and build:
Starter agents with structured outputs using Pydantic
Tool-integrated agents with custom functions and built-in capabilities
Multi-agent systems with handoffs and delegation
Production-ready agents with tracing, guardrails, and sessions
Voice agents with real-time conversation capabilities
Each tutorial includes working code, interactive web interfaces, and real-world examples.
The course covers the complete agent development lifecycle: orchestration, tool integration, memory management, and deployment strategies.
Everything is 100% open-source.
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
Cloudflare says this is the better way of using MCP: Convert the MCP tools into a TypeScript API, and then ask an LLM to write code that calls that API.
Wait, what? What’s the point of the MCP layer at all?
It makes complete sense! Here’s why:
LLMs have trained on millions of lines of real-world TypeScript code, but they've only seen synthetic, contrived examples of tool calls during training. In the real world, this translates into mediocre results: agents struggle with multiple tools, fumble complex workflows, and waste enormous amounts of tokens bouncing results back and forth between tool calls.
Meanwhile, these same LLMs write beautiful, complex code without breaking a sweat.
Cloudflare's Code Mode solves this by keeping MCP's best feature - standardized connectivity and authorization - while ditching its biggest limitation: exposing tools directly to the LLM. Instead of presenting MCP tools to the model, Code Mode converts them into TypeScript APIs with full documentation and type definitions. The agent then writes actual code to accomplish tasks, calling these APIs naturally within the execution flow.
When an agent needs to perform multi-step operations, it writes a script that handles the entire workflow in one go, only returning final results instead of passing intermediate outputs through the LLM after every single tool call.
Here's How This Works:
Schema-to-TypeScript - The Agents SDK fetches the MCP server's schema and automatically generates TypeScript interfaces with proper type definitions and documentation comments for every available function.
Single Tool Replacement - Instead of exposing dozens of individual MCP tools to the LLM, Code Mode presents just one tool that executes arbitrary TypeScript code within a secure sandbox environment.
Dynamic Isolate Creation - Each code snippet runs in a fresh V8 isolate created on-demand using Cloudflare's new Worker Loader API, eliminating the need for container overhead or isolate pooling strategies.
Binding-Based Resource Access - The sandbox receives pre-authorized JavaScript bindings to MCP servers through the environment object, allowing API calls without exposing tokens or requiring network-level filtering.
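To make the flow above concrete, here is a minimal sketch of the Code Mode idea. Everything in it - the `IssueTracker` surface, the stub binding, and the function names - is a hypothetical illustration, not Cloudflare's actual generated API; the point is the shape: a typed API on one side, an agent-written script on the other, with only a small summary returned to the LLM.

```typescript
// 1. The kind of typed interface the SDK might generate from an MCP schema
//    (hypothetical shape for illustration).
interface Issue { id: number; title: string; open: boolean }
interface IssueTracker {
  listIssues(repo: string): Promise<Issue[]>;
  closeIssue(repo: string, id: number): Promise<void>;
}

// Stub standing in for the pre-authorized binding the sandbox would receive.
const store: Issue[] = [
  { id: 1, title: "fix build", open: true },
  { id: 2, title: "update docs", open: false },
  { id: 3, title: "flaky test", open: true },
];
const tracker: IssueTracker = {
  listIssues: async () => store,
  closeIssue: async (_repo, id) => {
    const hit = store.find((i) => i.id === id);
    if (hit) hit.open = false;
  },
};

// 2. The script an agent would write: the whole multi-step workflow runs
//    inside the sandbox, and only a one-number summary goes back to the LLM.
async function closeAllOpenIssues(api: IssueTracker, repo: string): Promise<number> {
  const open = (await api.listIssues(repo)).filter((i) => i.open);
  for (const i of open) await api.closeIssue(repo, i.id);
  return open.length;
}

const done = closeAllOpenIssues(tracker, "example/repo").then((n) => {
  console.log(`closed ${n} open issues`); // prints: closed 2 open issues
  return n;
});
```

Note how the intermediate issue list never re-enters the model's context - that is the token saving the post describes.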
Building RAG systems that work with real-world documents means dealing with more than just text - you need to handle charts, tables, equations, and images that traditional RAG simply ignores or just can’t process effectively.
But this open-source all-in-one multimodal RAG framework does.
RAG-Anything lets you seamlessly process and query documents containing interleaved text, visual diagrams, structured tables, and mathematical formulations through one cohesive interface.
Built on LightRAG, this framework handles everything from PDF research papers with mathematical formulas to PowerPoint presentations with embedded charts. It automatically categorizes content types, routes them through specialized processors, and builds a multimodal knowledge graph that preserves relationships between text, visuals, and structured data.
Key Highlights:
Multimodal Knowledge Graph - Automatically extracts entities from text, images, tables, and equations, then maps semantic relationships between different content types while preserving original document structure and hierarchy.
Specialized Processors - Dedicated analyzers for visual content, structured data interpretation, mathematical expression parsing, and extensible handlers for custom content types through a configurable plugin architecture.
Hybrid Retrieval - Combines vector similarity search with graph traversal algorithms, implementing modality-aware ranking that adjusts results based on content type relevance while maintaining relational coherence.
Three Query Modes - Pure text queries for basic knowledge base search, VLM-enhanced queries that automatically analyze images in retrieved context, and multimodal queries with specific multimodal content analysis.
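The categorize-and-route step is easiest to see in code. This is an illustrative sketch, not RAG-Anything's actual API: a discriminated union stands in for detected content types, and each type is dispatched to a modality-specific processor that turns it into indexable text while preserving document order.

```typescript
// Content types a multimodal parser might detect in a document.
type Block =
  | { kind: "text"; body: string }
  | { kind: "table"; rows: string[][] }
  | { kind: "image"; caption: string }
  | { kind: "equation"; latex: string };

// Modality-specific processors, mirroring the "specialized processors" idea:
// each branch converts one content type into an indexable description.
function describe(b: Block): string {
  switch (b.kind) {
    case "text":
      return b.body;
    case "table":
      return `table (${b.rows.length} rows): ` +
        b.rows.map((r) => r.join(" | ")).join("; ");
    case "image":
      return `image: ${b.caption}`;
    case "equation":
      return `equation: ${b.latex}`;
  }
}

// A toy document interleaving all four modalities.
const doc: Block[] = [
  { kind: "text", body: "Results are summarized below." },
  { kind: "table", rows: [["model", "score"], ["ours", "0.91"]] },
  { kind: "equation", latex: "E = mc^2" },
  { kind: "image", caption: "accuracy vs. training steps" },
];

// Route every block, keeping document order so structure is preserved.
const index: string[] = doc.map(describe);
console.log(index[1]); // prints: table (2 rows): model | score; ours | 0.91
```

A real pipeline would also extract entities from each description and link them into the knowledge graph; the routing shape stays the same.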
How to stop sharing audiences with your competitors
Let’s be real: most ad targeting is slow, built on junk data, or relies on the same segments your competitors are buying. That’s why we built AudienceMaker — so you can create, model, test, and launch segments yourself, instantly. Push them right to Meta, Google, TikTok, Amazon and more.
Quick Bites
Cursor drops 1-hour AI fundamentals crash course, 100% free
Cursor has launched Cursor Learn, a free educational platform, starting with a six-part video series that breaks down AI fundamentals that you can watch in ~1 hour. The course covers essential concepts like tokens, context, hallucinations, pricing models, tool calling, and agents - all designed for beginners who want to understand how AI actually works. Alongside the videos, you’ll find quizzes and interactive model demos to test what you’ve learned.
#1 coding agent on Terminal-Bench with every model
Factory AI's Droid agent just claimed the top spot on Terminal-Bench with a 58.75% score, outperforming every other coding agent, including Claude Code and Codex CLI. The key insight runs contrary to popular opinion: agent design matters more than model choice. Droid achieves state-of-the-art performance across all frontier models (Claude Sonnet 4, Opus 4.1, and GPT-5) by using hierarchical prompting, model-specific optimizations, and minimalist tool design. Factory now holds 3 of the top 5 leaderboard positions.
Tiny vision model beats Claude, GPT-5, and Gemini Pro at visual reasoning
This vision model with just 2B active parameters beats Claude Opus 4.1, GPT-5, and Gemini 2.5 Pro at visual reasoning - and does it orders of magnitude faster! Moondream 3 is a 9B MoE model with just 2B active parameters that achieves frontier-level visual reasoning while retaining blazingly fast inference. It comes with native object detection and dramatically improved OCR performance - all trained with reinforcement learning that proved so effective it consumed more compute than the initial pretraining. You can try it in the playground or pull the weights on Hugging Face.
Cloudflare just gave AI agents native email powers
Cloudflare has released a private beta of its Email Service, allowing developers to send and receive emails directly through Workers. It takes care of the core setup and deliverability checks so you can focus on building. For AI agent apps, this means you can plug in transactional or reply-based email flows directly inside your agent logic without juggling third-party APIs. One platform handles both sending and receiving, cutting out a big chunk of integration overhead.
Tools of the Trade
oLLM - A lightweight Python library for large-context LLM inference, built on top of Hugging Face Transformers and PyTorch. It enables running models like gpt-oss-20B, Qwen3-Next-80B, or Llama-3.1-8B on 100k-token contexts using a ~$200 consumer GPU with 8GB VRAM. No quantization is used - only fp16/bf16 precision.
vb.lk (aka Vibe Linking) - a URL shortener that uses a lightweight language model (e.g. Gemini Flash) to turn natural-language phrases into redirects to websites it deems most relevant (often via “I’m feeling lucky”-style search).
Agentic Document Extraction - Python library that extracts structured data from complex documents like PDFs, images, and URLs, converting visual elements like tables, charts, and forms into hierarchical JSON with precise location coordinates. It can easily handle large documents of any length.
Awesome LLM Apps - A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, Google, and open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
(Now accepting GitHub sponsorships)
Hot Takes
Yes, ai will kill your coding job. But I have good news: there’s a ton of new “vibe code cleanup specialist” roles opening up ~
Craig Weiss

If you hide the system prompt and tool descriptions for your LLM agent, what you're actually doing is taking the single most detailed set of documentation for your service and deliberately hiding it from your most sophisticated users! ~
Simon Willison
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉