China's Deep Researcher Outperforms OpenAI Deep Research

PLUS: Specialized scalable RAG agents, Eleven Labs voice AI agent that takes actions

Shubham Saboo & Gargi Gupta
June 24, 2025

Today’s top AI Highlights:

Build specialized RAG agents that can plan, reason, and use tools
China's Kimi-Researcher outperforms $200 OpenAI Deep Research
Google open-sources a real-time music generation model that runs on free Colab TPUs
Eleven Labs released a free Jarvis-like agent with MCP
Open-source alternative to Cursor Background Agents, Jules, and Codex

& so much more!

Read time: 3 mins

AI Tutorial

Legal document analysis is a fascinating but complex domain in which a team of legal experts traditionally works together to understand and interpret dense legal materials. Each team member brings a unique specialization, from contract analysts who dissect terms and conditions to strategists who develop comprehensive legal approaches.

But what if we could replicate this collaborative expertise using AI? Imagine multiple AI agents working together as a coordinated legal team in which, just like their human counterparts, each agent specializes in a specific area of legal analysis.

In this tutorial, we'll bring this vision to life by creating a multi-agent AI legal team using OpenAI's GPT-4o, Agno, and Qdrant vector database. You'll build an AI application that mirrors a full-service legal team, where specialized AI agents collaborate just like their human counterparts - researching legal documents, analyzing contracts, and developing legal strategies - all working in concert to provide comprehensive legal insights.

We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Build an AI Legal Team run by AI Agents

Fully functional multi-agent app using GPT-4o built with Agno, formerly Phidata (step-by-step instructions)

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads) to support us!

Latest Developments

Specialized RAG Agents with SOTA Accuracy Out of the Box 🔎🎯

RAG systems that just retrieve and generate are so 2023 - what developers actually need are RAG agents that can reason, plan, and make intelligent decisions about how to tackle complex knowledge tasks.

Contextual AI gives a complete platform for building specialized RAG agents that go way beyond simple question-answering to handle multi-step reasoning, SQL query generation, and complex enterprise workflows.

Their RAG 2.0 architecture jointly optimizes all components as a unified system, while their agents can actively decide whether to provide standard answers, decline when information isn't available, or execute structured queries when working with databases.

Key Highlights:

Agentic pipeline - RAG agents that plan multi-step approaches, make intelligent decisions about retrieval strategies, and generate SQL queries or decline to respond based on the available information, rather than just retrieving and generating.
Knowledge-intensive work - Built specifically for complex enterprise tasks like technical research and investment analysis where general-purpose agents typically fail, with tuning tools to adapt agents for domain-specific workflows.
Advanced reasoning - Agents perform multihop retrieval and test-time reasoning, iteratively searching for additional context while reasoning over multimodal content including text, images, charts, and structured data.
Enterprise-ready - Enterprise security controls, precise source attribution with tight bounding boxes, continuous data ingestion pipelines, and flexible deployment options designed for scaling specialized agents in regulated environments.
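
To make the "answer, decline, or run a structured query" idea concrete, here is a minimal sketch of that three-way decision. It is not Contextual AI's API: the toy corpus `DOCS`, the keyword list `STRUCTURED_HINTS`, the `tickets` table, and the word-overlap retriever are all assumptions made up for illustration, and a real agent would use a model for each step.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    action: str   # "answer", "sql", or "decline"
    payload: str  # source text, SQL string, or decline message


# Hypothetical toy corpus standing in for an enterprise knowledge base.
DOCS = {
    "refund-policy": "Refunds are issued within 14 days of purchase.",
    "sla": "Support responds to priority tickets within 4 hours.",
}

# Hypothetical cue words suggesting the question needs a database query.
STRUCTURED_HINTS = ("how many", "total", "average", "count")


def retrieve(question: str) -> list[str]:
    """Return documents that share at least two words with the question."""
    q_words = set(question.lower().replace("?", "").split())
    hits = []
    for text in DOCS.values():
        overlap = q_words & set(text.lower().replace(".", "").split())
        if len(overlap) >= 2:
            hits.append(text)
    return hits


def route(question: str) -> Decision:
    """Pick one of three actions: structured query, grounded answer, or decline."""
    q = question.lower()
    if any(hint in q for hint in STRUCTURED_HINTS):
        # Placeholder SQL against an assumed `tickets` table.
        return Decision("sql", "SELECT COUNT(*) FROM tickets;")
    hits = retrieve(question)
    if hits:
        return Decision("answer", hits[0])
    return Decision("decline", "No supporting source found.")
```

Calling `route("What is the capital of France?")` falls through to the decline branch because nothing in the toy corpus supports an answer, which is the behavior the highlight describes.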

Stop Asking AI Questions, and Start Building Personal AI Software.

Feeling overwhelmed by AI options or stuck on basic prompts? The AI Fast Track is your 5-day roadmap to solving problems faster with next-level artificial intelligence.

This free email course cuts through the noise with practical knowledge and real-world examples delivered daily. You'll go from learning essential foundations to writing effective prompts, building powerful Artifacts, creating a personal AI assistant, and developing working software—all without coding.

Join thousands who've transformed their workflows and future-proofed their AI skills in just one week.

Get the course

Agentic Deep Research Runs 200+ Searches So You Don’t Have To 🕵️‍♀️ 🌐 📑

China just proved that copying homework isn't always the best strategy in AI. While Google, OpenAI, and Perplexity rushed identical "Deep Research" products to market, Moonshot AI built something different: Kimi-Researcher, an agentic model that thinks through 23 reasoning steps and explores 200+ URLs per task.

The technical approach couldn't be more different - pure end-to-end reinforcement learning instead of prompt-engineered workflows.

The results speak for themselves: 26.9% on Humanity's Last Exam, beating OpenAI's Deep Research and matching Google's best efforts, all while demonstrating genuine emergent behaviors like cross-referencing conflicting sources and iterative hypothesis refinement.

Key Highlights:

Thinking model with tools - Kimi-Researcher is an agentic thinking model that can do multi-step planning, reasoning, and tool use. It uses 3 main tools: a parallel, real-time internal search tool; a text-based browser tool for web tasks; and a coding tool for code execution.
Pure reinforcement learning - Kimi-Researcher was trained entirely through end-to-end agentic reinforcement learning. Starting from just 8.6% accuracy on Humanity's Last Exam, it reached 26.9% purely through trial-and-error learning, with the agent exploring strategies and receiving rewards for correct solutions across the full trajectory.
Deep research in action - The agent averages 23 reasoning steps per task and explores over 200 URLs, demonstrating emergent behaviors like resolving conflicting information across multiple sources and cross-referencing different versions of texts.
Open source soon - Moonshot AI plans to open-source both the base pretrained model and the reinforcement learning-trained version in the coming months. Beta access is rolling out at kimi.com via a waitlist.
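
The highlights above describe two ideas: an agent that calls tools step by step, and a reward given for a correct final result over the whole trajectory. Here is a rough sketch of both, under heavy assumptions. The three tools are stubs, the plan is scripted rather than chosen by a model, and `max_steps` and the 0/1 reward are illustrative choices, not Moonshot AI's training setup.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]               # (tool name, tool argument)
Record = Tuple[str, str, str]        # (tool name, argument, observation)


# Stub tools standing in for search, a text browser, and code execution.
def search(query: str) -> str:
    return f"results for: {query}"


def browse(url: str) -> str:
    return f"text of {url}"


def run_code(expr: str) -> str:
    # Toy "code execution": adds integers written as "a+b+c".
    return str(sum(int(part) for part in expr.split("+")))


TOOLS: Dict[str, Callable[[str], str]] = {
    "search": search,
    "browse": browse,
    "code": run_code,
}


def run_episode(plan: List[Step], max_steps: int = 30) -> List[Record]:
    """Execute tool calls in order, stopping at a hypothetical step budget."""
    trajectory: List[Record] = []
    for tool, arg in plan[:max_steps]:
        observation = TOOLS[tool](arg)
        trajectory.append((tool, arg, observation))
    return trajectory


def outcome_reward(trajectory: List[Record], expected: str) -> float:
    """Return 1.0 if the last observation matches the expected answer, else 0.0."""
    if not trajectory:
        return 0.0
    return 1.0 if trajectory[-1][2] == expected else 0.0
```

In a trained agent, the model itself would pick each tool call from the history so far, and the single end-of-episode reward is what nudges every earlier step toward strategies that end in correct answers.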

Quick Bites

Nobody saw this coming - Google just released their first open-weight real-time music generation model, letting anyone generate AI music in real-time on free Colab TPUs. Magenta RealTime is an 800M parameter model that generates music in 2-second chunks with 1.6x real-time speed, allowing players to morph styles and instruments on the fly through controllable embeddings.

Do watch the demo to understand how beautifully the model adapts to changes in real-time.
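
A quick back-of-the-envelope check on the Magenta RealTime numbers above: if "1.6x real-time" means audio is produced 1.6 times faster than it plays back (our reading of the figure), each 2-second chunk takes about 1.25 seconds to generate. The function name below is just for illustration.

```python
def chunk_timing(chunk_seconds: float, real_time_factor: float) -> tuple[float, float]:
    """Return (generation time, spare time) in seconds for one audio chunk."""
    generation = chunk_seconds / real_time_factor
    headroom = chunk_seconds - generation
    return generation, headroom


generation, headroom = chunk_timing(chunk_seconds=2.0, real_time_factor=1.6)
print(f"generate: {generation:.2f}s, headroom: {headroom:.2f}s")
# prints "generate: 1.25s, headroom: 0.75s"
```

That 0.75 seconds of slack per chunk is what leaves room to react to style changes without the audio falling behind.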

What if you could get vLLM's performance in just 1,200 lines of readable Python code? A DeepSeek researcher built nano-vLLM as a personal project that does exactly that. This lightweight implementation ditches the complexity of production frameworks but keeps the performance optimizations that matter: prefix caching, tensor parallelism, and CUDA graphs. It's designed for researchers who want to understand how modern inference engines actually work without the usual bloat.
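
If prefix caching is new to you, here is a toy sketch of the general idea: split the prompt into fixed-size token blocks, give each block a key that depends on everything before it, and reuse any leading blocks already seen. This is not nano-vLLM's code. `BLOCK_SIZE`, the `PrefixCache` class, and the string placeholder for cached attention state are assumptions for illustration only.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # hypothetical number of tokens per cache block


def block_keys(tokens: List[int]) -> List[str]:
    """Chain-hash each full block so its key depends on all earlier tokens."""
    keys: List[str] = []
    prev = ""
    full = len(tokens) - len(tokens) % BLOCK_SIZE  # ignore a trailing partial block
    for start in range(0, full, BLOCK_SIZE):
        block = tokens[start:start + BLOCK_SIZE]
        data = (prev + ":" + ",".join(map(str, block))).encode()
        prev = hashlib.sha256(data).hexdigest()
        keys.append(prev)
    return keys


class PrefixCache:
    def __init__(self) -> None:
        # key -> placeholder for the cached key/value tensors of that block
        self.blocks: Dict[str, str] = {}

    def admit(self, tokens: List[int]) -> int:
        """Return how many leading tokens were reused, and cache the new blocks."""
        reused_blocks = 0
        still_prefix = True
        for key in block_keys(tokens):
            if still_prefix and key in self.blocks:
                reused_blocks += 1
            else:
                still_prefix = False
                self.blocks[key] = "kv-placeholder"
        return reused_blocks * BLOCK_SIZE
```

Two requests that share the same system prompt would hit the same leading block keys, so the second request only needs fresh computation for the tokens after the shared prefix.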

This isn’t your average voice assistant - it’s more like having Jarvis at your desk. Eleven Labs just released 11ai, a voice-first AI agent that connects to your daily tools through MCP integration, letting you research prospects with Perplexity, manage Linear tickets, and update Notion - all through natural voice commands. You can choose from 5,000+ voices and connect custom MCP servers to build workflows that actually get things done. It’s free for a few weeks!

Tools of the Trade

EchoStream: Local AI agent that runs directly on your iPhone. It combines web reading, OCR, audio transcription, news curation, and memory search capabilities through a built-in thinking model that connects new information with your stored data.
Cairn: Simple opensource background-agent system. Think Codex, Jules, or Cursor Background Agents. You can run Cairn locally, connect it to your repos, use your favorite LLM, and execute full-stack tasks, completely in the background.
Octocode: Code indexer that builds semantic knowledge graphs of codebases for deep code understanding. It offers features like code relationships mapping, smart commits, and integration with development tools via MCP server.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, MCP, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.

Hot Takes

sf is twitter
new york is linkedin
seattle is bluesky
~ Vikhyat K

why tf does a gpt wrapper need millions in funding? your entire tech stack is three API calls and cursor writes all your code
~ Amogh

That’s all for today! See you tomorrow with more such AI-filled content.

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉