Context Mode for Claude Code

+ Run LLMs at home, use them everywhere

Today’s top AI Highlights:


Read time: 3 mins

AI Tutorial

Six AI agents run my entire life while I sleep.

Not a demo. Not a weekend project.

A real team that works 24/7, making sure I'm never behind. Research done. Content drafted. Code reviewed. Newsletter ready. By the time I open Telegram in the morning, they've already put in a full shift.

By the end of this, you will understand exactly how to build an autonomous AI agent team that runs while you sleep.

We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads) to support us!

Latest Developments

Your AI agent's context window has two enemies: the tool definitions going in, and the raw data coming out.

Cloudflare's Code Mode showed us how to compress thousands of MCP tool definitions down to near-nothing, but that fixed only half the problem.

Now, Context Mode, an open-source MCP server, tackles the other half. Every Playwright snapshot, GitHub issue list, or access log that an MCP tool returns eats into your agent's working memory.

Context Mode sits between your agent and these tool outputs, routing raw data through isolated subprocesses so only compact, essential results enter the conversation.

It shrank 315 KB of raw MCP output to 5.4 KB; that’s a 98% reduction!
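The routing idea above can be sketched in a few lines. This is a hypothetical illustration, not Context Mode's actual implementation: the raw tool output goes to a short-lived subprocess that runs a small reduction script, and only the digest re-enters the conversation.

```python
import json
import subprocess
import sys
import textwrap

def route_through_sandbox(raw_output: str, reducer: str) -> str:
    """Run `reducer` (a small Python script) over the raw tool output
    in a separate process; only its stdout re-enters the context."""
    proc = subprocess.run(
        [sys.executable, "-c", reducer],
        input=raw_output,
        capture_output=True,
        text=True,
        timeout=30,
    )
    return proc.stdout.strip()

# Hypothetical example: a bulky GitHub-issues payload reduced to titles only.
raw = json.dumps([{"title": f"Issue {i}", "body": "x" * 1000} for i in range(50)])

reducer = textwrap.dedent("""
    import json, sys
    issues = json.load(sys.stdin)
    print("\\n".join(i["title"] for i in issues[:5]))
""")

digest = route_through_sandbox(raw, reducer)
print(f"{len(raw)} bytes in, {len(digest)} bytes out")
```

The agent's context only ever sees the few dozen bytes of digest; the 50 KB payload lives and dies inside the subprocess.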

Key Highlights:

  1. 98% reduction on outputs — Context Mode compresses raw tool responses (Playwright snapshots, GitHub issues, logs) from 315 KB down to 5.4 KB across a full session, extending usable session time from ~30 minutes to ~3 hours.

  2. 10 runtime languages supported — The sandbox runs JavaScript, TypeScript, Python, Shell, Ruby, Go, Rust, PHP, Perl, and R, with Bun auto-detected for faster JS/TS execution.

  3. Built-in knowledge base — A fetch_and_index tool lets you index URLs or docs into a SQLite FTS5 store with BM25 ranking and Porter stemming, so raw pages never enter context either.

  4. Drop-in installation — Install via the Claude Code plugin marketplace or a single npx command, with a PreToolUse hook that auto-routes tool outputs through the sandbox without changing your workflow.
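The knowledge-base piece in highlight 3 maps onto plain SQLite features. Here's a minimal sketch of the same idea, an FTS5 table with the Porter tokenizer queried with bm25 ranking, assuming nothing about Context Mode's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table with Porter stemming, as the highlight describes.
conn.execute(
    "CREATE VIRTUAL TABLE docs USING fts5(url, body, tokenize='porter')"
)
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [
        ("https://example.com/a", "Agents route tool outputs through sandboxes."),
        ("https://example.com/b", "Routing compresses context for the agent."),
    ],
)
# bm25() scores relevance (smaller is better in SQLite's convention);
# 'routing' also matches 'route' thanks to the Porter stemmer.
rows = conn.execute(
    "SELECT url FROM docs WHERE docs MATCH 'routing' ORDER BY bm25(docs)"
).fetchall()
print(rows)
```

Both documents come back because stemming folds "route" and "routing" to the same token, which is exactly why raw pages never need to enter context: the agent queries the index instead.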

Last December, 150,000+ developers joined the first season of Advent of Agents, going from building their first agent to deploying it in production.

Season 2: Spring Edition officially kicks off this Sunday, March 1st!

It follows the same "Always Kata" rule: actionable skills, zero fluff, deployed in under 5 minutes. Every day for the next 31 days, it’ll unlock a new piece of the puzzle to move you from "Hello World" to deploying multi-agent teams in production.

🗓️ What's on the menu for March:

Week 1: Foundations & The Stack – Choose your language (Python, Go, Java, or TypeScript) and master building AI Agents with the ADK.

Week 2: Multi-Agent Design Patterns – Move beyond single agents. Connect tools, give memory to your agents, and learn about multi-agent patterns with real-world examples.

Week 3: Advanced Capabilities – Dive into real-time voice with the Live API, self-healing "Reflect & Retry" agents, and time-traveling state management.

Week 4: Enterprise Production – Harden the stack with Model Armor and CI-grade evals, and make your agents "unkillable" with durable execution.

👉 Bookmark the Hub: adventofagents.com

Let’s build something amazing this spring!

Most coding agents ship with a hundred features you didn't ask for and zero ways to build the ones you actually need.

Pi, an open-source terminal coding agent, ships with just four tools (read, write, edit, bash) and a ~300-word system prompt, then hands you the keys to build everything else yourself.

The idea is simple: instead of baking in sub-agents, plan modes, MCP support, and permission popups, Pi gives you TypeScript extensions, skills, prompt templates, and a package system to construct exactly the workflow you want.

It supports 15+ LLM providers out of the box, lets you switch models mid-session, and more. The project already has integrations like OpenClaw (a Slack/Telegram bot built on Pi's SDK). It even runs Doom inside the terminal, because of course it does.

Key Highlights:

  1. Aggressively Extensible — Extensions are full TypeScript modules with access to tools, commands, keyboard shortcuts, events, and the TUI — letting you build sub-agents, plan modes, permission gates, SSH execution, or anything else you need without forking the core.

  2. Tree-Structured Sessions — Sessions are stored as trees, so you can branch off to fix something, then rewind back to your main line of work without wasting context.

  3. Context Engineering Built In — AGENTS.md for project instructions, SYSTEM.md for custom system prompts, on-demand skills for progressive disclosure, and extension hooks that let you inject messages, filter history, or implement RAG before every turn.

  4. Package Ecosystem — Bundle and share extensions, skills, prompts, and themes as installable packages via npm or git — install with pi install, pin versions, and discover community packages on npm or Discord.
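The tree-structured sessions in highlight 2 are easy to picture with a toy model (hypothetical, not Pi's actual storage format): each message points at its parent, so branching means starting a new child from an older node, and rewinding means moving the head back up the tree.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str
    parent: "Node | None" = None
    children: list["Node"] = field(default_factory=list)

class SessionTree:
    def __init__(self) -> None:
        self.root = Node("system prompt")
        self.head = self.root  # current position in the conversation

    def say(self, text: str) -> Node:
        node = Node(text, parent=self.head)
        self.head.children.append(node)
        self.head = node
        return node

    def rewind(self, node: Node) -> None:
        # Jump back to an earlier node; the detour stays stored as a branch.
        self.head = node

    def context(self) -> list[str]:
        # Only the path from root to head enters the model's context.
        path, node = [], self.head
        while node is not None:
            path.append(node.text)
            node = node.parent
        return list(reversed(path))

s = SessionTree()
main = s.say("refactor the parser")
s.say("side quest: fix a flaky test")  # branch off
s.rewind(main)                         # back to the main line of work
s.say("continue the refactor")
print(s.context())
```

The side quest stays in the tree but drops out of the active context, which is the "without wasting context" payoff the highlight describes.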

Quick Bites

Switch to Claude with your Baggage in one prompt
Anthropic just made it dead simple to switch to Claude from other AI providers. You're no longer locked into one provider just because your chat history lives there: copy a prompt into your current AI to export your preferences and context, paste the results into Claude's memory settings, and you're good to go. Works across all paid plans.

Nano Banana 2 brings Pro-level image gen to Flash
Google just dropped Nano Banana 2 (aka Gemini 3.1 Flash Image), essentially packing Nano Banana Pro's smarts into Flash-level speed. You get the advanced world knowledge, subject consistency across up to 5 characters, text rendering, 4K output, and image-search grounding, but way faster. It's rolling out across the Gemini app, Search AI Mode, AI Studio, Vertex AI, Flow, and Google Ads.

Survival of the fittest, but for code
New open-source drop from Imbue: Darwinian Evolver. It's a tool that uses LLMs to evolve and optimize code the way nature evolves organisms, but with targeted mutations instead of random ones. It maintains a population of code candidates, scores their fitness, and iterates through intelligent mutations to improve performance end-to-end. They've already used it to more than double a model's reasoning performance on ARC-AGI tasks. The repo is live on GitHub if you want to plug in your own optimization problems.
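The loop Imbue describes is the classic evolutionary pattern. A minimal sketch follows, with a stubbed-out mutation step standing in for the LLM; names and details are illustrative, not Darwinian Evolver's API:

```python
import random

def fitness(candidate: list[int]) -> int:
    # Toy objective: maximize the sum. A real fitness function would run
    # the candidate code against a benchmark and score the result.
    return sum(candidate)

def mutate(candidate: list[int]) -> list[int]:
    # Stand-in for the LLM's targeted mutation: tweak one gene upward.
    child = candidate.copy()
    child[random.randrange(len(child))] += 1
    return child

def evolve(population: list[list[int]], generations: int, keep: int = 4):
    for _ in range(generations):
        # Score, keep the fittest, refill the population with mutants.
        population.sort(key=fitness, reverse=True)
        survivors = population[:keep]
        population = survivors + [
            mutate(random.choice(survivors))
            for _ in range(len(population) - keep)
        ]
    return max(population, key=fitness)

random.seed(0)
pop = [[0] * 5 for _ in range(8)]
best = evolve(pop, generations=20)
print(best, fitness(best))
```

Swap the toy fitness function for a benchmark harness and the mutation stub for an LLM that proposes targeted code edits, and you have the end-to-end shape of the approach.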

Open-weight Moonshine STT models beat Whisper Large v3
Moonshine AI, a six-person startup, just released open-weight Moonshine Voice STT models built for real-time, on-device use across Python, iOS, Android, Raspberry Pi, and more. Their 200M-parameter streaming model beats Whisper Large v3 on word-error rate (with 7.5x fewer parameters). Supports English, Spanish, Mandarin, Japanese, Korean, and several other languages, with built-in voice activity detection, speaker diarization, and a neat intent recognition feature for voice commands.

Tailscale + LMStudio: Run LLMs at home, use them everywhere
Tailscale and LM Studio just teamed up on LM Link, which lets you access open-weight LLMs running on your beefy home GPU from any of your other devices, as if the models were local. Everything flows over end-to-end encrypted connections via Tailscale's network, with zero public internet exposure. Great for tinkerers with a powerful rig at home!

Free 6-month Claude Max for open-source maintainers
Anthropic just launched "Claude for Open Source," a program giving open-source maintainers and contributors 6 months of free Claude Max (20x tier). If you maintain a public repo with 5,000+ GitHub stars or 1M+ monthly NPM downloads, you're eligible to apply. Though they say if you maintain something the ecosystem quietly depends on, apply anyway. Up to 10,000 spots available.

Tools of the Trade

  1. Agent Relay - A real-time agent-to-agent communication SDK that lets AI agents from different providers (Claude, Codex, etc.) talk to each other as peers. It uses a file-based messaging protocol so agents can collaborate, negotiate, and iterate on tasks.

  2. Mission Control - Open-source Next.js dashboard for managing fleets of AI agents, tracking their tasks, sessions, token costs, and lifecycle from a single interface. It includes a Kanban task board, real-time monitoring, RBAC, and webhook integrations.

  3. Hugging Face Skills - A collection of standardized instruction packages that teach coding agents how to perform HF-specific tasks like training models, creating datasets, running evaluations, and publishing papers. Each skill is a self-contained folder with a SKILL.md file that the agent loads on demand.

  4. Awesome LLM Apps - A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, Google, and open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
    (Now accepting GitHub sponsorships)

Hot Takes

  1. The MCP hype is over. CLI Is All You Need

    ~ Marco Franzon

  2. Amjad showed me Replit's latest stuff. They're about to redefine vibe coding in a way that will seem obvious in retrospect. A lot of the biggest ideas have that quality.

    ~ Paul Graham

That’s all for today! See you tomorrow with more such AI-filled content.

Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!

Unwind AI - X | LinkedIn | Threads

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
