
Background Agents with OpenAI, Anthropic, Qwen, and Kimi models

PLUS: Alibaba's open-source multi-agent framework, Google's embedding model for on-device RAG

In partnership with 1440

Today’s top AI Highlights:

  • Capy.ai, an autonomous software engineer that ships features in parallel using OpenAI, Anthropic, Qwen, and Kimi models

  • AgentScope, Alibaba's open-source multi-agent framework

  • EmbeddingGemma, Google's embedding model for on-device RAG

& so much more!

Read time: 3 mins

AI Tutorial

We have created a complete Google Agent Development Kit crash course with 9 comprehensive tutorials!

This tutorial series takes you from zero to hero in building AI agents with Google's Agent Development Kit.

What's covered:

  • Starter Agent - Your first ADK agent with basic workflow

  • Model Agnostic - OpenAI and Anthropic integration patterns

  • Structured Output - Type-safe responses with Pydantic schemas

  • Tool Integration - Built-in tools, custom functions, LangChain, CrewAI, MCP

  • Memory Systems - Session management with in-memory and SQLite storage

  • Callbacks & Monitoring - Agent lifecycle, LLM interactions, tool execution tracking

  • Plugins - Cross-cutting concerns and global callback management

  • Multi-Agent Patterns - Sequential, loop, and parallel agent orchestration

Each tutorial includes explanations, working code examples, and step-by-step instructions.

Everything is 100% open-source.
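
To give a taste of what the course builds, here's a minimal sketch of a starter ADK agent in Python. It assumes `pip install google-adk` and a configured Gemini API key; the get_greeting tool is a hypothetical example, not code from the tutorials.

```python
# A minimal sketch of a starter ADK agent, assuming google-adk is installed
# and a Gemini API key is configured. The get_greeting tool is hypothetical.
from google.adk.agents import Agent

def get_greeting(name: str) -> dict:
    """Return a greeting for the given name (exposed to the agent as a tool)."""
    return {"greeting": f"Hello, {name}!"}

root_agent = Agent(
    name="starter_agent",
    model="gemini-2.0-flash",
    description="A minimal starter agent that greets users.",
    instruction="When the user shares their name, call get_greeting and reply.",
    tools=[get_greeting],
)
```

From the project directory, `adk run` or `adk web` lets you chat with the agent locally.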

We share hands-on tutorials like this every week, designed to help you level up your AI skills. If you're serious about staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads) to support us!

Latest Developments

An AI software engineer just raced three human developers to ship code and held its own.

It even added a Subway Surfers mini-game to its own UI while fixing bugs for the team.

This is Capy.ai, an autonomous software engineer that ships dozens of features in parallel. It works end-to-end - autonomously triaging issues, executing code in isolated VMs, and pushing PRs to GitHub.

Capy spins up isolated, secure virtual machines for each task, allowing it to work on multiple issues simultaneously. Instead of diving straight into the code, it first analyzes the codebase and asks clarifying questions to scope the work. This planning step lets it create highly specific, actionable tasks for its main coding agent to execute, making development more methodical.

Key Highlights:

  1. Parallel Task Execution - Capy handles multiple coding tasks at the same time. It runs each task in a secure, isolated Virtual Machine with its own dedicated development server and setup instructions.

  2. Intelligent Triage Agent - The platform uses a "Triage" agent that doesn't write code but instead plans the work. It reads your repository and asks clarifying questions to create well-defined tasks for the coding agent.

  3. Ground-Up Infrastructure - The team built its own system of agents and secure cloud infrastructure from scratch, rather than simply wrapping existing agent SDKs.

  4. Model Agnostic - You can assign different AI models, like those from OpenAI, Anthropic, Alibaba Qwen, and Kimi AI, to different tasks based on what's best for the job.

  5. Now Available - You can start using Capy.ai right now by heading to their website. Enterprise teams interested in custom plans and pricing can book a consultation directly on the site.

A fun fact - these guys paid $100,000 to acquire the “cutest domain in tech.”

Forget frameworks that hide the mechanics in opaque, high-level APIs. This toolkit gives you direct, transparent control over every agent interaction.

Alibaba just dropped AgentScope, a Python framework for building multi-agent AI applications with transparency as a first principle: you can see exactly what your agents are thinking, which APIs they're calling, and how they're making decisions.

You can build everything from simple chatbots to sophisticated workflows where agents collaborate, use MCP tools, and maintain long-term memory. The framework handles the heavy lifting of agent coordination, message passing between agents, and integration with different LLM providers.

The framework also ships with AgentScope Studio, a web interface where you can visually design agent workflows, monitor conversations in real-time, and debug your multi-agent systems.

Key Highlights:

  1. Multi-Agent Workflows - Build AI teams where specialized agents handle different aspects of complex tasks, communicating through structured messages and shared memory systems.

  2. Developer Transparency - Complete visibility into agent reasoning, tool usage, and decision-making processes with no hidden abstractions that could surprise you in production.

  3. Tool-Enabled Agents - Agents can execute Python code, run shell commands, access APIs, and use custom tools via native MCP integration.

  4. Visual Workflow Designer - Studio provides a web interface for building agent interactions, monitoring live conversations, and debugging multi-agent systems without deep technical setup.

  5. Production Runtime Framework - The team has also shipped AgentScope Runtime that provides secure sandboxed tool execution and scalable deployment infrastructure that works with any agent framework, not just AgentScope.
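
For a feel of the developer experience, here's a minimal sketch of a two-agent conversation loop based on AgentScope's earlier quickstart pattern; class names and the DashScope model config shown may differ in the newest release.

```python
# A minimal sketch based on AgentScope's quickstart pattern; exact class
# names and model configs may differ in the latest release.
import agentscope
from agentscope.agents import DialogAgent, UserAgent

agentscope.init(
    model_configs=[{
        "config_name": "qwen_config",    # referenced by name below
        "model_type": "dashscope_chat",  # Alibaba's DashScope backend
        "model_name": "qwen-max",
    }]
)

assistant = DialogAgent(
    name="assistant",
    model_config_name="qwen_config",
    sys_prompt="You are a helpful assistant.",
)
user = UserAgent()

# Messages pass explicitly between agents, so every exchange stays visible.
msg = None
while True:
    msg = assistant(msg)
    msg = user(msg)
    if msg.content == "exit":
        break
```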

The Daily Newsletter for Intellectually Curious Readers

Join over 4 million Americans who start their day with 1440 – your daily digest for unbiased, fact-centric news. From politics to sports, we cover it all by analyzing over 100 sources. Our concise, 5-minute read lands in your inbox each morning at no cost. Experience news without the noise; let 1440 help you make up your own mind. Sign up now and invite your friends and family to be part of the informed.

Quick Bites

Google’s new embedding model for on-device RAG
Google just dropped EmbeddingGemma, a new open embedding model designed specifically for on-device RAG applications and semantic search. It delivers private, high-quality embeddings that run anywhere, even offline. At just 308M parameters, it outperforms models twice its size on MTEB while sipping RAM at under 200MB. It's available now on Hugging Face, Kaggle, and Vertex AI with day-one support for popular frameworks like Ollama and transformers.js.
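
If you want to kick the tires, here's a minimal sketch of semantic search with EmbeddingGemma via sentence-transformers; the model ID below is the Hugging Face identifier as we understand it, so check the official model card before running.

```python
# A minimal sketch of semantic search with EmbeddingGemma, assuming
# sentence-transformers is installed and the Hugging Face model ID below
# is correct (verify against the official model card).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("google/embeddinggemma-300m")

docs = [
    "EmbeddingGemma runs on-device, even offline.",
    "RAG retrieves relevant chunks before generation.",
]
doc_embeddings = model.encode(docs)
query_embedding = model.encode("Which model works offline?")

# Cosine similarity ranks the documents for retrieval.
scores = model.similarity(query_embedding, doc_embeddings)
print(scores)
```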

The $610 million acquisition no one saw coming
Atlassian just bought The Browser Company in a $610 million all-cash deal, and Arc users are having a full meltdown about it. After The Browser Company alienated Arc's passionate fanbase by shifting focus to its agentic browser Dia, the Atlassian acquisition feels like salt in the wound. The pairing is also a strange one. Josh Miller wrote a lengthy, heartfelt blog about "winning" that rings hollow when your core community is already shopping for alternatives. That's an expensive way to lose your most loyal advocates.

Run Claude Code as a first-class citizen in Zed
Claude Code now runs natively in Zed through the Agent Client Protocol, joining Gemini CLI in adopting this open protocol. The integration runs Claude Code as a native agent within Zed's interface, letting you follow its multi-file edits in real time with syntax highlighting and review changes through granular diff views instead of scrolling through terminal output. Zed has open-sourced the Claude Code adapter under the Apache license.

Branch conversations in ChatGPT
OpenAI has rolled out conversation branching in ChatGPT, now live on the web for logged-in users. You can branch off from any point in a chat to explore different directions without losing the original thread. Perfect for testing side prompts or running parallel ideas while keeping your main conversation clean.

Get Perplexity Pro free for a year with PayPal
If you use PayPal, this one’s hard to ignore! Perplexity Pro is free for a whole year. Connect your account, add billing, and you’ll get 12 months at $0 before the standard $20/month rate kicks in.

Tools of the Trade

  1. Pocket Agent - Run Claude Code, OpenAI Codex, Cursor CLI, and other coding agents from your phone. It runs a local server on your machine that exposes your dev environment to mobile, supporting agent conversations, file editing, terminal sessions, and cloud background agents.

  2. Sourcerer - Don't waste LLM tokens reading entire files when you only need specific functions. This MCP server lets AI agents search code semantically and extract just the relevant chunks, using tree-sitter to parse your codebase into a searchable index (see the sketch after this list).

  3. SwiftAI - An open-source Swift library that gives a unified API for integrating LLMs into iOS/macOS apps, automatically switching between Apple's on-device models and cloud services based on availability. Comes with structured outputs, tool calling, and stateful sessions.

  4. Awesome LLM Apps - A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, and Google, as well as open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
    (Now accepting GitHub sponsorships)
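
As promised in the Sourcerer entry, here's a rough sketch of the tree-sitter chunk-extraction idea, not Sourcerer's actual implementation. It assumes `pip install tree-sitter tree-sitter-python`, and the file name and top-level-function chunking policy are our own hypothetical choices.

```python
# A rough sketch of tree-sitter-based chunk extraction, in the spirit of
# Sourcerer but not its actual implementation.
import tree_sitter_python as tspython
from tree_sitter import Language, Parser

PY_LANGUAGE = Language(tspython.language())
parser = Parser()
parser.language = PY_LANGUAGE

with open("example.py", "rb") as f:  # hypothetical file to index
    source = f.read()

tree = parser.parse(source)

# Treat each top-level function as one indexable chunk, so an agent can
# retrieve a single function instead of reading the whole file.
chunks = [
    source[node.start_byte:node.end_byte].decode()
    for node in tree.root_node.children
    if node.type == "function_definition"
]
print(f"Extracted {len(chunks)} function chunks")
```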

Hot Takes

  1. A chromium wrapper sold for $600M.

    A VSCode fork is valued at $10B.

    Yet you still want to build everything from scratch.

    Find the gap. ~ jonah


  2. I don’t get why people get annoyed at being called an “LLM wrapper.” Look around—you are an LLM wrapper. We all are. ~ Yuchen Jin

That’s all for today! See you tomorrow with more such AI-filled content.

Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!

Unwind AI - X | LinkedIn | Threads

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
