
Claude Code now spins up 100s of parallel agents on one task

+ Apple drops a native AI framework for on-device AI agents

Today’s top AI Highlights:

  1. Apple Core AI Framework for on-device agents

  2. Printing Press: Print agent-native CLIs from a single prompt

  3. Claude Code Dynamic Workflows with massive parallelism

  4. Your job is to write agent loops now

  5. ChatGPT now "dreams" to build better memory

& so much more!

Read time: 3 mins

AI Tutorial

The frontend used to be a fixed thing. Designers drew it. Engineers built it. Users got what shipped.

That's over.

The interfaces shipping in 2026 are drawn partly by the agent itself, in real time, from what the user actually asked for. Ask for a table, get a table. Not a paragraph describing one.

Generative UI is the layer that lets agents stop describing and start showing.

This guide walks you through three patterns that have emerged for building it. The differences between them matter more than most teams realize.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads) to support us!

Latest Developments

Forget calling external APIs. Apple's new Core AI framework, announced at WWDC 2026, gives developers Swift-native access to Apple's on-device foundation models with tool calling, structured generation, and full Apple Intelligence integration.

This is the first time Apple has opened up its on-device models as a developer-facing framework. You write Swift, define tools, and the model runs locally on the device with zero cloud round-trips. Privacy-first by default, no API keys, no usage-based pricing, no latency from network calls. For anyone building iOS or macOS apps, this changes how you think about adding intelligence to your product.

The bigger picture: Apple also revealed that Apple Intelligence is now co-developed with Google using Gemini models under the hood, running both on-device and through Private Cloud Compute. A system orchestrator automatically coordinates AI features across apps.

Key Highlights:

  1. Tool calling in Swift: Define custom tools that the on-device model can invoke, enabling agentic workflows entirely on the user's device without a server.

  2. Structured generation: Get typed, schema-conforming outputs from the model, not just raw text. Build reliable features without post-processing hacks.

  3. Gemini under the hood: Apple Intelligence now runs on foundation models co-developed with Google, giving the platform multimodal capabilities, including image generation, visual Q&A, and speech generation.

  4. No cloud dependency: Models run locally. Your users' data stays on their devices. No API costs, rate limits, or cold starts.

  5. Available now: Core AI ships with iOS 27, macOS 27, and the latest Xcode. Documentation is live at developer.apple.com/documentation/coreai.

What if every app, API, and website your agent needs came as a purpose-built CLI with a local SQLite mirror, compound commands, and token-efficient output?

That's Printing Press by Matt Van Horn and Trevin Chow. Point it at an API spec, a website URL, or even a service with no public API, and it generates a Go CLI, a Claude Code skill, an OpenClaw skill, and an MCP server. All from one prompt.

Super interesting concept: a local SQLite mirror beats a remote API call. Compound commands beat ten round trips. An agent-native CLI beats raw HTTP. When you "print" an ESPN CLI, you don't get a thin wrapper around REST endpoints. You get live scores, series state, leading scorers, and injury news in one call, all queried from a local database that syncs incrementally. Same goes for Linear, Slack, Notion, or any of the 237+ CLIs in their Public Library.

Key Highlights:

  1. No API needed: For services without a public API, Printing Press launches a browser, captures traffic, reverse-engineers the endpoints, and generates the spec automatically. If you can click through it, the press can build a CLI.

  2. Local-first data layer: High-gravity resources get domain-specific SQLite tables with FTS5 full-text search and incremental sync. Queries run in milliseconds offline. Your agent never waits for a 429.

  3. 237+ community CLIs: The Public Library ships pre-built CLIs across 19 categories, from flight search to restaurant reservations to eBay auctions. Install with one command or let your agent browse and pick what it needs.

  4. Token-efficient by default: --compact mode cuts 60-80% of tokens. Auto-JSON when piped. Typed exit codes for agent self-correction. The CLI is built for agents first, humans second.

  5. Try it now: Install via Go, add the Claude Code or OpenClaw skills, and run /printing-press <app> inside your agent. Check it out at printingpress.dev.
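
To make the "local mirror plus compound command" idea concrete, here is a minimal sketch in Python. It is not Printing Press's actual code or schema (the real tool generates Go CLIs). The table names, the `fetch_remote` stub, and the sample rows are all assumptions made up for illustration. It shows two things: a sync step that pulls only rows changed since a stored cursor, and a single command that answers several questions from the local database as compact JSON.

```python
import json
import sqlite3

# Hypothetical stand-in for a remote API: (id, updated_at, team, score) rows.
REMOTE_ROWS = [
    (1, 100, "Hawks", 98),
    (2, 105, "Lakers", 101),
    (3, 110, "Celtics", 95),
]


def fetch_remote(since):
    """Pretend network call: return only rows changed after `since`."""
    return [row for row in REMOTE_ROWS if row[1] > since]


def open_mirror(path=":memory:"):
    """Create the local mirror tables (assumed schema, for illustration only)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS scores "
        "(id INTEGER PRIMARY KEY, updated_at INTEGER, team TEXT, score INTEGER)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS sync_state (key TEXT PRIMARY KEY, value INTEGER)"
    )
    return db


def sync(db):
    """Incremental sync: pull only rows newer than the stored cursor."""
    row = db.execute("SELECT value FROM sync_state WHERE key = 'cursor'").fetchone()
    cursor = row[0] if row else 0
    changed = fetch_remote(cursor)
    for r in changed:
        db.execute("INSERT OR REPLACE INTO scores VALUES (?, ?, ?, ?)", r)
    if changed:
        new_cursor = max(r[1] for r in changed)
        db.execute(
            "INSERT OR REPLACE INTO sync_state VALUES ('cursor', ?)", (new_cursor,)
        )
    db.commit()
    return len(changed)


def summary(db):
    """Compound command: several answers from local queries, as compact JSON."""
    leader = db.execute(
        "SELECT team, score FROM scores ORDER BY score DESC LIMIT 1"
    ).fetchone()
    count = db.execute("SELECT COUNT(*) FROM scores").fetchone()[0]
    return json.dumps(
        {"games": count, "leader": leader[0], "top": leader[1]},
        separators=(",", ":"),
    )
```

Calling `sync(db)` twice in a row fetches nothing the second time, which is the whole point: the agent reads from disk and only the delta ever crosses the network.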

Some problems are too big for one agent in one pass. A bug hunt across an entire service. A migration that touches hundreds of files. A plan you want stress-tested from every angle before committing.

Anthropic's Dynamic Workflows for Claude Code changes the math entirely. Claude writes a custom JavaScript orchestration script on the fly, fans work out across 100s of parallel subagents, has independent agents try to break each other's results, and keeps iterating until answers converge. The coordination lives in code, not context, so the plan stays on track no matter how big the task gets.

The proof of concept is wild: Jarred Sumner used Dynamic Workflows to port Bun from Zig to Rust. 750,000 lines of Rust, 99.8% test suite passing, eleven days from first commit to merge. Hundreds of agents worked in parallel with two reviewers on each file.

Key Highlights:

  1. Claude writes the orchestrator: No pre-built templates. Claude generates a bespoke JS script tailored to your specific task, then runs it. Every workflow is custom.

  2. Independent verification built in: Agents tackle the problem from different angles, other agents try to refute what they found, and the run iterates until results converge. This is how it catches things a single pass misses.

  3. Resumable and saveable: Progress is checkpointed. Interrupted jobs pick up where they left off. Save a workflow as a reusable /command for future sessions with structured input parameters.

  4. Token warning: These workflows consume meaningfully more tokens than a typical session. Start with a scoped task to get a feel for usage before throwing it at your whole codebase.

  5. Available now: Research preview on Max, Team, and Enterprise plans, plus the Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry. Requires Claude Code v2.1.154+. Turn on ultracode effort level or just ask Claude to "create a workflow."
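
The fan-out, verify, and iterate pattern described above can be sketched in a few lines. This is a toy in Python, not what Claude generates (the real orchestrators are JavaScript scripts written per task). The `worker` and `reviewer` functions are hypothetical stubs standing in for subagent calls, and the reviewer is rigged to reject one file once so the retry path is visible.

```python
import asyncio


async def worker(file_name):
    """Hypothetical subagent: stands in for a model call that handles one file."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"file": file_name, "result": file_name.upper()}


async def reviewer(item, attempt):
    """Hypothetical reviewer: tries to refute a result.

    For demonstration it rejects 'b.zig' on the first attempt only.
    """
    await asyncio.sleep(0)
    return not (item["file"] == "b.zig" and attempt == 0)


async def run_workflow(files, reviewers_per_file=2, max_rounds=3):
    accepted, pending = {}, list(files)
    for attempt in range(max_rounds):
        if not pending:
            break
        # Fan out: one worker per pending file, all running concurrently.
        results = await asyncio.gather(*(worker(f) for f in pending))
        retry = []
        for item in results:
            # Independent verification: every reviewer must accept the result.
            votes = await asyncio.gather(
                *(reviewer(item, attempt) for _ in range(reviewers_per_file))
            )
            if all(votes):
                accepted[item["file"]] = item["result"]
            else:
                retry.append(item["file"])
        pending = retry  # iterate only on what was refuted
    return accepted, pending
```

Running `asyncio.run(run_workflow(["a.zig", "b.zig", "c.zig"]))` accepts two files in the first round and the third on retry. The state lives in plain variables (`accepted`, `pending`) rather than in a model's context window, which is the "coordination lives in code" idea in miniature.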

Quick Bites

Don’t prompt your agent, design loops that prompt it (WTF!)
Peter Steinberger posted one line on Saturday that hit 6.3 million views: "You should be designing loops that prompt your agents." Boris Cherny, the creator of Claude Code, said the same thing a few days earlier: "I don't prompt Claude anymore. I have loops running. They're the ones prompting Claude." If all of this chatter left you wondering what the hell a loop even is, Matt Van Horn wrote the definitive explainer. He traces the concept all the way back to the 2022 ReAct paper, through Geoffrey Huntley's ralph loop, to today's multi-agent orchestration loops that run on cron and survive restarts. Worth the read.

Apple Intelligence, take two
Remember when Apple announced Apple Intelligence with ChatGPT as the backup brain at WWDC 2024? The "smart Siri" with personal context, on-screen awareness, etc.? Most of it never shipped, and apparently, "it wasn't good enough" and "didn't converge quality-wise." Two years later, they're trying again, this time with Google. The new Apple Intelligence, announced at WWDC yesterday, is co-developed with Google using Gemini as the foundation, not just a fallback. On-device and Private Cloud Compute, multimodal everything, and a conversational Siri that's getting its own standalone app.

ChatGPT now "dreams" to remember you better
OpenAI shipped Dreaming V3, and the name is apt. ChatGPT now runs a background memory synthesis process when you're not chatting, analyzing your conversation history and building a unified "Memory Summary" that stays current over time. The old "saved memories" approach (manually saying "remember this") is gone. Now it automatically captures context from natural conversation and updates temporal facts, so it knows your Singapore trip is in the past, not upcoming. Available on Plus and Pro now, and rolling out to free users soon, after a 5x compute reduction made it feasible at scale. The direction is clear: persistent, stateful AI assistants where memory is infrastructure, not a feature you toggle on.

Google Research tackles RAG's biggest problem
Standard RAG retrieves once and hopes for the best. Google Research's new "Agentic RAG" for their Gemini Enterprise Agent Platform retrieves, checks if it got enough, and goes back for more. The key innovation is a "Sufficient Context Agent" that inspects retrieved snippets, evaluates a draft response, identifies exactly what's missing, and sends targeted follow-up searches. It's a multi-agent pipeline: orchestrator, planner, query rewriter, search fanout, and synthesis. The result is a 34% accuracy improvement over standard RAG on factuality benchmarks with negligible latency overhead. The pattern is the real takeaway here, even if you're not on Google Cloud.
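
Here is a rough sketch of that retrieve, check, and go-back-for-more pattern, if you want to try it outside Google Cloud. Everything in it is an assumption for illustration: the toy `CORPUS`, the keyword `search`, and especially `missing_topics`, which stands in for the sufficiency check. In a real system a model would judge what the draft answer still lacks. Here the caller passes the required topics by hand.

```python
# Hypothetical corpus: each snippet is keyed by the topic it covers.
CORPUS = {
    "pricing": "The Pro plan costs $20 per month.",
    "limits": "The Pro plan allows 500 requests per day.",
    "support": "Pro customers get email support.",
}


def search(query):
    """Toy retriever: return snippets whose topic keyword appears in the query."""
    return [text for topic, text in CORPUS.items() if topic in query]


def missing_topics(required, snippets):
    """Stand-in for the sufficiency check: required topics with no snippet yet."""
    found = {topic for topic, text in CORPUS.items() if text in snippets}
    return [t for t in required if t not in found]


def agentic_retrieve(question, required, max_rounds=3):
    """Retrieve once, then issue targeted follow-ups until nothing is missing."""
    snippets = search(question)
    rounds = 1
    for _ in range(max_rounds - 1):
        gaps = missing_topics(required, snippets)
        if not gaps:
            break  # context is sufficient, stop searching
        for topic in gaps:
            # Targeted follow-up search for exactly what is missing.
            snippets += [s for s in search(topic) if s not in snippets]
        rounds += 1
    return snippets, rounds
```

A question that mentions only pricing but needs pricing and limits takes two rounds here. A question that is fully covered by the first search stops after one. The `max_rounds` cap is the design choice worth copying, since it bounds the added latency.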

Tools of the Trade

  1. Agent-Reach: Deploy one AI agent across Telegram, Discord, Slack, WhatsApp, Web, and CLI simultaneously from a single codebase. It normalizes messages into a consistent format per platform and auto-adapts responses. Ships with ready-made adapters for LangChain, CrewAI, and AutoGen.

  2. claude-howto: A visual, example-driven guide to Claude Code covering everything from basic setup to advanced workflows like multi-agent orchestration and custom skills. Super useful whether you're just starting with Claude Code or trying to level up.

  3. last30days-skill: An agent skill that synthesizes research across X, Reddit, HN, YouTube, TikTok, and GitHub from the last 30 days on any topic you give it. Matt Van Horn used it to write that viral loops article, running it against the word everyone was fighting about.

  4. agentcookie: Continuously syncs your browser cookies, bearer tokens, and API keys from your daily-driver Mac to a second Mac where your agents run, encrypted over Tailscale. Your agents wake up authenticated to every service you use, zero per-site login ceremony, no cloud middleman.

  5. Awesome LLM Apps (113k+ 🌟): A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, Google, and open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
    (Now accepting GitHub sponsorships)

That’s all for today! See you tomorrow with more such AI-filled content.


Unwind AI - X | LinkedIn | Threads

PS: We curate this AI newsletter every day for FREE. Your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉
