Slack for AI Employees

+ OpenClaw ClawSweeper for PRs and Issues

Today’s top AI Highlights:

& so much more!

Read time: 3 mins

AI Tutorial

I’ve been running a bunch of agents every day for months.

The problem is that I was the one constantly tweaking and learning while they just held onto context.

To test this, I put the same Monica agent on Hermes at the same time.

She started creating her own playbook from my edits and keeps getting better without me.

In this blog, I detailed how I did it. You’ll also understand the difference between an agent you manage and one that actually grows with you.

We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads) to support us!

Latest Developments

This is Slack reimagined, where your AI employees communicate with each other and have a shared memory.

WUPHF is an open-source working version of this with a very fun structure: a multi-agent "office" where a CEO, PM, engineers, designer, and CRO coordinate with each other and ship work.

One command (npx wuphf) opens the office in your browser, complete with a #general channel and the team already inside, claiming tasks.

Underneath it all sits a Karpathy-style wiki: plain markdown + git for storage, BM25 + SQLite for search, and that's it! Agents draft in private notebooks, the durable stuff gets promoted to a shared team wiki, and a "Pam the Archivist" git identity signs every wiki commit.

It's MIT, self-hosted, BYO API keys, and the whole brain lives locally.

Key Highlights:

  1. One command launches a tmux/web office with a CEO, PM, engineers, designer, CMO, and CRO already inside a shared channel.

  2. The shared brain is the Karpathy markdown wiki: per-agent notebooks for rough drafts, with reviewed entries promoted to the team wiki.

  3. Fresh sessions per turn keep input flat at ~87k tokens with a 97% cache hit rate. Accumulated-session orchestrators climb to ~484k over the same window.

  4. Mix runtimes freely - Claude Code, Codex, and OpenClaw agents can share the same office and the same wiki.
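For the curious, the "BM25 + SQLite" search layer can be sketched in a few lines of stdlib Python using SQLite's FTS5 extension, which ships with a built-in bm25() ranking function. The table and column names below are our own invention for illustration, not WUPHF's actual schema:

```python
import sqlite3

def build_wiki_index(entries):
    """Index markdown wiki entries (path, body) in an in-memory FTS5 table.

    Illustrative sketch of a 'BM25 + SQLite' search layer; WUPHF's real
    schema is not documented in the announcement.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE wiki USING fts5(path, body)")
    db.executemany("INSERT INTO wiki (path, body) VALUES (?, ?)", entries)
    return db

def search(db, query, limit=3):
    # bm25() is FTS5's built-in ranking function; lower scores rank better,
    # so a plain ascending ORDER BY returns best matches first.
    rows = db.execute(
        "SELECT path FROM wiki WHERE wiki MATCH ? ORDER BY bm25(wiki) LIMIT ?",
        (query, limit),
    )
    return [path for (path,) in rows]
```

The appeal of this design is that the "brain" stays greppable: the canonical data is markdown files in git, and the SQLite index is a disposable cache you can rebuild from the files at any time.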

Mistral Speech gives you three open-weight models in one coherent stack: batch transcription, live STT, and natural voice synthesis. Build and deploy a full audio pipeline via API, or self-host on-prem and at the edge.

Key Highlights:

  1. Voxtral Mini Transcribe 2 - ~4% word error rate on FLEURS at $0.003/min. Speaker diarization, context biasing for up to 100 terms, and recordings up to 3 hours in a single request.

  2. Voxtral Realtime - Live transcription with latency configurable down to sub-200ms. At 480ms, within 1-2% WER: the voice agent sweet spot. Runs on phones, laptops, and smartwatches.

  3. Voxtral TTS - 70ms model latency, ~9.7x real-time factor, voice cloning from 3 seconds of audio across 9 languages. Emotion-steering built in.

  4. Deployable anywhere - On-prem, private cloud, serverless, or Mistral Compute. GDPR and HIPAA-compliant.

4,000 GitHub issues closed in a single day, by a bot, not a burned-out maintainer.

The creator of OpenClaw just shipped ClawSweeper, an automated maintenance bot that runs 50 Codex instances in parallel to scan, triage, and close stale or already-resolved issues and PRs across the OpenClaw repo.

Here's the context: OpenClaw's explosive growth left the repo sitting on 13,500+ open items. That’s an impossible backlog for any human team. ClawSweeper tackles this by running parallel review shards powered by GPT-5.4, each checking out the repo at main and deeply analyzing whether an issue has already been fixed, can't be reproduced, or is just too incoherent to act on. It only closes when the evidence is strong, and maintainer-authored items are never touched.

Key Highlights:

  1. ClawSweeper only closes for five specific reasons: already implemented, unreproducible, belongs as a skill/plugin, incoherent, or stale past 60 days. Everything else stays open, untouched.

  2. Codex runs without GitHub write tokens and against a read-only checkout. If it leaves even a single untracked file behind, the review fails.

  3. Every proposed close includes a hash of the issue's state at review time. If anything changes between proposal and apply, the closure is automatically skipped.

  4. A planner scans all open items and prioritizes by activity: active issues first, then PRs, then recent issues, then older weekly reviews. The most relevant work gets handled first.
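The state-hash guard in highlight #3 is a classic optimistic-concurrency check: record a fingerprint of what you reviewed, and refuse to act if the fingerprint no longer matches at apply time. A minimal sketch (the field names are hypothetical; the announcement only says a hash of the issue's state is recorded at review time):

```python
import hashlib
import json

def issue_state_hash(issue: dict) -> str:
    """Fingerprint the fields that define an issue's reviewed state.

    Hypothetical field set for illustration; ClawSweeper's actual
    canonicalization is not documented here.
    """
    canonical = json.dumps(
        {k: issue.get(k) for k in ("title", "body", "labels", "updated_at")},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

def safe_to_close(issue_now: dict, hash_at_review: str) -> bool:
    # If anything changed between proposal and apply, skip the closure.
    return issue_state_hash(issue_now) == hash_at_review
```

Any edit to a hashed field (a new comment on the body, a label change, a fresh update timestamp) flips the comparison and the closure is silently skipped rather than applied against stale evidence.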

Quick Bites

Building agents that reach production systems with MCP
Anthropic just dropped a guide on wiring up agents to production systems via MCP. It leans into hard-won patterns from running 200+ MCP servers at scale. If you're building agents that need to talk to real infrastructure behind auth, this is the playbook. Some of our takeaways:

  • Don't mirror your API 1:1 into MCP tools. Group tools around what the agent is actually trying to do. One create_issue_from_thread beats four chained primitives every time.

  • If your service has hundreds of operations (think AWS, K8s), skip the mega-toolset. Expose a thin tool surface that accepts code: the agent writes a short script, your server runs it in a sandbox against your API, and only the result returns. Cloudflare's Code Mode-inspired MCP is a great example.

  • Instead of stuffing all tool definitions into context upfront, lazy-load them at runtime. Pair that with programmatic tool calling for another ~37% token reduction.

  • MCP gives agents access to tools; skills teach them how to use those tools well. Bundling both as plugins gives the best of both.
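The lazy-loading takeaway is easy to picture in code. The sketch below is our own toy registry, not Anthropic's implementation: only tool names go into context upfront, and a full definition is fetched (and cached) the first time the agent asks for it:

```python
class LazyToolRegistry:
    """Defer loading full tool definitions until first use.

    Illustrative sketch of the 'lazy-load tool definitions at runtime'
    pattern; loaders stand in for whatever fetches a tool's schema
    (a file, an MCP server's tools/list call, etc.).
    """

    def __init__(self, loaders):
        self._loaders = loaders  # tool name -> zero-arg callable
        self._cache = {}

    def list_names(self):
        # Cheap: only names are advertised to the model upfront,
        # not the full (and often large) JSON schemas.
        return sorted(self._loaders)

    def definition(self, name):
        # Expensive part runs once per tool, on first request.
        if name not in self._cache:
            self._cache[name] = self._loaders[name]()
        return self._cache[name]
```

The token savings come from the asymmetry: a name is a few tokens, while a full tool schema with parameter descriptions can run to hundreds, so deferring schemas for tools the agent never touches keeps the context lean.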

Open-source DeepSeek V4 is a bigger deal than R1
DeepSeek V4 is a 1.6T-param MoE model with 1M native context, MIT-licensed, scoring within a hair of Opus 4.6 on SWE-bench (80.6% vs 80.8%), and it costs $3.48 per million output tokens. For reference, GPT-5.4 charges $30. That's a different economic model entirely, and it's open-weight so you can self-host it. Add in that it was trained on Huawei Ascend 950 chips (not Nvidia). This is a serious inflection point: the best open-source model in the world now sits at ~95% of frontier performance for ~12% of the price.
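The ~12% figure follows directly from the two listed prices per million output tokens:

```python
def price_ratio(price_a: float, price_b: float) -> float:
    """Price of model A as a fraction of model B, same unit (per 1M output tokens)."""
    return price_a / price_b

# $3.48 vs $30 per million output tokens, as quoted above.
ratio = price_ratio(3.48, 30.0)  # 0.116, i.e. ~12%
```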

OpenAI open-sources a local-first PII redaction model
OpenAI dropped Privacy Filter, an open-weight 1.5B-parameter model (only ~50M active) that detects and redacts PII across eight categories (names, addresses, API keys, and the like) in a single pass with a 128k context window. It runs locally on a laptop or even in a browser, so your sensitive text never has to leave your machine, and it ships under Apache 2.0, so you can fine-tune and commercialize freely.
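To be clear, Privacy Filter does learned detection, not pattern matching. But the single-pass detect-and-replace interface such a redactor exposes can be illustrated with a toy regex version; the two patterns below are ours, cover just two of the eight categories, and stand in for the model's real output:

```python
import re

# Toy stand-ins for two PII categories; a learned model would detect
# these spans, but the replace-with-placeholder step looks the same.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}

def redact(text: str) -> str:
    """Replace each detected span with a [CATEGORY] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

The local-first point is the interesting part: because redaction happens before text leaves the machine, you can safely feed the redacted output to a hosted model afterwards.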

Tools of the Trade

  1. HyperFrames: Open-source framework by HeyGen that turns HTML into rendered video. It ships as a Skill you can add to Claude (Code or Design) that teaches it how to write video compositions in HTML. Just prompt what you want, Claude authors the scenes with proper timelines and animations, and you render to MP4 locally.

  2. Hermes Labyrinth: An observability plugin for Hermes Agent. It turns agent activity into a map of crossings: prompts, tool calls, tool results, failures, model switches, subagents, approvals, memory hits, redactions, context compression, cron runs, and reportable evidence.

  3. Claude Code Hook - Context Timeline: A monitoring hook that visualizes your main agent's context window and all spawned subagents as a live timeline from session start. It makes it way easier to track what's happening across parallel contexts than squinting at console output.

  4. Clicky: A free Mac menu bar app by Farza that lets you talk to AI and spin up agents that can build apps, do research, and interact with native Apple apps like Notes, Calendar, and Reminders. It's designed for consumers with zero setup. Just install and start talking.

  5. Awesome LLM Apps - A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, Google, and open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
    (Now accepting GitHub sponsorships)

That’s all for today! See you tomorrow with more such AI-filled content.

Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!

Unwind AI - X | LinkedIn | Threads

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
