Gemma 4 Runs on Your Phone

Plus: Cursor is becoming Claude Code and Claude Code is becoming OpenClaw

Today’s top AI Highlights:

  1. Google’s Gemma 4 family goes fully open under Apache 2.0
  2. Cursor 3 rebuilds the editor around fleets of parallel agents
  3. Claude Code ships a flicker-free fullscreen terminal mode

& so much more!

Read time: 3 mins

AI Tutorial

Six AI agents run my entire life while I sleep.

Not a demo. Not a weekend project.

A real team that works 24/7, making sure I'm never behind. Research done. Content drafted. Code reviewed. Newsletter ready. By the time I open Telegram in the morning, they've already put in a full shift.

By the end of this, you will understand exactly how to build an autonomous AI agent team that runs while you sleep.

We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills, subscribe now and be the first to access our latest tutorials.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads) to support us!

Latest Developments

Google's open-source models just got really good at being small.

Gemma 4 ships in four sizes, from a 2B edge model that runs on a Raspberry Pi to a 31B dense model that ranks #3 on Arena AI's text leaderboard, and the whole family drops under Apache 2.0.

Built on the same research behind Gemini 3, every model is designed for agentic workflows from the ground up, with reasoning, function calling, and structured output built in. All models are natively multimodal (text, image, video, audio across sizes) and support up to 256K context windows.

This is Google making its strongest open-weight play yet!

Key Highlights:

  1. MoE that punches up: The 26B model activates just 3.8B parameters per forward pass but hits an Arena AI Elo of 1441, making it one of the most parameter-efficient reasoning models out there.

  2. Edge-Native: E2B and E4B are purpose-built for phones, IoT, and Raspberry Pi, running fully offline with near-zero latency and support for text, image, and audio input.

  3. Agentic by Default: Native function calling, structured JSON output, multi-step planning, configurable thinking modes, and even bounding box output for UI element detection, all baked in across every model size.

  4. Apache 2.0 License: A major shift from previous Gemma releases. No MAU limits, no usage restrictions.
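If you want to try the agentic features, structured output is the easiest entry point. Here's a minimal sketch that asks a Gemma 4 model for schema-constrained JSON through an OpenAI-compatible endpoint (e.g. a local vLLM or llama.cpp server); the base URL, port, and model id below are placeholders, not official values.

```python
# Minimal sketch: schema-constrained JSON from a locally served Gemma 4 model.
# Assumes an OpenAI-compatible server (e.g. vLLM or llama.cpp) on localhost;
# the base_url, port, and model id are placeholders, not official values.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="gemma-4-26b",  # hypothetical id; use whatever your server registered
    messages=[
        {"role": "user", "content": "Extract the event: 'Launch party, March 3, Berlin.'"}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "event",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "date": {"type": "string"},
                    "city": {"type": "string"},
                },
                "required": ["name", "date", "city"],
            },
        },
    },
)

print(json.loads(resp.choices[0].message.content))
```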

Forget managing one agent at a time. Cursor now lets you run a whole fleet.

With Cursor 3, the team has rebuilt the interface from scratch. It’s no longer just an IDE with AI bolted on, but a unified workspace where fleets of coding agents run in parallel across multiple repos.

The new layout puts all your agents (local, cloud, mobile, Slack, GitHub, Linear) in one sidebar, so you're not tab-hopping between terminals and tools to track what's happening.

It ships with Composer 2, their in-house frontier coding model, an integrated browser for testing, and a plugin marketplace with MCPs, skills, and subagents. You can always drop back into the full IDE when you need to go deeper.

  1. Parallel multi-repo agents: Run multiple local and cloud agents simultaneously across different repositories, all visible and manageable from a single sidebar.

  2. Local ↔ cloud handoff: Move any agent session between your machine and the cloud in seconds, so long-running tasks keep going even when your laptop is closed.

  3. Commit-to-PR workflow: A streamlined diffs view lets you edit, review, stage, commit, and open PRs without switching tools.

  4. Plugin marketplace: Browse and install hundreds of plugins (MCPs, skills, subagents) with one click, or set up a private team marketplace.

Claude Code now runs like vim: fullscreen, flicker-free, and with actual mouse support inside your terminal.

Anthropic just shipped an experimental NO_FLICKER rendering mode that virtualizes the entire terminal viewport, solving the age-old problem of screens jumping and flashing while your agent works.

The new renderer draws on the terminal's alternate screen buffer (the same trick vim and htop use), only painting messages currently visible on screen. This keeps memory and CPU usage flat no matter how long your conversation gets. It’s a big deal for those marathon coding sessions. It's especially noticeable in VS Code's integrated terminal, tmux, and iTerm2, where rendering throughput has always been the bottleneck.

Enable it with CLAUDE_CODE_NO_FLICKER=1 claude and you're in.
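If you've never seen the alternate-screen trick in isolation, here's a tiny standalone demo of the same ANSI mechanism (illustrative only, not Claude Code's actual renderer): switch to the second buffer, repaint in place, and get your scrollback back untouched on exit.

```python
# Standalone demo of the alternate screen buffer (the mechanism vim/htop use).
# Illustrative only; this is not Claude Code's renderer.
import sys
import time

ENTER_ALT = "\x1b[?1049h"     # switch to the alternate screen buffer
LEAVE_ALT = "\x1b[?1049l"     # switch back, restoring the original scrollback
CLEAR_HOME = "\x1b[2J\x1b[H"  # clear the alt screen and move the cursor home

sys.stdout.write(ENTER_ALT + CLEAR_HOME)
sys.stdout.flush()
try:
    for i in range(5):
        # Repaint in place: only what's visible gets drawn, nothing scrolls.
        sys.stdout.write(f"\x1b[Hframe {i + 1}/5 -- drawing on the alt screen\n")
        sys.stdout.flush()
        time.sleep(0.5)
finally:
    sys.stdout.write(LEAVE_ALT)  # original terminal contents reappear
    sys.stdout.flush()
```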

Quick Bites

Karpathy’s workflow for building self-improving wikis
Karpathy shared his current obsession: using LLMs to build personal knowledge wikis. The workflow is satisfyingly simple: dump raw research (papers, articles, repos) into a folder, let an LLM compile it into interconnected Markdown files with summaries and backlinks, and browse it all in Obsidian. The wiki compounds over time as queries generate new outputs (slides, charts, articles) that loop back in, and at ~400K words his setup handles complex Q&A without needing fancy RAG. Just well-maintained index files and LLM-driven "health checks" that lint for inconsistencies and suggest new threads to pull. Do give it a read!
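The "health check" part is easy to prototype yourself. Below is a minimal sketch that lints a folder of Markdown notes for broken Obsidian-style [[wikilinks]] and orphan pages; the wiki/ folder name is a made-up example, and this is just one way to approximate the idea.

```python
# Minimal wiki "health check" sketch: find broken [[wikilinks]] and orphan
# notes in a folder of Markdown files. Assumes Obsidian-style links; the
# wiki/ path is a hypothetical example.
import re
from pathlib import Path

WIKI = Path("wiki")  # hypothetical vault folder
LINK = re.compile(r"\[\[([^\]|#]+)")  # target of [[Target]] or [[Target|alias]]

notes = {p.stem: p for p in WIKI.rglob("*.md")}
linked = set()

for name, path in notes.items():
    for target in LINK.findall(path.read_text(encoding="utf-8")):
        target = target.strip()
        linked.add(target)
        if target not in notes:
            print(f"broken link in {name}.md -> [[{target}]]")

for orphan in sorted(set(notes) - linked):
    print(f"orphan note (nothing links to it): {orphan}.md")
```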

Google’s guide to building ADK agents with Skills
Google just published a hands-on ADK guide that builds up to the interesting part: agents that write their own skills at runtime using the agentskills.io spec. The generated output is cross-compatible with Gemini CLI, Claude Code, Cursor, and dozens more. Four clean patterns, well-explained, and genuinely useful if you're building agentic workflows. Really good read, do check it out.
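For context, a skill in this ecosystem is typically just a folder with a SKILL.md file: YAML frontmatter carrying a name and description, with the instructions in the body. Here's a sketch that scaffolds one programmatically; treat the exact frontmatter schema as an assumption and check the agentskills.io spec before relying on it.

```python
# Sketch: scaffold a minimal skill folder using the SKILL.md convention
# (YAML frontmatter with name/description, instructions in the body).
# The exact schema is an assumption; verify against the agentskills.io spec.
import textwrap
from pathlib import Path

def scaffold_skill(root: Path, name: str, description: str, instructions: str) -> Path:
    skill_dir = root / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_md = textwrap.dedent(f"""\
        ---
        name: {name}
        description: {description}
        ---

        {instructions}
        """)
    path = skill_dir / "SKILL.md"
    path.write_text(skill_md, encoding="utf-8")
    return path

print(scaffold_skill(
    Path("skills"),
    name="changelog-writer",  # hypothetical example skill
    description="Drafts a changelog entry from a git diff.",
    instructions="Read the staged diff, then write a one-paragraph changelog entry.",
))
```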

Alibaba's Qwen 3.6-Plus goes all in on agentic coding
Alibaba just dropped Qwen 3.6-Plus, and it's laser-focused on agentic coding. Think autonomous task breakdown, iterative debugging, and even generating front-end pages from screenshots. It's a native multimodal model with a 1M-token context window, and the benchmarks put it right alongside Claude Opus 4.5 on SWE-bench and Terminal-Bench 2.0. It's now generally available via API, and you can plug it into coding agents like OpenClaw, Claude Code, Qwen Code, and more.
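If you want to wire it into your own tooling, DashScope exposes an OpenAI-compatible mode, so a standard client works. A minimal sketch, with the caveat that the exact model id below is an assumption; check the model list in your console.

```python
# Sketch: calling Qwen 3.6-Plus through DashScope's OpenAI-compatible mode.
# The base_url follows DashScope's compatible-mode convention; the model id
# is an assumption here, so verify it against the published model list.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

resp = client.chat.completions.create(
    model="qwen3.6-plus",  # assumed id; confirm before use
    messages=[{"role": "user", "content": "Break this task down: add dark mode to a React app."}],
)
print(resp.choices[0].message.content)
```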

Gemini 3 Flash in Gemini CLI: Pro brains, Flash speed
Google brought Gemini 3 Flash into Gemini CLI and the numbers are hard to ignore: 78% on SWE-bench Verified, which actually outperforms Gemini 3 Pro on agentic coding. The CLI now auto-routes between Flash and Pro, so you get near-instant responses on most tasks and only burn Pro-tier compute when the reasoning genuinely demands it.

Tools of the Trade

  1. Pika Skill: A real-time video chat skill, powered by Pika Labs’ new PikaStream 1.0 model, that gives any agent (OpenClaw, Claude Code, etc.) a face and voice on a live call. The agent doesn't just talk; it can actually perform actions mid-conversation while preserving its memory and personality.

  2. AutoClaw by Z.ai: Run OpenClaw locally with no API key required, just download and start. It ships with GLM-5-Turbo optimized for tool calling and multi-step tasks, but you can bring any model you want, and your data stays entirely on your machine.

  3. Lemonade: Bundles multiple inference engines (llama.cpp, whisper.cpp, stable-diffusion.cpp, Kokoro) behind a single OpenAI-compatible local server for text, image, speech, and transcription workloads. It auto-configures for your hardware, runs on Windows/Linux/macOS, and has a growing app marketplace with integrations for tools like n8n, Continue, and OpenHands.

  4. Awesome LLM Apps: A curated collection of LLM apps built with RAG, AI agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, and Google, as well as open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
    (Now accepting GitHub sponsorships)

Hot Takes

  1. Prediction: This is gonna kill some OSS projects.

    "On the kernel security list we've seen a huge bump of reports. We were between 2 and 3 per week maybe two years ago, then reached probably 10 a week over the last year with the only difference being only AI slop, and now since the beginning of the year we're around 5-10 per day depending on the days (fridays and tuesdays seem the worst). Now most of these reports are correct, to the point that we had to bring in more maintainers to help us."
    ~ Peter Steinberger

  2. Based on the leaked Claude Code source code, your CLAUDE.md file is re-injected on every single turn of the conversation.
    ~ Andriy Burkov

That’s all for today! See you tomorrow with more such AI-filled content.

Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!

Unwind AI - X | LinkedIn | Threads

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
