Claude 4-Level Opensource Model from China

PLUS: Run Claude Code, Gemini CLI, Codex in parallel, Ollama for smartphones

Shubham Saboo & Gargi Gupta
July 14, 2025

Today’s top AI Highlights:

After DeepSeek, there's a new Claude 4-level model from China
Run Claude Code, Gemini CLI, Codex, and Amp in parallel
Google acqui-hired Windsurf - what exactly happened?
Connect any MCP server to any LLM in <30 seconds
Ollama for smartphones

& so much more!

Read time: 3 mins

AI Tutorial

Business consulting has always required deep market knowledge, strategic thinking, and the ability to synthesize complex information into actionable recommendations. Today's fast-paced business environment demands even more - real-time insights, data-driven strategies, and rapid response to market changes.

In this tutorial, we'll create a powerful AI business consultant using Google's Agent Development Kit (ADK) combined with Perplexity AI for real-time web research. This consultant will conduct market analysis, assess risks, and generate strategic recommendations backed by current data, all through a clean, interactive web interface.

We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Build an AI Consultant Agent with Gemini 2.5 Flash

Fully functional agentic app in under 100 lines of code (100% opensource)

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads) to support us!

Latest Developments

Silicon Valley Got DeepSeek’d Again 🤯🧑‍💻💪

Last week felt like we jumped 6 months ahead. We had Grok 4 with surreal benchmark numbers, and now there’s a new Claude 4-level open model from China that outperforms DeepSeek v3, Qwen, and OpenAI GPT-4.1.

And this coding beast costs ~5x less than Claude 4 Sonnet.

Moonshot AI just opensourced Kimi K2, a 1 trillion parameter MoE model with 32B active parameters that's built specifically for agentic tasks and costs a mere $0.60 per million input tokens. The model achieves state-of-the-art performance among open models on SWE-bench Verified, Tau2, and AceBench while offering the same advanced tool-calling and MCP capabilities you'd expect from frontier models.

You can literally run Claude 4 Sonnet-level intelligence on your own hardware for a fraction of the cost.

Key Highlights:

Insane pricing advantage - At $0.60 input / $2.50 output per million tokens, it's 5x cheaper on input (6x on output) than Claude 4 Sonnet ($3/$15) and roughly 3.3x cheaper than GPT-4.1 ($2/$8), making frontier-level AI accessible to everyone.
Built for agents from the ground up - Unlike general-purpose models retrofitted for tool use, Kimi K2 was specifically optimized for agentic workflows with seamless tool calling, planning, and execution across complex multi-step tasks.
Opensource Models - Both base and instruct versions are fully open-sourced on HuggingFace, letting you fine-tune, self-host, or deploy however you want without vendor lock-in.
Coding dominance - Achieves SOTA performance among open models on SWE-bench Verified and consistently outperforms much larger proprietary models on real-world coding benchmarks and agentic evaluations.
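
To make the pricing comparison concrete, here is a minimal Python sketch that works out the ratios from the per-million-token prices quoted above. The function names and the example job size are our own illustration, not part of any vendor SDK, and real bills depend on your actual input/output mix.

```python
# USD per million tokens as (input, output), taken from the figures quoted above.
PRICES = {
    "Kimi K2": (0.60, 2.50),
    "Claude 4 Sonnet": (3.00, 15.00),
    "GPT-4.1": (2.00, 8.00),
}


def job_cost(model, input_tokens, output_tokens):
    """Return the cost in USD of one job with the given token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def price_ratio(model, baseline="Kimi K2"):
    """Return (input ratio, output ratio) of a model's prices to the baseline's."""
    in_price, out_price = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return in_price / base_in, out_price / base_out


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500k output tokens.
    for name in PRICES:
        in_ratio, out_ratio = price_ratio(name)
        cost = job_cost(name, 2_000_000, 500_000)
        print(f"{name}: ${cost:.2f} ({in_ratio:.1f}x input, {out_ratio:.1f}x output vs Kimi K2)")
```

The headline "5x" and "3.3x" figures are input-price ratios. On output tokens the gap is 6x for Claude 4 Sonnet and 3.2x for GPT-4.1, so the blended saving depends on how output-heavy your workload is.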

Rumours on the street are that OpenAI has delayed the release of its open-weight models after this release, in the name of “safety tests and reviews.”

Vibe Code with Multiple AI Coding Agents in Parallel 🤖🖥️🤖🖥️

The current generation of AI coding agents can write entire features in minutes, but we're still using them like it's 2023 - one task, one agent, one terminal window at a time.

That's like having a team of expert developers but only letting one person work while the others sit idle.

Vibe Kanban solves this babysitting-one-agent-at-a-time problem. You can queue up multiple coding tasks across Claude Code, Gemini CLI, OpenAI Codex, and Amp, and track what they’re all doing through a clean kanban interface. Review completed work, plan future features, and track progress while your AI team handles the implementation details in the background.

Key Highlights:

Multi-agent orchestration - Run coding agents in parallel or sequence across different tasks, maximizing throughput while agents work in the background without your constant supervision.
CLI agents support - Switch seamlessly between Claude Code, Gemini CLI, OpenAI Codex, and Amp without changing your workflow or reconfiguring your setup.
Task status tracking - Visual kanban board shows exactly which agents are working on what, with real-time status updates so you know when to review completed work.
Centralized configuration - Manage all your coding agent settings and MCP configurations from one place, eliminating the need to configure each tool separately.
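
If you want a feel for the pattern, the sketch below shows the general idea of fanning tasks out to several agents and tracking their status on a board. It is not Vibe Kanban's implementation: `Task`, `run_agent`, and `run_board` are hypothetical names, and `run_agent` is a stand-in that returns a string where a real setup would invoke an agent's CLI.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass


@dataclass
class Task:
    title: str
    agent: str            # label only, e.g. "claude-code" or "gemini-cli"
    status: str = "todo"  # todo -> in_progress -> in_review (or failed)
    result: str = ""


def run_agent(task):
    """Hypothetical stand-in for calling a coding agent; a real version would run its CLI."""
    return f"[{task.agent}] finished: {task.title}"


def run_board(tasks, max_parallel=4):
    """Run all tasks in parallel and move each card to 'in_review' or 'failed'."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {}
        for task in tasks:
            task.status = "in_progress"
            futures[pool.submit(run_agent, task)] = task
        # Update each card as soon as its agent finishes, in completion order.
        for future in as_completed(futures):
            task = futures[future]
            try:
                task.result = future.result()
                task.status = "in_review"
            except Exception as exc:
                task.result = str(exc)
                task.status = "failed"
    return tasks


if __name__ == "__main__":
    board = run_board([
        Task("Add login form", "claude-code"),
        Task("Write API tests", "gemini-cli"),
        Task("Refactor settings page", "amp"),
    ])
    for card in board:
        print(f"{card.status:>10}  {card.title}  ({card.agent})")
```

Threads are enough here because the work would be waiting on external processes. The cap on `max_parallel` is a design choice to keep your machine and your API rate limits from being swamped.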

Quick Bites

What happens when your biggest investor becomes your biggest problem?

OpenAI's $3 billion deal to acquire coding darling Windsurf spectacularly collapsed after Microsoft, ironically OpenAI's largest backer, effectively torpedoed the acquisition by demanding access to Windsurf's intellectual property under their existing partnership agreement.
Windsurf CEO Varun Mohan reportedly made it crystal clear he didn't want Microsoft anywhere near his startup's tech, given GitHub Copilot's position as a direct competitor.
Google swooped in with precision, paying $2.4 billion to hire Mohan, co-founder Douglas Chen, and key researchers for DeepMind while securing a non-exclusive license to Windsurf's technology, without acquiring the company outright.
And now, the remaining 200+ Windsurf employees who aren't Google-bound have an uncertain future (does anyone remember Character AI and Inflection AI anymore?). This reverse-acquihire strategy brilliantly sidesteps antitrust scrutiny while delivering maximum competitive damage - a masterclass in how to exploit your rival's partnership constraints.

Google just shipped photo-to-video capabilities in Veo 3, now available to Pro and Ultra subscribers in the Gemini app. You can now animate your still images into 8-second clips with synchronized audio.

Alibaba Qwen released its own desktop app to chat with Qwen models, with full MCP support. The app includes web search via Firecrawl to give up-to-date context to the models. You can also extend the models’ capabilities by adding MCP servers through a simple JSON config.
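
For readers who haven't seen an MCP server config, here is a small Python sketch that builds one and prints it as JSON. It uses the `mcpServers` layout that several MCP clients accept. Whether Qwen's desktop app expects exactly this shape is an assumption on our part, `build_mcp_config` is a hypothetical helper, and the directory path is a placeholder.

```python
import json


def build_mcp_config(servers):
    """Build a config dict from {name: (command, args, env)} entries."""
    return {
        "mcpServers": {
            name: {"command": command, "args": list(args), "env": dict(env)}
            for name, (command, args, env) in servers.items()
        }
    }


# One example server: the reference filesystem server, launched through npx.
# The directory argument is a placeholder you would replace with a real path.
config = build_mcp_config({
    "filesystem": (
        "npx",
        ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"],
        {},
    ),
})

if __name__ == "__main__":
    print(json.dumps(config, indent=2))
```

Each entry tells the client which command to launch and with which arguments and environment variables. Check the app's own documentation for the exact keys it supports before pasting anything in.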

AI coding agents might not be the productivity boosters we thought they were. A rigorous randomized controlled trial by METR found that experienced opensource developers actually took 19% longer to complete tasks when using frontier AI tools like Cursor Pro with Claude 3.5/3.7 Sonnet, contradicting both developer expectations and expert forecasts. The twist: developers still believed AI had sped them up by 20% even after experiencing the slowdown firsthand.

Study involved 16 experienced developers working on 246 real issues from repositories averaging 22k+ stars
Five contributing factors identified: increased debugging time, over-reliance on AI suggestions, context switching overhead, validation burden, and reduced code comprehension
Results challenge the validity of current coding benchmarks like SWE-Bench that show impressive AI performance
METR plans to repeat this methodology to track AI R&D acceleration trends over time

Tools of the Trade

Cactus: Opensource mobile inference framework that runs AI models locally on smartphones. It supports any GGUF model from Hugging Face with quantization from FP32 to 2-bit, offers MCP tool-calls for device integration, and includes cloud fallback for complex tasks.
TXT OS: A reasoning framework as a plain text file that enhances any LLM with semantic memory and hallucination detection capabilities. You download the .txt file, paste it into any LLM chat interface, and it provides structured reasoning with memory persistence across conversations.
Director: Opensource local-first MCP gateway that connects Claude, Cursor, or VSCode to any MCP server within seconds. It’s a proxy layer between MCP clients and servers, eliminating JSON configuration requirements and setup friction.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, MCP, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
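
Why does the FP32-to-2-bit quantization range mentioned for Cactus matter on a phone? A rough back-of-the-envelope sketch in Python: weight memory is about parameters times bits per weight, divided by 8. The 3B-parameter model below is a hypothetical example, and real GGUF files run somewhat larger because quantized formats also store scaling metadata.

```python
def weight_footprint_gb(num_params, bits_per_weight):
    """Approximate weight memory in GB (1 GB = 1e9 bytes), ignoring format overhead."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    params = 3_000_000_000  # hypothetical 3B-parameter model
    for bits in (32, 16, 8, 4, 2):
        print(f"{bits:>2}-bit: ~{weight_footprint_gb(params, bits):.2f} GB")
```

This counts weights only. Activations and the KV cache add to the total at runtime, so treat these numbers as a floor.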

Hot Takes

OpenAI Sam Altman is pretty fucked right about now
> lost half of their top talent over the past two weeks
> basically all of their leadership over the past two years
> sama told microsoft can’t get access to windsurf’s IP
> microsoft paused negotiations with Sam Altman
> openai doesn’t have $$$ to fund the $3 billion acquisition anymore
> windsurf employees thought they were going to get a bag joining openai
> windsurf CEO backed out of the deal
> windsurfs core talent got poached by google instead
> Grok 4 better than expected
> gpt-5 delayed
> only way oai can stay relevant is to release open-source model now
> but already mogged pre-release by chinese models
> openai valuation in private equity been dropping past few weeks
> softbank masa $40 billion investment paused
> oai for-profit conversion deadline: 5 months left
OpenAI is a ticking bomb… ~ NIK
Grok 4 might just be the best advertisement for AI safety ever. ~ Pietro Schirano

That’s all for today! See you tomorrow with more such AI-filled content.

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉