
Google's Open-source SDK for Building Production AI Apps

Today's top AI Highlights:

  • Google's Open-source SDK for Building Production AI Apps

  • OpenAI and Google claim gold at the ICPC World Finals

  • & so much more!

Read time: 3 mins

AI Tutorial

Learn OpenAI Agents SDK from zero to production-ready!

We have created a comprehensive crash course that takes you through 11 hands-on tutorials covering everything from basic agent creation to advanced multi-agent workflows using OpenAI Agents SDK.

What you'll learn and build:

  • Starter agents with structured outputs using Pydantic

  • Tool-integrated agents with custom functions and built-in capabilities

  • Multi-agent systems with handoffs and delegation

  • Production-ready agents with tracing, guardrails, and sessions

  • Voice agents with real-time conversation capabilities

Each tutorial includes working code, interactive web interfaces, and real-world examples.

The course covers the complete agent development lifecycle: orchestration, tool integration, memory management, and deployment strategies.

Everything is 100% open-source.
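To give a taste of the starter material, here's a minimal sketch of a structured-output agent in the style of the first tutorial, based on the OpenAI Agents SDK quickstart API; the schema, prompt, and agent name are our own illustrative choices:

```python
from pydantic import BaseModel
from agents import Agent, Runner  # pip install openai-agents


# Illustrative Pydantic schema the agent's final answer must conform to.
class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


# output_type tells the SDK to validate the final answer against the schema.
agent = Agent(
    name="Event extractor",
    instructions="Extract the calendar event from the user's message.",
    output_type=CalendarEvent,
)

# Runner.run_sync drives the agent loop to completion and returns the result.
result = Runner.run_sync(agent, "Alice and Bob grab lunch on Friday.")
print(result.final_output)  # CalendarEvent(name=..., date=..., participants=[...])
```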

We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills, subscribe now and be the first to access our latest tutorials.

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads) to support us!

Latest Developments

You might have tried Google’s Firebase Studio, their cloud-based IDE to build, test, and deploy applications. But did you know that the framework that powers Firebase Studio is completely open-source?

GenKit gives you that same production-ready framework and unified APIs, with complete freedom over your stack and deployment choices. It was designed specifically for building serious AI applications that need to scale.

GenKit supports JavaScript, Go, and Python with consistent APIs across all three languages, and lets you switch between OpenAI, Anthropic, Gemini, Ollama, and other model providers by changing a single string parameter. The framework handles complex features and workflows like RAG pipelines, agentic tool calling, multi-turn conversations, multi-agent teams, multimodal inputs, structured outputs, real-time streaming, and more.
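To make the single-string swap concrete, here's a minimal sketch using the Python SDK, based on our reading of the alpha quickstart; the plugin import path, class name, and model strings are assumptions that may differ in current releases:

```python
from genkit.ai import Genkit
from genkit.plugins.google_genai import GoogleAI  # assumed import path (alpha docs)

# Register providers once; the default model is picked by a single string.
ai = Genkit(
    plugins=[GoogleAI()],
    model='googleai/gemini-2.0-flash',
)

async def summarize(text: str) -> str:
    # With another provider plugin registered (e.g. Ollama), switching models
    # means changing only the string, e.g. model='ollama/llama3.2'.
    response = await ai.generate(prompt=f'Summarize in one sentence: {text}')
    return response.text
```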

Key Highlights:

  1. Decoupled & Model-Agnostic - GenKit’s plugin architecture abstracts away model-specific implementations, allowing you to swap between Gemini, Claude 3, or open-source models via Ollama without refactoring core logic. This design also extends to vector stores.

  2. Complete RAG Stack - Built-in integrations with vector databases (Pinecone, Chroma, LanceDB, pgvector), embedding models, and retrieval pipelines, plus custom retriever support for any data source you need to connect.

  3. MCP & Tool Ecosystem - Native Model Context Protocol support lets you connect AI directly to databases, filesystems, and APIs, while the extensive plugin ecosystem covers everything from vector stores to evaluation metrics.

  4. Testing - Developer UI for testing and debugging, comprehensive evaluation frameworks with automated metrics, and detailed observability with execution traces.

  5. Deploy Anywhere - While it offers tight integration with Google Cloud Run and Firebase, the framework is platform-agnostic. You can deploy your Node.js, Go, or Python applications to any environment you choose.

General-purpose coding agents like Claude Code are great at writing code that uses popular libraries. But they crash and burn when you give them a custom library, internal APIs, or that niche framework your team swears by.

LangChain's team decided to fix this problem by running detailed experiments on Claude Code configurations to see what actually works for domain-specific coding tasks.

They tested everything from vanilla Claude to sophisticated MCP servers with full documentation access, measuring not just whether the code compiles but whether it follows best practices and avoids common pitfalls. The best results didn't come from the most complex setup, but from a carefully crafted approach that balances foundational knowledge with selective deep-diving capabilities.

Key Takeaways:

  1. Context overload kills performance - Dumping large documentation files crowds the context window and leads to poor results, even though it seems logical to give agents more information to work with.

  2. Strategic tool usage requires guidance - Agents rarely invoke documentation tools effectively on their own, typically stopping at surface-level descriptions instead of following through to get the details they actually need.

  3. Instructions have massive ROI - A well-written Claude.md file that highlights core concepts, unique functionality, and common primitives in your library delivers the highest payoff per token, and costs significantly less to run than complex MCP server setups.

  4. Claude + Claude.md + MCP wins - While Claude.md provides the most mileage per token, the strongest results came from pairing it with an MCP server, driven by a Claude.md that ends each section with reference URLs the agent can follow for further information.

This blog is a masterclass in context engineering, especially for anyone building production agents that need to excel at framework-specific tasks rather than just generic coding.
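As a rough illustration of those takeaways, here's a hypothetical Claude.md skeleton for an invented internal library ("acme-agents"); every name and URL below is made up for illustration, and the reference link at the end of each section mirrors the pattern from takeaway 4:

```markdown
# acme-agents (internal library)

## Core concepts
- Every workflow is a `Pipeline` of `Step` objects; steps are pure functions.
- State moves through an explicit `Context`; never use module-level globals.
Further reading: https://docs.example.com/acme-agents/concepts

## Unique functionality
- `Pipeline.retryable()` wraps a step in our standard backoff policy;
  do NOT hand-roll retry loops.
Further reading: https://docs.example.com/acme-agents/retries

## Common primitives
- `load_config()` reads acme.yaml and must be called before building pipelines.
- `Step.from_fn(fn)` is the only supported way to define a custom step.
Further reading: https://docs.example.com/acme-agents/api
```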

Fact-based news without bias awaits. Make 1440 your choice today.

Overwhelmed by biased news? Cut through the clutter and get straight facts with your daily 1440 digest. From politics to sports, join millions who start their day informed.

Quick Bites

Hand-picked models provided by Opencode for peak AI coding
Opencode, the open-source CLI coding agent, just launched "zen," a curated gateway that solves the annoying inconsistency problem with AI models. Instead of wondering why Claude seems "dumber" today or getting routed to subpar providers, zen gives you access to hand-tested models that the team deploys and hosts themselves, including GPT-5, Claude Sonnet 4, Qwen3 Coder, Grok Code Fast, Kimi K2, and more. They're offering it at cost, and it's completely optional: you don't need zen to use Opencode.

Google and OpenAI claim gold at the ICPC World Finals
From the 2025 IMO to the ICPC World Finals, AI systems are outperforming humanity's finest minds back-to-back. Google's Gemini 2.5 Deep Think and OpenAI's reasoning models both achieved gold-medal-level performance at ICPC, the world's most prestigious university programming competition. Google's system solved 10/12 problems, while OpenAI claimed a perfect 12/12 score using an ensemble of GPT-5 and an experimental reasoning model.

The ICPC operates differently from other competitions like the IMO - only the top 4 teams out of 139 receive gold medals, making "gold-medal level" a relative rather than an absolute threshold. Both AI systems competed under official ICPC rules with 5-hour time limits, though Google participated in real time while OpenAI evaluated its models on the same problems.

Gemini solved one problem that stumped every human team: optimizing liquid flow through interconnected ducts to fill reservoirs as quickly as possible. It cracked the problem within 30 minutes, while no human team found a solution.

Anthropic never reduced Claude’s model quality due to demand
Think Claude's been getting dumber lately? It wasn't your imagination, and it definitely wasn't Anthropic quietly downgrading the model during busy periods. They released a detailed postmortem on why Claude's responses degraded starting in August 2025. It turns out there were three overlapping infrastructure bugs: a context window routing error, output corruption, and a nasty XLA compiler bug that made token selection go haywire. The team shared extensive detail on the bugs and the fixes underway. While we really appreciate the transparency here, would it be fair to ask for compensation for weeks of degraded performance and lost productivity? 🤔

Tools of the Trade

  1. Metorial MCP Containers - Provides Docker-packaged versions of hundreds of MCP servers. Just pull the Docker image and run the server in an isolated container. New images are built automatically whenever the corresponding server repositories change.

  2. RunRL - Reinforcement learning-as-a-service for training models on arbitrary prompt and reward files. You define what constitutes good and bad outputs, and the platform optimizes the model accordingly using a variety of RL algorithms.

  3. pg-mcp - MCP server for AI agents to query PostgreSQL databases. It runs locally, so your data never leaves your system; only the database schema is shared with the LLM to generate accurate queries. It's multi-tenant and runs over HTTP/SSE (not stdio).

  4. Awesome LLM Apps - A curated collection of LLM apps with RAG, AI Agents, multi-agent teams, MCP, voice agents, and more. The apps use models from OpenAI, Anthropic, Google, and open-source models like DeepSeek, Qwen, and Llama that you can run locally on your computer.
    (Now accepting GitHub sponsorships)

Hot Takes

  1. your data and footprint is becoming more and more valuable to you over time. record everything, have a model parse it all in ten years ~
    roon

  2. mcp servers are awesome they

    - expose tools that never get called

    - bloat context and then you ask me why opencode is using so many tokens

    - crash constantly and break everything when they do ~
    dax

That’s all for today! See you tomorrow with more such AI-filled content.

Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!

Unwind AI - X | LinkedIn | Threads

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
