
Drag & Drop UI for Multi-Agent Apps

PLUS: Long-term memory for AI agents, Deep Research for GitHub repos

Shubham Saboo & Gargi Gupta
April 30, 2025


Today’s top AI Highlights:

Open-source drag-and-drop UI for multi-agent workflows
Build production-ready AI agents with scalable long-term memory
ChatGPT Search and Perplexity on WhatsApp
Baidu’s super agent app with MCP support on Android
Deep Research for GitHub repos powered by AI agent Devin

& so much more!

Read time: 3 mins

AI Tutorial

Charts, diagrams, and visual data in PDFs remain a massive blind spot for most RAG systems. While text-based RAG has become relatively straightforward to implement, extracting meaningful insights from visual elements requires specialized approaches that many developers struggle to implement efficiently. The standard workaround of OCR followed by text embedding loses crucial context and fails completely with complex visual elements.

In this tutorial, we'll build a cutting-edge Vision RAG system that uses Cohere's Embed-4 model to create unified vector representations that capture both visual and textual elements. Then, we'll use Google's Gemini 2.5 Flash to analyze these retrievals and generate comprehensive answers by fully understanding the visual context.
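
To make the retrieval step concrete, here is a minimal sketch of nearest-neighbor search over a shared embedding space. The three-dimensional vectors are made-up toy values standing in for real embeddings, and `cosine` and `retrieve` are hypothetical helper names for illustration, not part of Cohere's or Google's SDKs.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, index, top_k=1):
    """Return the names of the top_k items most similar to the query."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Toy index: page images and text chunks live in the same vector space.
# These numbers are invented for illustration only.
index = {
    "page_3_revenue_chart.png": [0.9, 0.1, 0.0],
    "page_7_org_diagram.png": [0.1, 0.8, 0.3],
    "page_9_text_summary": [0.2, 0.2, 0.9],
}

# Pretend this is the embedding of a question about revenue.
query_vec = [0.85, 0.15, 0.05]
best = retrieve(query_vec, index)
print(best)
```

In a real pipeline, the retrieved page image would then be passed to the vision model along with the question.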

We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Build a Vision RAG App with Gemini 2.5 Flash

Fully functional multimodal RAG app with vision, with step-by-step instructions (100% open-source)

Don’t forget to share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to support us!

Latest Developments

Agent Workflow Builder Without Extra Abstraction Layers 🤖 🔌⚙️

Sim Studio is an open-source drag-and-drop interface for building multi-agent AI workflows visually. Design complex agent systems with intuitive directed graphs where you connect models, tools, and logic controls as nodes. The visual interface isn't just for planning—it's fully executable, letting you test and refine workflows before deployment. With support for both cloud and local models via Ollama, you can develop offline and deploy to the cloud when ready.

You might be wondering how this differs from other visual agent workflow builders we have already been using, such as n8n, Flowise, and RAGFlow. Sim Studio keeps things closer to what LLM providers actually use, with fewer abstraction layers between you and the models. Instead of generic parameters like "memory," you directly control the system prompts and tool definitions that determine how your agents behave.

Key Highlights:

Provider-aligned interfaces - Work directly with the native parameters each LLM provider uses rather than through abstraction layers. This means you have full control over system prompts, tool definitions, and temperature settings exactly as they'll be executed in production.
Model switching - Test your workflows with different models from OpenAI, Anthropic, Llama, and others without rewriting your agent logic. You can also use different models within the same workflow, allowing you to optimize each step for cost or performance by selecting the right model for specific tasks.
Testing and observability - Run simulations of your workflows multiple times to see how changes affect performance. The platform includes detailed execution logs, latency tracing, and error analysis to help you quickly identify and fix issues.
Deployment options - Deploy workflows as APIs with HTTP endpoints, schedule them to run periodically, or set them up to respond to incoming webhooks. You can also create standalone chat interfaces with password or domain protection for controlled access.
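
The idea of a workflow as an executable directed graph can be sketched in a few lines. This is a toy illustration, not Sim Studio's API: `run_workflow` and the three node functions are hypothetical stand-ins, with plain Python functions playing the role of models and tools.

```python
from graphlib import TopologicalSorter

def run_workflow(nodes, edges, initial_input):
    """Run each node after all of its upstream nodes have finished.

    nodes: name -> callable taking a list of upstream outputs
    edges: list of (source, destination) pairs
    """
    preds = {name: [] for name in nodes}
    for src, dst in edges:
        preds[dst].append(src)

    outputs = {}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(preds).static_order():
        # Nodes with no upstream node receive the workflow's initial input.
        upstream = [outputs[p] for p in preds[name]] or [initial_input]
        outputs[name] = nodes[name](upstream)
    return outputs

# Hypothetical stand-ins for a model node, a tool node, and a second model node.
def researcher(inputs):
    return f"notes on {inputs[0]}"

def word_count_tool(inputs):
    return len(inputs[0].split())

def writer(inputs):
    notes, count = inputs
    return f"summary of '{notes}' ({count} words of notes)"

nodes = {"researcher": researcher,
         "word_count_tool": word_count_tool,
         "writer": writer}
edges = [("researcher", "word_count_tool"),
         ("researcher", "writer"),
         ("word_count_tool", "writer")]

result = run_workflow(nodes, edges, "agent memory")
print(result["writer"])
```

Because each node is just a callable, swapping one model for another in a single step would mean replacing one entry in `nodes` while the graph stays the same.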

Free AI Resource

Unlock powerful workflows from two CEOs who are reinventing the future of work with AI.

In this free resource, you'll discover how forward-thinking leaders are using AI to streamline meetings, cut busywork, and automate decisions.

Get actionable insights that you can implement in your own remote teams today.


Give AI Agents Scalable Long-Term Memory 📝 🧠💡

AI agents today often forget important details once a conversation gets too long. Mem0, an open-source memory layer for AI agents, tackles this with a scalable long-term memory system that picks out and stores only the key facts instead of bloating the context window.

It’s not just more accurate: Mem0 scores 26% higher than OpenAI’s memory feature on the LOCOMO benchmark while cutting token usage by more than 90%. It also reduces latency by 91% compared with full-context methods, making it much more suitable for production-level AI agent deployments.

Key Highlights:

How It Works - Mem0 uses a two-phase pipeline that extracts key facts from conversations and intelligently updates the memory store. The system decides whether to add new memories, update existing ones, delete contradictions, or do nothing - keeping the memory coherent and non-redundant. A graph-enhanced version (Mem0ᵍ) captures even more complex relationships between conversation elements.
Performance Boost - Mem0 achieved 26% higher accuracy than OpenAI Memory on the LOCOMO benchmark, demonstrating superior reasoning about past conversations. The system consistently outperformed six leading memory approaches across all question types, from multi-hop to temporal reasoning.
Speed Optimization - By retrieving only the most relevant facts instead of entire conversation histories, Mem0 cuts latency by 91% compared to full-context methods. This means near-instant responses (0.71s median) even when reasoning about extensive conversation history.
Cost Efficiency - Mem0 uses approximately 1,800 tokens per conversation compared to 26,000 for full-context methods, a reduction of more than 90%. This makes long-term memory practical and affordable at scale, without compromising on quality.
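
The add / update / delete / do-nothing decision described under "How It Works" can be illustrated with a toy memory store. This is only a sketch under simplifying assumptions, not Mem0's implementation: facts are assumed to arrive already extracted as (key, value) pairs, a value of `None` is an invented convention for a retracted fact, and a plain key comparison stands in for whatever decision logic Mem0 actually uses. The names `Op`, `decide`, and `apply_fact` are hypothetical.

```python
from enum import Enum

class Op(Enum):
    ADD = "add"
    UPDATE = "update"
    DELETE = "delete"
    NOOP = "noop"

def decide(memory, key, value):
    """Choose an operation for one extracted fact (toy rules)."""
    if value is None:
        # Retracted fact: delete it if we had stored it, else nothing to do.
        return Op.DELETE if key in memory else Op.NOOP
    if key not in memory:
        return Op.ADD
    if memory[key] == value:
        return Op.NOOP  # already known, avoid redundancy
    return Op.UPDATE    # newer value replaces the stored one

def apply_fact(memory, key, value):
    """Apply the chosen operation to the store and report which one ran."""
    op = decide(memory, key, value)
    if op is Op.DELETE:
        del memory[key]
    elif op in (Op.ADD, Op.UPDATE):
        memory[key] = value
    return op

# Facts as they might be extracted from successive conversation turns.
facts = [
    ("diet", "vegetarian"),
    ("diet", "vegetarian"),
    ("diet", "vegan"),
    ("city", "Berlin"),
    ("city", None),
]

memory = {}
ops = [apply_fact(memory, key, value) for key, value in facts]
print([op.value for op in ops])
print(memory)
```

The store ends up holding only the latest non-retracted facts, which is the property that keeps the retrieved context small.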

Quick Bites

ChatGPT Search (+1-800-242-8478) and Perplexity (+1-833-436-3285) are now available on WhatsApp. You can send questions as text or images, about past or current events such as live sports scores, and get accurate, up-to-date answers with sources. Perplexity also supports image generation directly on WhatsApp.

HyperBrowser has released HyperAgent, an open-source AI layer over Playwright that lets you automate the browser with natural language. The framework introduces two key functions: page.ai for running automation through simple commands like "Find a route from Miami to New Orleans," and page.extract for pulling structured data from web pages without custom selectors. Featuring built-in stealth browsing to prevent detection, HyperAgent also supports integration with external tools as an MCP client and can scale to hundreds of concurrent sessions through HyperBrowser.

Meta just launched the Llama API in preview at its first-ever LlamaCon. It offers easy API access, interactive playgrounds, and SDKs for Python and TypeScript to build with Llama models like Llama 4 Scout and Maverick. Developers can fine-tune models, run evaluations, and even use fast inference through Cerebras and Groq—all while keeping control of their weights and data.

Baidu has released "Xinxiang," a general-purpose super agent that can handle complex multi-step tasks across work, study, and daily life scenarios from a single prompt. It is based on a multi-agent system that can plan, collaborate, and execute tasks autonomously. Currently supporting over 200 task types ranging from routine work to legal consultations and homework, Xinxiang aims to expand to 100,000+ task types. It is already available on Android, with an iOS app coming soon.

Tools of the Trade

DeepWiki: Cognition’s AI SWE agent Devin can now do Deep Research on any GitHub repo. Just go to the website, or replace "github" with "deepwiki" in the repo URL, and chat with Devin to get in-depth answers. 30,000 repos are already indexed. 100% free for open-source repos.
Flowcode: A visual programming tool that lets you build full backend logic with blocks instead of code, while still keeping features like loops, conditions, and concurrency. It runs inside VSCode, supports TypeScript, and works with your existing codebase.
Pocket Flow Tutorial: Turns a GitHub repo into an easy-to-follow tutorial. It crawls the repo and builds a knowledge base from the code, analyzes the entire codebase to identify core abstractions and how they interact, and transforms complex code into a beginner-friendly tutorial with clear visualizations.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, MCP, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
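
For the DeepWiki URL trick above, here is a small sketch of the swap. It assumes the swap means changing the host from github.com to deepwiki.com and leaving the rest of the URL untouched; `to_deepwiki` is a hypothetical helper name, and `owner/repo` is a placeholder path.

```python
from urllib.parse import urlsplit, urlunsplit

def to_deepwiki(repo_url):
    """Rewrite a GitHub repo URL to the matching DeepWiki URL.

    Assumption: only the host changes (github.com -> deepwiki.com).
    """
    parts = urlsplit(repo_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {repo_url}")
    # SplitResult is a named tuple, so _replace returns a modified copy.
    return urlunsplit(parts._replace(netloc="deepwiki.com"))

print(to_deepwiki("https://github.com/owner/repo"))
```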

Hot Takes

Literally the only thing that matters about software quality is observability
If you can't see what's going wrong (or you just don't care), you can't fix it. ~ Matt Pocock

Much of the AI industry is caught in a particularly toxic feedback loop rn.
Blindly chasing better human preference scores is to LLMs what chasing total watch time is to a social media algo. It's a recipe for manipulating users instead of providing genuine value to them.
There's a reason you don't find Claude at #1 on chat slop leaderboards. I hope the rest of the industry realizes this before users pay the price. ~ Alex Albert

That’s all for today! See you tomorrow with more such AI-filled content.


PS: We curate this AI newsletter every day for FREE, and your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉
