Drag & Drop UI for Multi-Agent Apps
PLUS: Long-term memory for AI agents, Deep Research for GitHub repos
Today’s top AI Highlights:
Open-source drag-and-drop UI for multi-agent workflows
Build production-ready AI agents with scalable long-term memory
ChatGPT Search and Perplexity on WhatsApp
Baidu’s super agent app with MCP support on Android
Deep Research for GitHub repos powered by AI agent Devin
& so much more!
Read time: 3 mins
AI Tutorial
Charts, diagrams, and visual data in PDFs remain a massive blind spot for most RAG systems. While text-based RAG has become relatively straightforward to implement, extracting meaningful insights from visual elements requires specialized approaches that many developers struggle to implement efficiently. The standard workaround of OCR followed by text embedding loses crucial context and fails completely with complex visual elements.
In this tutorial, we'll build a cutting-edge Vision RAG system that uses Cohere's Embed-4 model to create unified vector representations that capture both visual and textual elements. Then, we'll use Google's Gemini 2.5 Flash to analyze these retrievals and generate comprehensive answers by fully understanding the visual context.
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
Sim Studio is an open-source drag-and-drop interface for building multi-agent AI workflows visually. Design complex agent systems with intuitive directed graphs where you connect models, tools, and logic controls as nodes. The visual interface isn't just for planning—it's fully executable, letting you test and refine workflows before deployment. With support for both cloud and local models via Ollama, you can develop offline and deploy to the cloud when ready.
You might be wondering how this differs from other visual agent workflow builders like n8n, Flowise, and RAGFlow, which we have been using already. Sim Studio keeps things closer to what LLM providers actually use, with fewer abstraction layers between you and the models. Instead of generic parameters like "memory," you directly control the system prompts and tool definitions that determine how your agents behave.
Key Highlights:
Provider-aligned interfaces - Work directly with the native parameters each LLM provider uses rather than through abstraction layers. This means you have full control over system prompts, tool definitions, and temperature settings exactly as they'll be executed in production.
Model switching - Test your workflows with different models from OpenAI, Anthropic, Llama, and others without rewriting your agent logic. You can also use different models within the same workflow, allowing you to optimize each step for cost or performance by selecting the right model for specific tasks.
Testing and observability - Run simulations of your workflows multiple times to see how changes affect performance. The platform includes detailed execution logs, latency tracing, and error analysis to help you quickly identify and fix issues.
Deployment options - Deploy workflows as APIs with HTTP endpoints, schedule them to run periodically, or set them up to respond to incoming webhooks. You can also create standalone chat interfaces with password or domain protection for controlled access.
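To make the "workflow as a directed graph" idea concrete, here is a minimal sketch in plain Python. It is not Sim Studio's code or API: the `run_workflow` helper, the node names, and the stand-in functions are all hypothetical, and the "model" and "tool" nodes are ordinary functions rather than real LLM calls. It only illustrates how connected nodes can be executed in dependency order.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List

# A node is a plain function that receives the outputs of its upstream nodes.
Node = Callable[[Dict[str, str]], str]


def run_workflow(nodes: Dict[str, Node], edges: Dict[str, List[str]]) -> Dict[str, str]:
    """Run every node once, in dependency order.

    `edges` maps a node name to the names of the nodes it depends on.
    """
    graph = {name: edges.get(name, []) for name in nodes}
    results: Dict[str, str] = {}
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in edges.get(name, [])}
        results[name] = nodes[name](upstream)
    return results


# Hypothetical stand-ins for a model call, a tool call, and a second model call.
def researcher(inputs: Dict[str, str]) -> str:
    return "notes: agents need memory"


def word_count_tool(inputs: Dict[str, str]) -> str:
    return str(len(inputs["researcher"].split()))


def writer(inputs: Dict[str, str]) -> str:
    return f"Summary ({inputs['word_count_tool']} words of notes): {inputs['researcher']}"


nodes = {"researcher": researcher, "word_count_tool": word_count_tool, "writer": writer}
edges = {"word_count_tool": ["researcher"], "writer": ["researcher", "word_count_tool"]}

results = run_workflow(nodes, edges)
print(results["writer"])
```

In a real builder each node would wrap a provider call with its own system prompt, tool definitions, and model choice, which is where per-step model switching would plug in.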
The #1 AI Meeting Assistant
Still taking manual meeting notes in 2025? Let AI handle the tedious work so you can focus on the important stuff.
Fellow is the AI meeting assistant that:
✔️ Auto-joins your Zoom, Google Meet, and Teams calls to take notes for you.
✔️ Tracks action items and decisions so nothing falls through the cracks.
✔️ Answers questions about meetings and searches through your transcripts, like ChatGPT.
Try Fellow today and get unlimited AI meeting notes for 30 days.
AI agents today often forget important details once a conversation gets too long. Mem0, an open-source memory layer for AI agents, tackles this with a scalable long-term memory system that picks and stores only the key facts instead of bloating the context window.
It’s not just faster: Mem0 reports 26% higher accuracy than OpenAI’s memory feature on the LOCOMO benchmark, along with 91% lower latency and roughly 90% fewer tokens than full-context approaches, making it much more suitable for production-level AI agent deployments.
Key Highlights:
How It Works - Mem0 uses a two-phase pipeline that extracts key facts from conversations and intelligently updates the memory store. The system decides whether to add new memories, update existing ones, delete contradictions, or do nothing - keeping the memory coherent and non-redundant. A graph-enhanced version (Mem0ᵍ) captures even more complex relationships between conversation elements.
Performance Boost - Mem0 achieved 26% higher accuracy than OpenAI Memory on the LOCOMO benchmark, demonstrating superior reasoning about past conversations. The system consistently outperformed six leading memory approaches across all question types, from multi-hop to temporal reasoning.
Speed Optimization - By retrieving only the most relevant facts instead of entire conversation histories, Mem0 cuts latency by 91% compared to full-context methods. This means near-instant responses (0.71s median) even when reasoning about extensive conversation history.
Cost Efficiency - Mem0 uses approximately 1,800 tokens per conversation compared to 26,000 for full-context methods, a reduction of more than 90%. This makes long-term memory practical and affordable at scale, without compromising on quality.
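The add / update / delete / do-nothing decision described under "How It Works" can be sketched in a few lines of Python. This is a toy illustration, not Mem0's implementation or API: the extraction phase is represented by a hard-coded list of candidate facts, a simple rule stands in for the decision step, and a value of `None` is an assumed convention meaning "the earlier fact was retracted." The names `decide_operation` and `apply_fact` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def decide_operation(store: Dict[str, str], key: str, value: Optional[str]) -> str:
    """Choose ADD, UPDATE, DELETE, or NOOP for one candidate fact."""
    if value is None:
        # Assumed convention: None means the stored fact no longer holds.
        return "DELETE" if key in store else "NOOP"
    if key not in store:
        return "ADD"
    return "NOOP" if store[key] == value else "UPDATE"


def apply_fact(store: Dict[str, str], key: str, value: Optional[str]) -> str:
    """Apply the chosen operation to the memory store and return its name."""
    op = decide_operation(store, key, value)
    if op in ("ADD", "UPDATE"):
        store[key] = value
    elif op == "DELETE":
        del store[key]
    return op


# Phase 1 (extraction) is faked here: these are the candidate facts
# that an extractor might pull out of a conversation.
candidate_facts: List[Tuple[str, Optional[str]]] = [
    ("city", "Berlin"),
    ("diet", "vegetarian"),
    ("city", "Berlin"),   # already known, so nothing changes
    ("city", "Munich"),   # newer information replaces the old value
    ("diet", None),       # retracted fact is removed
]

# Phase 2 (update): fold each candidate into the store.
memory: Dict[str, str] = {}
operations = [apply_fact(memory, key, value) for key, value in candidate_facts]
print(operations)
print(memory)
```

Because only the surviving facts are stored, the prompt for a later turn carries a handful of short entries rather than the full transcript, which is the intuition behind the token and latency savings listed above.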
Quick Bites
ChatGPT Search (+1-800-242-8478) and Perplexity (+1-833-436-3285) are now available on WhatsApp. You can send questions using text or images, about old or recent events like live sports scores, and get accurate, up-to-date answers with sources. Perplexity also supports image generation directly on WhatsApp.
HyperBrowser has released HyperAgent, an open-source AI layer over Playwright that lets you automate the browser with natural language. The framework introduces two key functions: page.ai for running automation through simple commands like "Find a route from Miami to New Orleans," and page.extract for pulling structured data from web pages without custom selectors. Featuring built-in stealth browsing to prevent detection, HyperAgent also supports integration with external tools as an MCP client and can scale to hundreds of concurrent sessions through HyperBrowser.
Meta just launched the Llama API in preview at its first-ever LlamaCon. It offers easy API access, interactive playgrounds, and SDKs for Python and TypeScript to build with Llama models like Llama 4 Scout and Maverick. Developers can fine-tune models, run evaluations, and even use fast inference through Cerebras and Groq—all while keeping control of their weights and data.
Baidu has released "Xinxiang," a general-purpose super agent that can handle complex multi-step tasks across work, study, and daily life scenarios from a single prompt. It is based on a multi-agent system that can plan, collaborate, and execute tasks autonomously. It currently supports over 200 task types, ranging from routine work to legal consultations and homework, and Baidu aims to expand it to 100,000+ task types. It is already available on Android, with an iOS app coming soon.
Tools of the Trade
DeepWiki: Cognition’s AI SWE agent Devin can now do Deep Research on any GitHub repo. Just go to the website or swap "github" with "deepwiki" in the repo URL and chat with Devin to get in-depth answers. 30,000 repos are already indexed, and it is 100% free for open-source repos.
Flowcode: A visual programming tool that lets you build full backend logic with blocks instead of code, while still keeping features like loops, conditions, and concurrency. It runs inside VSCode, supports TypeScript, and works with your existing codebase.
Pocket Flow Tutorial: Turns a GitHub repo into easy tutorials. It crawls the repo and builds a knowledge base from the code, analyzes the entire codebase to identify core abstractions and how they interact, and transforms complex code into beginner-friendly tutorials with clear visualizations.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, MCP, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
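The DeepWiki URL swap mentioned above is easy to script. The helper below is a small sketch, not an official client: `to_deepwiki` is a hypothetical name, and the `deepwiki.com` host is an assumption inferred from the "swap github with deepwiki" instruction. The `owner/repo` path is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit


def to_deepwiki(repo_url: str) -> str:
    """Rewrite a GitHub repo URL to the matching DeepWiki URL.

    Assumes DeepWiki mirrors GitHub's path layout on the host deepwiki.com.
    """
    parts = urlsplit(repo_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {repo_url}")
    return urlunsplit(parts._replace(netloc="deepwiki.com"))


# Placeholder repository path, for illustration only.
deepwiki_url = to_deepwiki("https://github.com/owner/repo")
print(deepwiki_url)
```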
Hot Takes
Literally the only thing that matters about software quality is observability
If you can't see what's going wrong (or you just don't care), you can't fix it.
~ Matt Pocock
Much of the AI industry is caught in a particularly toxic feedback loop rn.
Blindly chasing better human preference scores is to LLMs what chasing total watch time is to a social media algo. It's a recipe for manipulating users instead of providing genuine value to them.
There's a reason you don't find Claude at #1 on chat slop leaderboards. I hope the rest of the industry realizes this before users pay the price.
~ Alex Albert
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉