Opensource Grok CLI Agent
PLUS: Multiple sub-agents in Claude Code, World's first Large Visual Memory Model
Today’s top AI Highlights:
Opensource AI agent that brings Grok into your Terminal
World’s first Large Visual Memory Model to give AI visual memory
Create multiple sub-agents in Claude Code
Lovable can now think, plan, and act like a senior engineer
Stop using OpenAI for web search; use this instead
& so much more!
Read time: 3 mins
AI Tutorial
Integrating travel services as a developer often means wrestling with a patchwork of inconsistent APIs. Each API - whether for maps, weather, bookings, or calendars - brings its own implementations, auth, and maintenance burdens. The travel industry's fragmented tech landscape creates unnecessary complexity that distracts from building great user experiences.
In this tutorial, we’ll build a multi-agent AI travel planner using MCP servers as universal connectors. By using MCP as a standardized layer, we can focus on creating intelligent agent behaviors rather than API-specific quirks. Our application will orchestrate specialized AI agents that handle different aspects of travel planning while using external services via MCP.
We'll use the Agno framework to create a team of specialized AI agents that collaborate to create comprehensive travel plans, with each agent handling a specific aspect of travel planning - maps, weather, accommodations, and calendar events.
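The team-of-specialists idea can be sketched without any framework. The snippet below is only an illustration of the orchestration pattern, not Agno's API or the tutorial's code: `Specialist`, `plan_trip`, and the stub handlers are hypothetical names, and each handler stands in for what would be a model call backed by an MCP server.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Specialist:
    """One agent responsible for a single aspect of the trip (hypothetical)."""
    name: str
    handle: Callable[[str], str]  # stand-in for an LLM call backed by an MCP tool


def plan_trip(request: str, team: List[Specialist]) -> Dict[str, str]:
    """Ask every specialist for its part of the plan and collect the answers."""
    return {agent.name: agent.handle(request) for agent in team}


# Stub handlers; a real agent would call a model and an MCP server here.
team = [
    Specialist("maps", lambda r: f"route for: {r}"),
    Specialist("weather", lambda r: f"forecast for: {r}"),
    Specialist("accommodations", lambda r: f"stays for: {r}"),
    Specialist("calendar", lambda r: f"events for: {r}"),
]

plan = plan_trip("3 days in Lisbon", team)
for section, text in plan.items():
    print(f"{section}: {text}")
```

Keeping each specialist behind the same `handle` signature is what lets the coordinator stay ignorant of which external service sits behind it.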
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
The terminal is getting crowded with AI agents. OpenAI dropped Codex CLI, Anthropic shipped Claude Code, Google launched Gemini CLI, and Qwen released their own command-line coding tool.
And now Grok CLI joins the party.
This isn't an official xAI product but rather a community-built weekend project that brings Grok's conversational AI directly to your terminal. Built with zero dependencies on heavy LLM frameworks, it offers a hackable, opensource alternative that promises unfiltered access to Grok's capabilities through a beautiful terminal interface powered by Ink.
Key Highlights:
Framework-Free Architecture - Skips the heavy dependencies that slow down other CLI tools, delivering faster performance and reduced complexity for developers who want lightweight solutions.
Agentic File Ops and Tool Selection - Based on your prompt, the agent decides when to view, create, or edit files and which tools to use, handling complex multi-file operations seamlessly.
MCP Integration - Supports MCP servers for extending capabilities with tools like Linear and GitHub, making it easy to connect your entire development workflow.
Planning Mode - Features a dedicated planning interface that lets you review and approve multi-step operations before execution.
Project Memory - Use GROK.md files to create persistent project knowledge that remembers your preferences, coding standards, common commands, and other decisions.
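To make the project-memory idea concrete, here is a minimal sketch of how a CLI agent could fold a GROK.md file into its system prompt. This is an assumption about the general pattern, not Grok CLI's actual implementation; `build_system_prompt` and the "# Project memory" heading are hypothetical.

```python
from pathlib import Path
import tempfile


def build_system_prompt(project_dir: Path, base_prompt: str,
                        memory_file: str = "GROK.md") -> str:
    """Append the project's memory file to the base prompt, if one exists."""
    memory_path = project_dir / memory_file
    if not memory_path.is_file():
        return base_prompt  # no project memory: use the base prompt unchanged
    notes = memory_path.read_text(encoding="utf-8").strip()
    return f"{base_prompt}\n\n# Project memory\n{notes}"


# Demo in a throwaway directory so the example has no side effects.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    without_memory = build_system_prompt(project, "You are a coding assistant.")
    (project / "GROK.md").write_text("Use pnpm, not npm.\n", encoding="utf-8")
    with_memory = build_system_prompt(project, "You are a coding assistant.")

print(with_memory)
```

Because the file is re-read on every run, preferences written once persist across sessions without any database.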
AI today can chat brilliantly, but video is like a foreign language it is still learning to speak.
Current LLMs like Gemini max out at processing about an hour of video before hallucinating wildly, while humans effortlessly recall decades of visual experiences with stunning detail.
Memories.ai has just launched the world's first Large Visual Memory Model, designed to give AI human-like visual memory capabilities. Founded by a former Meta researcher, the company has created a system that can compress, index, and retrieve video moments across virtually unlimited timeframes.
The technology works by mimicking human memory mechanisms through specialized models for query processing, retrieval, selection, and reconstruction, enabling AI to see and remember the world the way humans do.
Key Highlights:
Unlimited context processing - While even Gemini caps out at 1 hour of video, this model handles virtually unlimited video lengths with ultra-low hallucination rates, accurately analyzing over 10 hours of content like entire TV seasons.
Human-inspired architecture - Built using specialized models that mirror human memory processes including cue detection, coarse retrieval, fine-grained extraction, and memory reconstruction for more natural visual understanding.
Real-world ready - Currently deployed with security companies for threat detection, media teams for video library management, and marketing teams for trend analysis and influencer identification.
Performance leadership - Sets new benchmarks across all major video understanding tasks, including classification, retrieval, and question answering with precise millisecond-level timing accuracy.
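The coarse-retrieval and fine-grained-extraction stages named above can be illustrated with a toy two-stage search. This is not Memories.ai's model: plain word overlap stands in for learned embeddings, and `Clip`, `coarse_retrieve`, and `fine_extract` are hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Clip:
    video: str
    start_s: float
    summary: str        # coarse description of the whole clip
    details: List[str]  # finer-grained moments inside the clip


def words(text: str) -> Set[str]:
    return set(text.lower().split())


def coarse_retrieve(query: str, clips: List[Clip], top_k: int = 2) -> List[Clip]:
    """Stage 1: cheaply shortlist clips whose summary overlaps the query."""
    ranked = sorted(clips, key=lambda c: len(words(query) & words(c.summary)),
                    reverse=True)
    return ranked[:top_k]


def fine_extract(query: str, candidates: List[Clip]) -> Tuple[Clip, str]:
    """Stage 2: pick the single best-matching moment among the shortlist."""
    pairs = [(clip, moment) for clip in candidates for moment in clip.details]
    return max(pairs, key=lambda p: len(words(query) & words(p[1])))


clips = [
    Clip("ep1.mp4", 0.0, "kitchen cooking scene",
         ["chef chops onions", "chef burns the sauce"]),
    Clip("ep1.mp4", 600.0, "car chase downtown",
         ["red car runs a light", "police car crashes"]),
    Clip("ep2.mp4", 120.0, "kitchen argument scene",
         ["chef throws a pan", "waiter drops plates"]),
]

query = "kitchen scene where chef throws a pan"
clip, moment = fine_extract(query, coarse_retrieve(query, clips))
print(clip.video, clip.start_s, moment)
```

The point of the two stages is cost: the cheap pass prunes most of the library so the expensive pass only inspects a handful of candidates.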
Quick Bites
Claude Code can now create and use multiple specialized AI sub-agents for task-specific workflows. These custom agents can take on dedicated roles, for example a code reviewer, debugger, or data analyst, each with its own context window and tool permissions. You can delegate complex workflows to these specialized agents while keeping your main conversation focused on high-level strategy. The best part is they're reusable across projects and can be shared with your team for consistent workflows. Check out this tutorial on a deep research system using Claude Code sub-agents.
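A rough way to picture "its own context window and tool permissions" is the sketch below. It is a conceptual model only, not how Claude Code implements sub-agents; `SubAgent`, its fields, and the tool names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SubAgent:
    name: str
    allowed_tools: Set[str]                            # per-agent tool allowlist
    context: List[str] = field(default_factory=list)   # private history

    def run(self, task: str, tool: str) -> str:
        """Do a task with one tool, refusing tools outside the allowlist."""
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not use {tool}")
        self.context.append(task)  # only this agent's context grows
        return f"[{self.name}] {task} via {tool}"


reviewer = SubAgent("code-reviewer", {"read", "grep"})
debugger = SubAgent("debugger", {"read", "bash"})

# The main conversation only stores the short result, not the agent's history.
main_context = ["Plan the release"]
main_context.append(reviewer.run("review auth.py", "read"))

try:
    reviewer.run("run the test suite", "bash")
except PermissionError as err:
    print(err)

print(main_context)
```

Since each sub-agent keeps its own history, the main conversation stays short even when a delegated task involves many intermediate steps.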
Apple just dropped FastVLM, a vision-language model that runs blazingly fast on your iPhone while actually seeing what's in high-resolution images. The hybrid encoder architecture processes visual data up to 85x faster than comparable models without the accuracy penalty that comes with speed optimizations. The model runs entirely on-device with near real-time performance on iPhone GPUs.
Lovable just upgraded from “ask-and-edit” to “think-decide-act” with a new Agent Mode, currently in beta. It can reason through tasks, break problems down, explore your codebase, and execute autonomously, slashing build errors by over 90%.
Developer-like workflow: Interprets requests, explores codebases, uncovers missing context, and auto-fixes issues
Real-time capabilities: Live web search for documentation, on-demand image creation, and codebase exploration
Cost structure: Simple requests cost under 1 credit, while complex builds cost more based on actual resource usage
Tools of the Trade
Browser Use Search API: Stop using OpenAI for real-time web search - they index the web for fast yet inaccurate replies. This API enables AI agents to interact with live websites by clicking through menus, forms, and dynamic content, just like a human user. It fetches real-time data instead of relying on cached or surface-level search results.
NeuralAgent: Opensource personal AI assistant that gets things done. It lives on your desktop, types, clicks, navigates the browser, fills out forms, sends emails, and performs tasks autonomously using LLMs.
Deep Graph MCP: MCP server to analyze and visualize large codebases with any MCP client like Claude Code. It provides advanced code exploration tools like semantic search, dependency analysis, etc., which scale better than Claude's native repository search capabilities.
Claude Code Templates: CLI tool that automatically configures Claude Code for any project with framework-specific commands, automation hooks, and MCP server integrations. It includes a real-time analytics dashboard for monitoring Claude Code sessions and usage patterns.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, MCP, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
Hot Takes
Old inequality: "I have access to knowledge, you don't."
New inequality: "We both have infinite knowledge. I know what to do with it." ~
Emad Mostaque
Anthropic can go another 10x in revenue if they just rename Claude Code -> Claude Agent and move it from the CLI to a Desktop app.
People just don't get how good this is 😅 ~
Nikunj Kothari
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉