- unwind ai
- Posts
- xAI's Grok 4 is the Smartest AI in the World
xAI's Grok 4 is the Smartest AI in the World
PLUS: Perplexity releases Comet AI browser, Hugging Face's $299 opensource robot
Today’s top AI Highlights:
Perplexity Comet makes AI agents your new gateway to the internet
xAI’s Grok 4 models make OpenAI o3, Claude 4 Opus, and Gemini 2.5 Pro look a decade old
Anthropic just dropped free courses on API, Claude Code, and MCP
Hugging Face releases $299 opensource, fully hackable desktop robot
Make AI agents use your MacBook apps autonomously
& so much more!
Read time: 3 mins
AI Tutorial
Business consulting has always required deep market knowledge, strategic thinking, and the ability to synthesize complex information into actionable recommendations. Today's fast-paced business environment demands even more - real-time insights, data-driven strategies, and rapid response to market changes.
In this tutorial, we'll create a powerful AI business consultant using Google's Agent Development Kit (ADK) combined with Perplexity AI for real-time web research. This consultant will conduct market analysis, assess risks, and generate strategic recommendations backed by current data, all through a clean, interactive web interface.
We share hands-on tutorials like this every week, designed to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
Latest Developments
The internet is getting redesigned not for humans, but for AI agents.
Agentic AI browsers like Dia, Fellou, and Genspark AI browsers show how AI agents are now doing all the web tasks for you - autonomously navigating websites and executing workflows.
Now, Perplexity dropped Comet, a browser that doesn't just browse but actively thinks alongside you. Built as a "thought partner" for your entire digital life, Comet transforms scattered tabs and endless clicking into fluid conversations where AI handles complete workflows while you focus on what actually matters.
You're no longer managing tabs and applications; you're collaborating with an AI that treats the internet as your extended mind.
Key Highlights:
Conversation-Driven - Instead of manually booking meetings or comparing products across multiple tabs, you simply tell Comet what you need, and it executes complete browsing sessions autonomously. Ask it to "find sites with the same bike that ship faster" or "brief me for my day".
Perplexity Search - Comet ships with Perplexity's AI search as the default engine, replacing traditional blue links with AI-generated summaries and accurate answers. Every webpage becomes a portal of curiosity where highlighting text provides instant explanations and questions get contextual responses.
Persistent AI Assistant - Comet Assistant lives in every webpage through a sidecar interface, understanding what you're viewing, be it YouTube videos, web pages, Slack chats, Maps, anything. The more you use it, the better it learns how you think and work.
Access - Currently available to Perplexity Max subscribers ($200/month) with invite-only expansion planned. Comet aims for "infinite retention" by becoming your default gateway to an AI-first internet.
Inventory Software Made Easy—Now $499 Off
Looking for inventory software that’s actually easy to use?
inFlow helps you manage inventory, orders, and shipping—without the hassle.
It includes built-in barcode scanning to facilitate picking, packing, and stock counts. inFlow also integrates seamlessly with Shopify, Amazon, QuickBooks, UPS, and over 90 other apps you already use
93% of users say inFlow is easy to use—and now you can see for yourself.
Try it free and for a limited time, save $499 with code EASY499 when you upgrade.
Free up hours each week—so you can focus more on growing your business.
✅ Hear from real users in our case studies
🚀 Compare plans on our pricing page
Elon Musk’s xAI literally dropped the smartest AI in the world. And while this sounds like typical Musk hyperbole, the benchmarks are backing up the bold statement. The company released two models - Grok 4 and Grok 4 Heavy - both reasoning models, with the Heavy version cranking up test-time compute for deeper reasoning.
The benchmark numbers feel almost surreal. The performance gaps aren't marginal improvements. They're the kind of jumps that make you wonder if we're witnessing a genuine capability leap. Grok 4 obliterated previous records on ARC-AGI, scoring so high it makes OpenAI's o3, Claude Opus 4, and Gemini 2.5 Pro look pedestrian by comparison.
On Humanity's Last Exam, Grok 4 Heavy achieves an unprecedented score of 44%. Grok 4, even without tool-use, matches the performance of OpenAI's Deep Research with o3-mini and China's top Kimi Researcher system.
The pricing structure tells its own story about where the industry is heading. xAI has released a new subscription tier, SuperGrok Heavy, that costs $300 a month, higher than any Pro/Max subscriptions in this space.
Key Highlights:
Model Features - Both models are multimodal reasoning models with native tool use and a context window of 256K tokens. The models also integrate with X to search for real-time information on the platform and respond.
Benchmark dominance - Achieved all-time highs across GPQA Diamond (88%), AIME 2024 (94%), and MMLU-Pro (87%), outperforming state-of-the-art models like o3-pro, Gemini 2.5 Pro, and Claude 4 Opus with a very high margin.
Independent Testing - xAI gave Artificial Analysis and Andon Labs (Vending Bench) team early access to Grok 4 for independent testing. Artificial Analysis gave Grok 4 an Intelligence Index of 73, ahead of o3 (70), Gemini 2.5 Pro (70), and Claude 4 Opus (64).
To assess the model’s capabilties in complex, long-horizon tasks, the model was tested on a real-world Vending machine business where it generated a profit of $4700, that’s double the previous highest of Claude 4 Opus, and around 6x higher than humans.Availability - Grok 4 is now available to SuperGrok users and via API. Grok 4’s pricing is equivalent to Grok 3 at $3/$15 per 1M input/output tokens. The per-token pricing is identical to Claude 4 Sonnet, but more expensive than Gemini 2.5 Pro and OpenAI o3.
Grok 4 Heavy is currently only available to SuperGrok Heavy users.
xAI isn't stopping here. Their next 3 months include a specialized coding model in August, a multimodal agent in September (they didn’t expand on what this agent is, but we’re assuming it’d be a Superagent like Manus AI and Genspark powered by Grok 4), and video generation capabilities by October.
Watch Grok 4 build an AI Investment Agent team just by going through a documentation. Replit lets us run, test, and deploy that agent team directly from the browser - all of this in less than 2 mins.
We will be sharing more tutorials on Unwind AI and in the Awesome LLM Apps repo using Grok 4. Stay tuned!
Quick Bites
While everyone's building AI apps, Anthropic decided to teach you how to actually build them properly. Anthropic has launched a free educational platform with 6 courses covering everything from API fundamentals to advanced Model Context Protocol implementation. The courses feature dozens of lectures, self-guided quizzes, and shareable certificates, all built with input from developers already using Claude in production environments. Check them out, they are completely free!
Hugging Face just dropped their first robot, and it's not what you'd expect from the model hub giants. Meet Reachy Mini, a $299 opensource stuffed toy-sized desktop robot, fully programmable in Python, complete with expressive movements and multimodal capabilties. It comes with 15+ pre-built behaviors and connects directly to Hugging Face's model ecosystem. Think of it as a hardware extension of the Hugging Face ecosystem, where you can upload, share, and download robot behaviors just like you would with models, but now they move around your desk.
Just when everyone assumed decoder-only models had won the architecture wars, Google reminded us why encoder-decoder setups dominated NLP for years. Google’s new T5Gemma models are built by converting pretrained Gemma 2 models into encoder-decoder format, creating models that consistently outperform their decoder-only counterparts on the quality-efficiency frontier. The standout feature is the flexibility to mix encoder and decoder sizes - like pairing a 9B encoder with a 2B decoder - allowing you to fine-tune the performance-speed trade-off for specific tasks. Models are available to download on Hugging Face and Kaggle.
We usually don't cover funding and valuation news, but LangChain hitting unicorn status feels like watching the opensource darling finally cash in on developer love. The startup that once solved LLMs' inability to access real-time data is now raising at a $1 billion valuation from IVP, despite OpenAI and Anthropic building similar capabilities directly into their APIs. Their pivot to LangSmith for LLM observability is apparently printing money at $12-16M ARR.
Tools of the Trade
MacOS-use: Give AI agents control over your Mac applications by tapping into the accessibility tree that every app exposes. It's like Browser Use but for your entire desktop - agents can click, type, and navigate through any Mac software using simple prompts.
Lens AI: Turns your phone into an AI hardware assistant for field operations. Upload your schematics and datasheets, then point your camera at any component to get instant technical answers from your own documentation. Especially made for hardware and manufacturing engineers.
Dev Atrophy Test: Can you still code without AI? Find out if you're still a dev lord or actually just larping. It's a test of your core web dev knowledge — no handholding, no back rubs, no AI autocomplete. Just you, your brain, and 10 questions. There are 3 levels (Noobie, Le Chad, Hardcore), and the questions cover HTML, CSS, JavaScript, databases, and Node.
Awesome LLM Apps: Build awesome LLM apps with RAG, AI agents, MCP, and more to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos, and automate complex work.
Hot Takes
yeah, most people on this planet think Claude is a french guy sipping espresso with a croissant, not an LLM.
good reminder we're ridiculously early. ~
Greg IsenbergYou can cut & paste your entire source code file into the query entry box on grok.com and Grok 4 will fix it for you!
This is what everyone at xAI does. Works better than Cursor. ~
That’s all for today! See you tomorrow with more such AI-filled content.
Don’t forget to share this newsletter on your social channels and tag Unwind AI to support us!
PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉
Reply