Meta's Opensource NotebookLM

PLUS: Test Computer Use in isolated sandbox, Gemini 2.0 coming in December

Shubham Saboo & Gargi Gupta
October 28, 2024

Today’s top AI Highlights:

Test and run LLMs with Computer Use in isolated sandbox
Meta releases NotebookLlama - Opensource version of Google NotebookLM
Google to release Gemini 2.0 and its own Computer Use in December
First-of-its-kind robotic Torso powered by artificial muscles
Use Anthropic Computer Use on Mac and Windows with a single command

& so much more!

Read time: 3 mins

AI Tutorials

AI tools are changing how we handle financial data, and building a team of AI agents that can act as financial analysts makes it even better.

This guide shows you how to set up a multi-agent financial analyst system using GPT-4o in just 20 lines of Python code. The system integrates three agents that work together to deliver meaningful financial insights quickly:
• A web agent for general internet research
• A finance agent for detailed financial analysis
• A team agent for coordinating between agents

We are using Phidata, a framework for building agent-based systems, to streamline the entire setup.
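To make the coordination idea concrete, here is a minimal sketch of the team pattern in plain Python. It does not use Phidata or GPT-4o: the agent functions, the keyword list, and the routing rule are all hypothetical placeholders, and a real team agent would let the model decide which agent to call.

```python
from typing import List

# Hypothetical agents: plain functions standing in for LLM-backed agents.
def web_agent(question: str) -> str:
    """Placeholder for an agent that would search the internet."""
    return f"[web] search results for: {question}"

def finance_agent(question: str) -> str:
    """Placeholder for an agent that would pull financial data."""
    return f"[finance] market data for: {question}"

# Assumption: a crude keyword rule decides when the finance agent is needed.
FINANCE_KEYWORDS = ("stock", "price", "earnings", "revenue")

def team_agent(question: str) -> List[str]:
    """Coordinator: always asks the web agent, and adds the finance
    agent when the question looks financial."""
    answers = [web_agent(question)]
    if any(word in question.lower() for word in FINANCE_KEYWORDS):
        answers.append(finance_agent(question))
    return answers

if __name__ == "__main__":
    for line in team_agent("What is the latest stock price of NVDA?"):
        print(line)
```

The point of the sketch is the shape: specialised agents stay small, and one coordinator owns the decision about who answers what.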

We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

Build a Team of AI Agents to Create an AI Financial Analyst

Multi-agent app with web access in just 20 lines of Python Code (step-by-step instructions)

🎁 Bonus worth $50 💵

Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get an AI resource pack worth $50 for FREE. Valid for a limited time only!

Latest Developments

Secure and Customizable Computer Use for LLMs 🖥️

E2B.dev has introduced the Desktop Sandbox (beta), a cloud-based isolated environment to let your LLMs interact with a familiar desktop GUI. You can customize the environment, integrate it with various tools, and run long sessions without cold starts. This sandbox is optimized for secure “Computer Use,” similar to Anthropic’s. It offers full control over the filesystem, supports programmatic actions like keyboard/mouse input, and enables running commands directly inside the sandbox.

Key Highlights:

Secure and Isolated - Each LLM interacts with the GUI within its own secure sandbox, preventing interference and ensuring safe execution of dynamically generated code. This is crucial for protecting your systems when working with untrusted or experimental AI code.
Rapid Startup - Sandboxes launch in just 300-500ms, eliminating frustrating wait times and enabling near real-time interaction with desktop applications. This rapid initialization is perfect for interactive AI applications or automated tasks requiring quick responses.
Programmatic Control with Python SDK - The provided Python SDK gives you granular control over mouse actions, keyboard input, screen interaction, and file system operations within the sandbox. This allows for complex automation sequences and integration with your existing Python codebase.
Customizable Environments - Tailor the sandbox to your exact needs by installing required third-party libraries and configuring the desktop environment. This flexibility ensures compatibility with your specific applications and workflows.

Meta’s Opensource Recipe for Google NotebookLM 📖🎙️

Google’s NotebookLM is all the rage on social media and many say that it is one of the most revolutionary AI products Google has ever released. Its ability to transform documents into engaging podcast-style audio has captured everyone's attention.

But NotebookLM being closed source didn’t give us any scope to tinker with it. The good news is, there's now an opensource recipe to replicate its core functionality and even build upon it: introducing NotebookLlama. Using Meta's Llama 3 models and opensource text-to-speech tools, NotebookLlama provides a flexible and customizable framework for generating podcast-style audio from PDFs.

Key Highlights:

Modular and Customizable Pipeline - The workflow is broken down into 4 distinct stages: PDF pre-processing, transcript generation, transcript enhancement (dramatization), and text-to-speech. This modularity allows you to isolate and optimize individual components.
Experiment with Llama Models - Explore the trade-offs between resource usage and output quality by using different Llama models, ranging from the powerful 70B model to lighter-weight alternatives suitable for resource-constrained environments.
Customization with Prompts - Tailor the generated audio's tone, style, and level of detail by adjusting the prompts at each stage. The project provides sample prompts as a starting point for experimentation, enabling quick customization and iterative refinement.
Extend with More Features - NotebookLlama is a great starting point to experiment and expand further. You can try advanced TTS models, or have multiple AI agents debate scenarios, or add support for diverse input formats beyond PDFs (like web pages and audio files).
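The four-stage structure described above can be sketched as a chain of swappable functions. This is an illustrative outline only, not NotebookLlama's actual code: every function name is hypothetical, and each stage is a trivial placeholder where the real pipeline would call a Llama model or a text-to-speech model.

```python
from typing import Callable, List, Sequence, Tuple

Turn = Tuple[str, str]  # (speaker, line)

def preprocess(raw_text: str) -> str:
    """Stage 1: clean extracted PDF text (placeholder: collapse whitespace)."""
    return " ".join(raw_text.split())

def write_transcript(clean_text: str) -> List[Turn]:
    """Stage 2: turn text into dialogue (placeholder: alternate speakers per sentence)."""
    sentences = [s.strip() for s in clean_text.split(".") if s.strip()]
    speakers = ("Speaker 1", "Speaker 2")
    return [(speakers[i % 2], s + ".") for i, s in enumerate(sentences)]

def dramatize(turns: List[Turn]) -> List[Turn]:
    """Stage 3: rewrite for delivery (placeholder: Speaker 2 gets an interjection)."""
    result = []
    for who, line in turns:
        if who == "Speaker 2":
            line = "Right. " + line
        result.append((who, line))
    return result

def synthesize(turns: List[Turn]) -> List[str]:
    """Stage 4: text-to-speech (placeholder: one labelled string per audio segment)."""
    return [f"[{who}] {line}" for who, line in turns]

DEFAULT_STAGES: Sequence[Callable] = (preprocess, write_transcript, dramatize, synthesize)

def run_pipeline(raw_text: str, stages: Sequence[Callable] = DEFAULT_STAGES):
    """Feed the output of each stage into the next one."""
    result = raw_text
    for stage in stages:
        result = stage(result)
    return result

if __name__ == "__main__":
    for segment in run_pipeline("Llama  models vary in size.\nSmaller ones run on less hardware."):
        print(segment)
```

Because each stage only depends on the previous stage's output, you could replace one function (say, a different model for transcript writing) without touching the others.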

Quick Bites

It seems OpenAI is not the only one gearing up to release a new model this December. Google is also reportedly planning to release Gemini 2.0 in December. Though the model would have new capabilities, it is currently not showing the performance gains that the team led by Google DeepMind CEO Demis Hassabis had hoped for.

Not just this, Google is also planning to release its own version of Computer Use that’ll be powered by Gemini 2.0. Named Project Jarvis, it would carry out tasks for users, like “gathering research, purchasing a product, or booking a flight.” Just like Claude, Gemini 2.0 will capture screenshots of the UI to interpret them and take action.

Meta has struck its first partnership with a news publisher, Reuters, to integrate real-time news updates into its AI chatbot, offering U.S. users news summaries and links to Reuters content across Facebook, Instagram, WhatsApp, and Messenger.

You can now use Anthropic’s Computer Use with a simple command through Open Interpreter. Just run
pip install open-interpreter
interpreter --os
to let Claude control your computer and operate as an autonomous AI Agent. It works on both Mac and Windows.

Here’s an incredibly life-like robot Torso, first of its kind, powered by artificial muscles. AI robotics startup Clone Robotics has unveiled "Torso," a bimanual android that mimics human-like arm and shoulder movements. This marks the first bimanual Torso featuring an actuated elbow, neck, and complex shoulder joints. The company is currently training Torso for coordinated two-armed manipulation tasks.

Tools of the Trade

Computer Use (for Mac): This fork of Anthropic Computer Use allows native macOS interaction without Docker. It supports GUI interaction, keyboard and mouse control, screen capture, and multiple LLMs via a Streamlit interface.
ask.py: A single Python program to implement the search-extract-summarize flow, similar to AI search engines such as Perplexity. It supports hybrid search, content filtering, and custom query options.
expand.ai: Converts any website into a type-safe API for instant access to structured data with customizable schemas. It manages scraping infrastructure, browser handling, and data validation.
Awesome LLM Apps: Build awesome LLM apps using RAG to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple text. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.

Hot Takes

Steve Jobs would have used Claude. ~
Pietro Schirano
Most people think AI safety is about preventing Skynet. They're fighting yesterday's war.
My conversations with AI researchers confirm we've solved the extinction risk. The safety movement succeeded beyond our hopes.
The real threat is economic: Moore's Law for AI compute has accelerated for 12 decades straight. Some people keep claiming we have 80 years. Data says 20, max.
AI and robots will become the most valuable asset class. When capitalists own all automated production, the social contract breaks. ~
David Shapiro

Meme of the Day

AI founders after laying off their AI safety team
— Jason (@mytechceoo)
8:55 PM • Oct 26, 2024

That’s all for today! See you tomorrow with more such AI-filled content.

Unwind AI - X | LinkedIn | Threads | Facebook

Awesome LLM Apps | Sponsor Us

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉
