
Vision RAG for Next-gen AI Apps

PLUS: ChatGPT Plus for $44 by 2029, Open-source NotebookLM built on Llama 3.1 405B

In partnership with

FREE AI & ChatGPT Masterclass to automate 50% of your workflow

More than 300 million people use AI across the globe, but only the top 1% know which tools fit which use cases.

Join this free masterclass to learn the 25 most useful AI tools on the internet, completely free of cost (only 100 free seats available!)

This masterclass will teach you how to:

  • Build business strategies & solve problems like a pro

  • Write content for emails, socials & more in minutes

  • Build AI assistants & custom bots in minutes

  • Research 10x faster, do more in less time & make your life easier

You’ll wish you knew about this FREE AI masterclass sooner 😉

Today’s top AI Highlights:

  1. Supercharge your document search with Vision RAG

  2. Build and deploy high-performance LLM apps with LightLLM

  3. California governor vetoed the AI safety bill SB 1047

  4. ChatGPT Plus subscription might cost $44 by 2029

  5. Open-source AI-powered code editor forked from VS Code and Continue

  6. Open-source NotebookLM built on Llama 3.1 405B

& so much more!

Read time: 3 mins

AI Tutorials

Meta’s new Llama 3.2 models are here, offering incredible advancements in speed and accuracy for their size. Do you want to fine-tune the models but are worried about the complexity and cost? Look no further!

In this blog post, we’ll walk you through fine-tuning Llama 3.2 models (1B and 3B) using Unsloth AI and Low-Rank Adaptation (LoRA) for efficient tuning, in just 30 lines of Python code. You can use your own dataset.

With Unsloth, training runs 2x faster than standard fine-tuning. And the best part? You can fine-tune Llama 3.2 for free on Google Colab.
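The trick behind LoRA is to freeze the pretrained weights and train only a small low-rank update. Here's a minimal numpy sketch of that idea — not the Unsloth API, and the dimensions are illustrative:

```python
import numpy as np

# LoRA: instead of updating a full weight matrix W (d_out x d_in),
# learn a low-rank update B @ A with rank r << min(d_out, d_in).
d_out, d_in, r, alpha = 4096, 4096, 16, 32

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, r x d_in
B = np.zeros((d_out, r))                    # trainable, zero-init

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A, applied lazily
    # so the full update matrix is never materialized.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
y = lora_forward(x)

full = d_out * d_in
lora = r * (d_in + d_out)
print(f"trainable params: {lora:,} vs {full:,} "
      f"({100 * lora / full:.2f}% of full fine-tuning)")
```

With rank 16 on a 4096x4096 layer, the trainable parameter count drops to under 1% of full fine-tuning — which is why LoRA runs comfortably on a free Colab GPU.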

We share hands-on tutorials like this 2-3 times a week, designed to help you stay ahead in the world of AI. If you're serious about levelling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.

🎁 Bonus worth $50 💵

Share this newsletter on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to get an AI resource pack worth $50 for FREE. Valid for a limited time only!

Latest Developments

Need to quickly compare different Vision-based RAG approaches for your project? Check out VARAG, a new engine designed to evaluate and implement various Vision-Augmented RAG techniques. It supports simple OCR RAG, vision RAG, ColPali RAG, and Hybrid ColPali RAG, offering flexibility for diverse document types and use cases. Built with developer usability in mind, VARAG streamlines the process of indexing and querying both textual and visual data using a unified API.

Key Highlights:

  1. Compare 4 Vision RAG techniques - Evaluate Simple OCR, Vision RAG, ColPali, and Hybrid ColPali techniques and determine the best method for your specific document types and performance requirements. See how each handles different data formats and retrieval challenges.

  2. Unified API - Index and query using a consistent set of methods, regardless of the chosen RAG technique. This abstracted implementation accelerates development and allows easy switching between methods with minimal code changes.

  3. Ready-to-run demos and Colab notebooks - Quickly get started with interactive examples and demo scripts for each technique. Experiment with different datasets and query strategies, exploring the strengths and weaknesses of each approach in a hands-on environment.
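The unified-API pattern can be sketched in plain Python. Everything below is a hypothetical illustration of swappable retrieval backends behind one index/query interface — VARAG's actual class and method names may differ, so check its repo for the real API:

```python
from dataclasses import dataclass, field

@dataclass
class KeywordBackend:
    """Stand-in backend using naive keyword-overlap scoring.
    In VARAG's setting this slot would hold OCR RAG, vision RAG,
    ColPali, or the hybrid technique."""
    docs: list = field(default_factory=list)

    def index(self, pages):
        self.docs.extend(pages)

    def query(self, text, k=2):
        words = set(text.lower().split())
        scored = [(len(words & set(d.lower().split())), d) for d in self.docs]
        return [d for score, d in sorted(scored, reverse=True)[:k] if score > 0]

class RAGEngine:
    """Single entry point: swap the backend, keep caller code unchanged."""
    def __init__(self, backend):
        self.backend = backend

    def index(self, pages):
        self.backend.index(pages)

    def query(self, text, k=2):
        return self.backend.query(text, k=k)

engine = RAGEngine(KeywordBackend())
engine.index(["invoice total due friday", "meeting notes from monday"])
hits = engine.query("when is the invoice due")
```

Because callers only ever touch `index()` and `query()`, switching from one RAG technique to another is a one-line change at construction time.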

Tired of slow and resource-intensive LLM inference? LightLLM, a new Python framework, offers a leaner, faster, and more scalable solution for deploying large language models. By combining asynchronous processing, dynamic batching, and advanced memory management, LightLLM maximizes hardware utilization and delivers significant performance gains. Whether you're working with a single GPU or a cluster, LightLLM makes running models like Llama, BLOOM, StarCoder, and many others, smoother and more efficient.

Key Highlights:

  1. Blazing fast inference - Don't let inference bottlenecks slow you down. LightLLM's asynchronous pipeline and dynamic batching dramatically improve GPU utilization, leading to faster processing and lower latency. Independent benchmarks show over double the throughput compared to vLLM on LLaMA-7B with an A800 GPU.

  2. Memory efficiency that matters - Run larger models and handle more concurrent requests without hitting memory limits. LightLLM's Token Attention and high-performance router minimize memory waste, allowing you to get the most out of your hardware. Plus, features like Int8KV Cache double the token capacity for supported models.

  3. Seamless multimodal deployments - Working with image-based models like Qwen-VL and Llava? LightLLM simplifies multimodal deployments with streamlined image input handling and optimized caching. Deploy and query these models with ease, leveraging the same performance benefits as text-based LLMs.

  4. Deployment made easy - Get your LLM serving API up and running quickly with LightLLM's simple and configurable setup. Pre-built Docker containers, helper scripts, and comprehensive documentation streamline the process, allowing you to focus on building your application, not wrestling with infrastructure.
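As an illustration, a client for a generate-style HTTP serving endpoint might look like the sketch below. The URL, route, and payload field names here are assumptions for illustration, not LightLLM's documented schema — consult its docs for the exact request format:

```python
import json
from urllib import request

def build_payload(prompt, max_new_tokens=64, temperature=0.7):
    # Assumed request shape for a generate-style endpoint; the
    # "inputs"/"parameters" field names are illustrative.
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }

def generate(prompt, url="http://localhost:8000/generate"):
    # POST the JSON payload to the serving API and decode the reply.
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = request.Request(url, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_payload("Write a haiku about GPUs.")
```

Since the server batches requests dynamically, many such clients can fire concurrently and still keep the GPU saturated.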

Quick Bites

California Governor Gavin Newsom has vetoed the controversial AI safety bill SB 1047, citing concerns that it could hinder innovation and give a false sense of security. Supporters argue the bill was needed for oversight, while critics, including major tech companies, warned it would stifle progress.

Boost your large model training speed by up to 39% and slash costs with Colossal-AI's latest upgrade. This release introduces a simplified FP8 mixed-precision training solution combining BF16 (O2) and FP8 (O1), delivering substantial performance gains without complex code changes. You can easily integrate this feature and explore the benchmark results across various LLMs.

Arcade, a new generative AI marketplace, launched in beta, allowing users to create and purchase custom products with just a few words or images. Co-founded by Mariam Naficy, the platform starts with jewelry and will expand into other categories, with designs handcrafted by artisans from a global marketplace.

OpenAI is reportedly planning to raise the price of ChatGPT Plus from $20 to $22 per month by the end of the year, with a further increase to $44 by 2029, according to internal documents. The company faces financial pressure despite reaching $300 million in monthly revenue, expecting a $5 billion loss this year.

Tools of the Trade

  1. ChatMLX: Open-source macOS app to run LLMs locally and chat with them, with a visionOS-style UI. It is built on MLX for fast performance on Apple silicon and supports multiple models including Llama, OpenELM, Phi, and Qwen.

  2. Open NotebookLM: Convert your PDFs into podcasts with open-source AI models (Llama 3.1 405B and MeloTTS). Only the text content of the PDF is processed; images and tables are not included. The PDF should be no more than 100K characters due to Llama 3.1 405B’s context length.

  3. Pear AI: Open-source AI code editor built on a fork of VS Code, integrating AI models to boost development speed. It lets you interact directly with your codebase and includes features like automated code generation and debugging tools.

  4. Awesome LLM Apps: Build awesome LLM apps using RAG to interact with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple text. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.

Hot Takes

  1. OpenAI alumni have created its biggest competitors
    - Anthropic is a spin off and makes the best LLMs today
    - SSI is stealing OAI talent and is taking a straight shot to super intelligence
    - Grok has already caught up to SOTA and is training GROK-3
    - Mira says she wants to do our own exploration
    - Karpathy has his own thing
    It feels like we will have a dozen more startups started by OAI folks as VCs scramble to hire and fund them! ~
    Bindu Reddy

  2. tfw it's easier to create software than to google it ~
    Guillermo Rauch

Meme of the Day

That’s all for today! See you tomorrow with more such AI-filled content.


Unwind AI - Twitter | LinkedIn | Threads | Facebook

PS: We curate this AI newsletter every day for FREE, your support is what keeps us going. If you find value in what you read, share it with at least one, two (or 20) of your friends 😉 
