Build an Agentic RAG App with Reasoning
Fully functional agentic RAG app with step-by-step instructions (100% open source)
Traditional RAG has served us well, but it falls short for complex use cases. While vanilla RAG can retrieve documents and generate responses, agentic RAG adds a layer of intelligence and adaptability that changes how we build AI applications. Most RAG implementations are also black boxes: you ask a question and get an answer, but have no idea how the system arrived at that conclusion.
In this tutorial, we'll build an agentic RAG system with transparent reasoning using Claude 4 Sonnet and OpenAI embeddings. You'll create a system where you can literally watch the AI agent think through problems, search for information, analyze results, and formulate answers - all in real-time.
Here’s what we’ll use to build this app:
Agno Framework - Agent orchestration with built-in reasoning. Agno’s ReasoningTools add step-by-step analysis to the agent (think + analyze functions)
Claude 4 Sonnet - Advanced language model for reasoning and response generation
OpenAI - Embedding model for vector search and semantic matching
LanceDB - Vector database with hybrid search (keyword + semantic)
Streamlit - Interactive web interface
What We’re Building
This Streamlit application implements a sophisticated RAG system that demonstrates transparent AI reasoning. Unlike traditional RAG systems, this implementation shows you exactly how the agent thinks through problems, searches for information, and arrives at conclusions.
Features:
Agentic architecture with specialized reasoning capabilities
Real-time reasoning visualization - watch the agent think step-by-step
Interactive knowledge base management - add URLs and documents dynamically
Hybrid search combining keyword and semantic matching
Vector search using OpenAI embeddings for semantic matching
Source attribution with citations for transparency
Side-by-side reasoning and answer display for complete visibility
How The App Works
Knowledge Base Setup: Documents are loaded from URLs using Agno’s UrlKnowledge, chunked and embedded with OpenAI’s embedding model, then stored in LanceDB with hybrid search combining keyword and semantic matching.
Agent Processing: User queries trigger the agent's reasoning process, where ReasoningTools help the agent think step-by-step and analyze the results of tool calls. The agent searches the knowledge base for relevant information, and Claude 4 Sonnet generates comprehensive answers with proper citations.
UI Flow: Users enter API keys, add knowledge sources through the sidebar, ask questions in the main interface, and watch the reasoning process unfold in real-time alongside the final answer generation, complete with source citations for transparency.
Prerequisites
Before we begin, make sure you have the following:
Python installed on your machine (version 3.10 or higher is recommended)
Your Anthropic and OpenAI API keys
A code editor of your choice (we recommend VS Code or PyCharm for their excellent Python support)
Basic familiarity with Python programming
Code Walkthrough
Setting Up the Environment
First, let's get our development environment ready:
Clone the GitHub repository:
git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
Go to the agentic_rag_with_reasoning folder:
cd rag_tutorials/agentic_rag_with_reasoning
Install the required dependencies:
pip install -r requirements.txt
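If you’re curious what that installs, the requirements file should cover at least the packages imported in the code below. Treat this as a rough sketch - the exact names and version pins live in the repo, and tantivy is our assumption for the keyword side of LanceDB’s hybrid search:
agno
anthropic
openai
lancedb
tantivy
streamlit
python-dotenv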
Creating the Streamlit App
Let’s create our app. Create a new file named rag_reasoning_agent.py and add the following code:
Import necessary libraries:
import streamlit as st
from agno.agent import Agent, RunEvent
from agno.embedder.openai import OpenAIEmbedder
from agno.knowledge.url import UrlKnowledge
from agno.models.anthropic import Claude
from agno.tools.reasoning import ReasoningTools
from agno.vectordb.lancedb import LanceDb, SearchType
from dotenv import load_dotenv
import os
# Load environment variables
load_dotenv()
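Because the app calls load_dotenv(), you can keep your keys in a .env file next to the script instead of typing them into the UI on every run. A minimal sketch, using the same variable names the os.getenv calls below expect:
ANTHROPIC_API_KEY=your-anthropic-key
OPENAI_API_KEY=your-openai-key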
Set up Streamlit configuration and UI:
st.set_page_config(
    page_title="Agentic RAG with Reasoning",
    page_icon="🧐",
    layout="wide"
)
st.title("🧐 Agentic RAG with Reasoning")
st.markdown("""
This app demonstrates an AI agent that:
1. **Retrieves** relevant information from knowledge sources
2. **Reasons** through the information step-by-step
3. **Answers** your questions with citations
""")
Create API key configuration:
st.subheader("🔑 API Keys")
col1, col2 = st.columns(2)
with col1:
    anthropic_key = st.text_input(
        "Anthropic API Key",
        type="password",
        value=os.getenv("ANTHROPIC_API_KEY", ""),
        help="Get your key from https://console.anthropic.com/"
    )
with col2:
    openai_key = st.text_input(
        "OpenAI API Key",
        type="password",
        value=os.getenv("OPENAI_API_KEY", ""),
        help="Get your key from https://platform.openai.com/"
    )
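The loaders below read both keys, so it helps to stop the script early when either one is missing. This guard is our addition rather than part of the original walkthrough, but it keeps the cached resources from being built with empty keys:
if not (anthropic_key and openai_key):
    st.info("Please enter both API keys to continue.")
    st.stop()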
Initialize the knowledge base with hybrid search:
@st.cache_resource(show_spinner="📚 Loading knowledge base...")
def load_knowledge() -> UrlKnowledge:
    kb = UrlKnowledge(
        urls=["https://docs.agno.com/introduction/agents.md"],
        vector_db=LanceDb(
            uri="tmp/lancedb",
            table_name="agno_docs",
            # Hybrid search = keyword + semantic matching
            search_type=SearchType.hybrid,
            embedder=OpenAIEmbedder(
                api_key=openai_key
            ),
        ),
    )
    kb.load(recreate=True)
    return kb
Create the reasoning agent:
@st.cache_resource(show_spinner="🤖 Loading agent...")
def load_agent(_kb: UrlKnowledge) -> Agent:
    return Agent(
        model=Claude(
            id="claude-sonnet-4-20250514",
            api_key=anthropic_key
        ),
        knowledge=_kb,
        search_knowledge=True,
        tools=[ReasoningTools(add_instructions=True)],
        instructions=[
            "Include sources in your response.",
            "Always search your knowledge before answering the question.",
        ],
        markdown=True,
    )
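The sidebar and query sections below refer to knowledge and agent objects, so create both once the cached loaders are defined (this wiring step isn’t shown elsewhere in the walkthrough):
knowledge = load_knowledge()
agent = load_agent(knowledge)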
Implement dynamic knowledge management:
with st.sidebar:
    st.header("📚 Knowledge Sources")
    # Show current URLs
    st.write("**Current sources:**")
    for i, url in enumerate(knowledge.urls):
        st.text(f"{i+1}. {url}")
    # Add new URL functionality
    new_url = st.text_input("Add new URL")
    if st.button("➕ Add URL", type="primary"):
        if new_url:
            knowledge.urls.append(new_url)
            knowledge.load(recreate=False, upsert=True, skip_existing=True)
            st.success(f"✅ Added: {new_url}")
Create the query interface with real-time reasoning:
query = st.text_area("Your question:", value="What are Agents?", height=100)
if st.button("🚀 Get Answer with Reasoning", type="primary"):
    col1, col2 = st.columns([1, 1])
    with col1:
        st.markdown("### 🧠 Reasoning Process")
        reasoning_placeholder = st.empty()
    with col2:
        st.markdown("### 💡 Answer")
        answer_placeholder = st.empty()
Implement streaming reasoning and response (this block continues inside the button handler above):
    reasoning_text = ""
    answer_text = ""
    citations = []

    for chunk in agent.run(
        query,
        stream=True,
        show_full_reasoning=True,
        stream_intermediate_steps=True,
    ):
        # Update reasoning display
        if chunk.reasoning_content:
            reasoning_text = chunk.reasoning_content
            reasoning_placeholder.markdown(reasoning_text)
        # Update answer display
        if chunk.content and chunk.event in {RunEvent.run_response, RunEvent.run_completed}:
            if isinstance(chunk.content, str):
                answer_text += chunk.content
                answer_placeholder.markdown(answer_text)
        # Collect citations
        if chunk.citations and chunk.citations.urls:
            citations = chunk.citations.urls
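The loop collects citations but doesn’t render them yet. A short addition of ours that lists the collected URLs under the answer once streaming finishes (still inside the button handler):
    if citations:
        st.markdown("### 📚 Sources")
        for url in citations:
            st.markdown(f"- {url}")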
Running the App
With our code in place, it's time to launch the app.
In your terminal, navigate to the project folder, and run the following command:
streamlit run rag_reasoning_agent.py
Streamlit will provide a local URL (typically http://localhost:8501). Open your web browser and navigate to this URL to interact with your Agentic RAG with reasoning app.
Working Application Demo
Conclusion
You've just built an agentic RAG system with reasoning that lets the agent think through and analyze each query, and lets you peek inside its thought process. It's a good fit for critical applications where explainability matters.
This setup can now be expanded further:
Adding document upload functionality: Allow users to upload PDFs, Word docs, and other file types directly
Implementing memory persistence: Store conversation history and learned preferences across sessions
Adding voice interaction: Enable voice input and speech output for hands-free querying
Integrating real-time data sources: Connect to APIs for live information like stock prices or news feeds
Keep experimenting with different configurations and features to build more sophisticated AI applications.
We share hands-on tutorials like this 2-3 times a week, to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.