
Build an AI Startup Insight Agent with FIRE-1

Fully functional agent app with step-by-step instructions (100% open source)

When working with web data, we keep running into the same challenge: extracting structured information from dynamic, modern websites. Traditional scraping methods often break on JavaScript-heavy interfaces, login requirements, and interactive elements, which leads to brittle solutions that require constant maintenance.

In this tutorial, we're building an AI Startup Insight application that uses Firecrawl's FIRE-1 agent for robust web extraction. FIRE-1 is an AI agent that can autonomously perform browser actions - clicking buttons, filling forms, navigating pagination, and interacting with dynamic content - while understanding the semantic context of what it's extracting. We'll combine this with OpenAI's GPT-4o to create a complete pipeline from data extraction to analysis in a clean Streamlit interface. We'll use the Agno framework to build the analysis agent.

The FIRE-1 agent solves a key developer pain point: instead of writing custom selectors and JavaScript handlers for each website, you can simply define the data schema you want and provide natural language instructions. The agent handles the complexities of web navigation and extraction, dramatically reducing development time and maintenance overhead.

Don't forget to share this tutorial on your social channels and tag Unwind AI (X, LinkedIn, Threads, Facebook) to support us!

What We're Building

We're building a web extraction and analysis tool that uses Firecrawl's FIRE-1 agent with the extract v1 endpoint, plus the Agno agent framework, to pull the key details of a startup from its website. The application extracts structured data from startup websites and generates an AI-powered business analysis, so you can gather insights about companies without manual research.

Features

  • 🌐 Intelligent Web Extraction:

    • Extract structured data from any company website

    • Automatically identify company information, mission, and product features

    • Process multiple websites in sequence

  • 🔍 Advanced Web Navigation:

    • Interact with buttons, links, and dynamic elements

    • Handle pagination and multi-step processes

    • Access information across multiple pages

  • 🧠 AI Business Analysis:

    • Generate insightful summaries of extracted company data

    • Identify unique value propositions and market opportunities

    • Provide actionable business intelligence

  • 📊 Structured Data Output:

    • Organize information in a consistent JSON schema

    • Extract company name, description, mission, and product features

    • Standardize output for further processing

  • 🎯 Interactive UI:

    • User-friendly Streamlit interface

    • Process multiple URLs in one run, with one results tab per website

    • Clear presentation of extracted data and analysis

How The App Works

The application workflow follows these steps:

  1. URL Collection: The user enters one or more company website URLs in the text area.

  2. FIRE-1 Extraction: When the user clicks "Start Analysis," Firecrawl's FIRE-1 agent is deployed with:

    • A detailed prompt that guides the agent on what information to extract

    • A JSON schema that defines the structure of the output data

    • Parameters that configure the agent's behavior

  3. Intelligent Navigation: Unlike traditional scrapers that only read static HTML, the FIRE-1 agent:

    • Actively clicks buttons and interacts with dynamic elements

    • Navigates through multiple pages when necessary

    • Uses AI reasoning to identify the most relevant content

  4. Data Structuring: The agent automatically converts unstructured web content into a JSON object according to our schema.

  5. AI Analysis: The Agno Agent (powered by GPT-4o) processes the structured data to generate business insights.

  6. Data Presentation: Results are displayed in a tabbed interface for easy comparison of multiple companies.
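
The three inputs listed in step 2 (prompt, schema, and agent parameters) can be bundled by one small helper. This is a minimal sketch: build_extract_params is a hypothetical name, and the dictionary shape simply mirrors the params argument used in the walkthrough code later in this tutorial. The exact call signature may differ between Firecrawl SDK releases, so treat the key names as an assumption to check against the version you have installed.

```python
from typing import Any


def build_extract_params(prompt: str, schema: dict[str, Any], model: str = "FIRE-1") -> dict[str, Any]:
    """Bundle the prompt, JSON schema, and agent settings for one extract call."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if schema.get("type") != "object":
        raise ValueError("the top-level schema is expected to describe an object")
    return {
        "prompt": prompt.strip(),
        "schema": schema,
        # Mirrors the 'agent' entry used in the tutorial's extract call.
        "agent": {"model": model},
    }


# Tiny illustrative schema; the tutorial's real schema has more fields.
example_schema = {
    "type": "object",
    "properties": {"company_name": {"type": "string"}},
    "required": ["company_name"],
}
params = build_extract_params("Extract the official company name.", example_schema)
print(params["agent"])
```

Keeping the bundle in one function makes it easier to reuse the same prompt and schema for every URL and to swap the agent model in a single place.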

Prerequisites

Before we begin, make sure you have the following:

  1. Python installed on your machine (version 3.10 or higher is recommended)

  2. Your OpenAI and Firecrawl API keys

  3. A code editor of your choice (we recommend VS Code or PyCharm for their excellent Python support)

  4. Basic familiarity with Python programming

Code Walkthrough

Setting Up the Environment

First, let's get our development environment ready:

  1. Clone the GitHub repository:

git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
cd awesome-llm-apps/advanced_ai_agents/single_agent_apps/ai_startup_insight_fire1_agent
pip install -r requirements.txt
  2. Grab your API keys: you'll need a Firecrawl API key and an OpenAI API key.

Creating the Streamlit App

Let's create our app. Create a new file ai_startup_insight_fire1_agent.py and add the following code:

  1. Import necessary libraries:

from firecrawl import FirecrawlApp
import streamlit as st
import os
import json
from agno.agent import Agent
from agno.models.openai import OpenAIChat
  2. Set up the Streamlit page configuration:

st.set_page_config(
    page_title="Startup Info Extraction",
    page_icon="🔍",
    layout="wide"
)
st.title("AI Startup Insight with Firecrawl's FIRE-1 Agent")
  3. Create a sidebar for API key configuration:

with st.sidebar:
    st.header("API Configuration")
    firecrawl_api_key = st.text_input("Firecrawl API Key", type="password")
    openai_api_key = st.text_input("OpenAI API Key", type="password")
    st.caption("Your API keys are securely stored and not shared.")
    
    st.markdown("---")
    st.markdown("### About")
    st.markdown("This tool extracts company information from websites using Firecrawl's FIRE-1 agent and provides AI-powered business analysis.")
    
    st.markdown("### How It Works")
    st.markdown("1. 🔍 **FIRE-1 Agent** extracts structured data from websites")
    st.markdown("2. 🧠 **Agno Agent** analyzes the data for business insights")
    st.markdown("3. 📊 **Results** are presented in an organized format")
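
The code above imports os but reads the keys only from the sidebar. If you would rather not paste keys on every run, one option is to fall back to environment variables. The sketch below is an assumption about how you might wire that up, not part of the original app: resolve_api_key is a hypothetical helper, and the variable name FIRECRAWL_API_KEY is illustrative.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(typed_value: str, env_var: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the key typed into the sidebar, else the environment variable, else ''."""
    env = os.environ if environ is None else environ
    typed = (typed_value or "").strip()
    return typed or env.get(env_var, "").strip()


# Hypothetical variable name; use whatever your own environment defines.
fake_env = {"FIRECRAWL_API_KEY": "fc-from-env"}
print(resolve_api_key("", "FIRECRAWL_API_KEY", fake_env))
print(resolve_api_key("fc-typed", "FIRECRAWL_API_KEY", fake_env))
```

In the app, this would wrap the sidebar values, for example resolve_api_key(firecrawl_api_key, "FIRECRAWL_API_KEY"), so a typed key always wins over the environment.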
  4. Create the main content area with information about Firecrawl capabilities:

st.markdown("## 🔥 Firecrawl FIRE-1 Agent Capabilities")
col1, col2 = st.columns(2)

with col1:
    st.info("**Advanced Web Extraction**\n\nFirecrawl's FIRE-1 agent combined with the extract endpoint can intelligently navigate websites to extract structured data, even from complex layouts and dynamic content.")
    st.success("**Interactive Navigation**\n\nThe agent can interact with buttons, links, input fields, and other dynamic elements to access hidden information.")

with col2:
    st.warning("**Multi-page Processing**\n\nFIRE can handle pagination and multi-step processes, allowing it to gather comprehensive data across entire websites.")
    st.error("**Intelligent Data Structuring**\n\nThe agent automatically structures extracted information according to your specified schema, making it immediately usable.")

st.markdown("---")
st.markdown("### 🌐 Enter Website URLs")
st.markdown("Provide one or more company website URLs (one per line) to extract information.")
website_urls = st.text_area("Website URLs (one per line)", placeholder="https://example.com\nhttps://another-company.com")
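
The text area returns a single string, so the app has to split it into individual URLs before extraction. The app itself only strips whitespace and drops empty lines; the sketch below goes a little further and is an illustration rather than the tutorial's code. parse_urls is a hypothetical helper, and defaulting bare domains to https is an assumption you may not want.

```python
from urllib.parse import urlparse


def parse_urls(raw_text: str) -> list[str]:
    """Split text-area input into unique http(s) URLs, preserving order."""
    seen: set[str] = set()
    urls: list[str] = []
    for line in raw_text.splitlines():
        candidate = line.strip()
        if not candidate:
            continue
        # Assumption: a bare domain such as "example.com" should default to https.
        if "://" not in candidate:
            candidate = "https://" + candidate
        parsed = urlparse(candidate)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue  # skip anything that is not a usable web address
        if candidate not in seen:
            seen.add(candidate)
            urls.append(candidate)
    return urls


sample_input = "https://example.com\n\n  another-company.com \nftp://files.example.com\nhttps://example.com"
print(parse_urls(sample_input))
```

Filtering up front means each tab in the results view corresponds to a URL that is at least well formed, and duplicates do not trigger a second extraction call.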
  5. Define the extraction schema for structured data:

extraction_schema = {
    "type": "object",
    "properties": {
        "company_name": {
            "type": "string",
            "description": "The official name of the company or startup"
        },
        "company_description": {
            "type": "string",
            "description": "A description of what the company does and its value proposition"
        },
        "company_mission": {
            "type": "string",
            "description": "The company's mission statement or purpose"
        },
        "product_features": {
            "type": "array",
            "items": {
                "type": "string"
            },
            "description": "Key features or capabilities of the company's products/services"
        },
        "contact_phone": {
            "type": "string",
            "description": "Company's contact phone number if available"
        }
    },
    "required": ["company_name", "company_description", "product_features"]
}
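
Because the schema marks only three fields as required, it can be useful to check what actually came back before passing it to the analysis step. The following is a minimal sketch using only the standard library. find_schema_problems is a hypothetical helper that covers just the string and array types used in this schema; a full JSON Schema validator would be the more thorough choice.

```python
from typing import Any

# Trimmed copy of the tutorial's schema, kept short for the example.
schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "company_description": {"type": "string"},
        "product_features": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["company_name", "company_description", "product_features"],
}

# Maps the JSON Schema type names used above to Python types.
PYTHON_TYPES = {"string": str, "array": list}


def find_schema_problems(record: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the record looks fine."""
    problems = []
    for field in schema.get("required", []):
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    for field, spec in schema.get("properties", {}).items():
        expected = PYTHON_TYPES.get(spec.get("type"))
        value = record.get(field)
        if value is not None and expected and not isinstance(value, expected):
            problems.append(f"{field} should be of type {spec['type']}")
    return problems


record = {"company_name": "Acme", "product_features": "fast"}
print(find_schema_problems(record, schema))
```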
  6. Add custom CSS for a better UI:

st.markdown("""
<style>
.stButton button {
    background-color: #FF4B4B;
    color: white;
    font-weight: bold;
    border-radius: 10px;
    padding: 0.5rem 1rem;
    border: none;
    box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
    transition: all 0.3s ease;
}
.stButton button:hover {
    background-color: #FF2B2B;
    box-shadow: 0 6px 8px rgba(0, 0, 0, 0.15);
    transform: translateY(-2px);
}
.css-1r6slb0 {
    border-radius: 10px;
    box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
}
</style>
""", unsafe_allow_html=True)
  7. Implement the core extraction and analysis logic:

if st.button("🚀 Start Analysis", type="primary"):
    if not website_urls.strip():
        st.error("Please enter at least one website URL")
    else:
        try:
            with st.spinner("Extracting information from website..."):
                # Initialize the FirecrawlApp with the API key
                app = FirecrawlApp(api_key=firecrawl_api_key)
                
                # Parse the input URLs more robustly
                urls = [url.strip() for url in website_urls.split('\n') if url.strip()]
                
                # Debug: Show the parsed URLs
                st.info(f"Attempting to process these URLs: {urls}")
                
                if not urls:
                    st.error("No valid URLs found after parsing. Please check your input.")
                elif not openai_api_key:
                    st.warning("Please provide an OpenAI API key in the sidebar to get AI analysis.")
                else:
                    # Create tabs for each URL
                    tabs = st.tabs([f"Website {i+1}: {url}" for i, url in enumerate(urls)])
                    
                    # Initialize the Agno agent once (outside the loop)
                    if openai_api_key:
                        agno_agent = Agent(
                            model=OpenAIChat(id="gpt-4o", api_key=openai_api_key),
                            instructions="""You are an expert business analyst who provides concise, insightful summaries of companies.
                            
You will be given structured data about a company including its name, description, mission, and product features.

Your task is to analyze this information and provide a brief, compelling summary that highlights:

1. What makes this company unique or innovative
2. The core value proposition for customers
3. The potential market impact or growth opportunities

Keep your response under 150 words, be specific, and focus on actionable insights.""",
                            markdown=True
                        )
                    
                    # Process each URL one at a time
                    for i, (url, tab) in enumerate(zip(urls, tabs)):
                        with tab:
                            st.markdown(f"### 🔍 Analyzing: {url}")
                            st.markdown("<hr style='border: 2px solid #FF4B4B; border-radius: 5px;'>", unsafe_allow_html=True)
                            
                            with st.spinner(f"FIRE agent is extracting information from {url}..."):
                                try:
                                    # Extract data for this single URL
                                    data = app.extract(
                                        [url],  # Pass as a list with a single URL
                                        params={
                                            'prompt': '''
Analyze this company website thoroughly and extract comprehensive information.

1. Company Information:
- Identify the official company name
Explain: This is the legal name the company operates under.
- Extract a detailed yet concise description of what the company does
- Find the company's mission statement or purpose
Explain: What problem is the company trying to solve? How do they aim to make a difference?

2. Product/Service Information:
- Identify 3-5 specific product features or service offerings
Explain: What are the key things their product or service can do? Describe as if explaining to a non-expert.
- Focus on concrete capabilities rather than marketing claims
Explain: What does the product actually do, in simple terms, rather than how it's advertised?
- Be specific about what the product/service actually does
Explain: Give examples of how a customer might use this product or service in their daily life.

3. Contact Information:
- Find direct contact methods (phone numbers)
Explain: How can a potential customer reach out to speak with someone at the company?
- Only extract contact information that is explicitly provided
Explain: We're looking for official contact details, not inferring or guessing.

Important guidelines:
- Be thorough but concise in your descriptions
- Extract factual information, not marketing language
- If information is not available, do not make assumptions
- For each piece of information, provide a brief, simple explanation of what it means and why it's important
- Include a layman's explanation of what the company does, as if explaining to someone with no prior knowledge of the industry or technology involved
''',
                                            'schema': extraction_schema,
                                            'agent': {"model": "FIRE-1"}
                                        }
                                    )
                                    
                                    # Check if extraction was successful
                                    if data and data.get('data'):
                                        # Display extracted data
                                        st.subheader("📊 Extracted Information")
                                        company_data = data.get('data')
                                        
                                        # Display company name prominently
                                        if 'company_name' in company_data:
                                            st.markdown(f"## {company_data['company_name']}")
                                        
                                        # Display other extracted fields
                                        for key, value in company_data.items():
                                            if key == 'company_name':
                                                continue  # Already displayed above
                                            
                                            display_key = key.replace('_', ' ').capitalize()
                                            if value:  # Only display if there's a value
                                                if isinstance(value, list):
                                                    st.markdown(f"**{display_key}:**")
                                                    for item in value:
                                                        st.markdown(f"- {item}")
                                                elif isinstance(value, str):
                                                    st.markdown(f"**{display_key}:** {value}")
                                                elif isinstance(value, bool):
                                                    st.markdown(f"**{display_key}:** {str(value)}")
                                                else:
                                                    st.write(f"**{display_key}:**", value)
                                        
                                        # Process with Agno agent
                                        if openai_api_key:
                                            with st.spinner("Generating AI analysis..."):
                                                # Run the agent with the extracted data
                                                agent_response = agno_agent.run(f"Analyze this company data and provide insights: {json.dumps(company_data)}")
                                                
                                                # Display the agent's analysis in a highlighted box
                                                st.subheader("🧠 AI Business Analysis")
                                                st.markdown(agent_response.content)
                                        
                                        # Show raw data in expander
                                        with st.expander("🔍 View Raw API Response"):
                                            st.json(data)
                                        
                                        # Add processing details
                                        with st.expander("ℹ️ Processing Details"):
                                            st.markdown("**FIRE Agent Actions:**")
                                            st.markdown("- 🔍 Scanned website content and structure")
                                            st.markdown("- 🖱️ Interacted with necessary page elements")
                                            st.markdown("- 📊 Extracted and structured data according to schema")
                                            st.markdown("- 🧠 Applied AI reasoning to identify relevant information")
                                            
                                            if 'status' in data:
                                                st.markdown(f"**Status:** {data['status']}")
                                            if 'expiresAt' in data:
                                                st.markdown(f"**Data Expires:** {data['expiresAt']}")
                                    else:
                                        st.error(f"No data was extracted from {url}. The website might be inaccessible, or the content structure may not match the expected format.")
                                
                                except Exception as e:
                                    st.error(f"Error processing {url}: {str(e)}")
        except Exception as e:
            st.error(f"Error during extraction: {str(e)}")
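
The display loop in the code above sits many indentation levels deep, which makes it hard to test. One possible refactor, offered as a sketch rather than the author's implementation, is to move the formatting into a pure function that returns Markdown lines. format_company_fields is a hypothetical name; it applies the same label formatting as the loop above but treats every non-list value the same way.

```python
from typing import Any


def format_company_fields(company_data: dict[str, Any]) -> list[str]:
    """Turn extracted fields into Markdown lines, skipping empty values."""
    lines: list[str] = []
    for key, value in company_data.items():
        if key == "company_name" or not value:
            continue  # the name is shown as a heading; empty fields are omitted
        label = key.replace("_", " ").capitalize()
        if isinstance(value, list):
            lines.append(f"**{label}:**")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"**{label}:** {value}")
    return lines


sample = {
    "company_name": "Acme",
    "company_description": "Builds widgets",
    "company_mission": "",
    "product_features": ["Fast", "Cheap"],
}
for line in format_company_fields(sample):
    print(line)
```

Inside the tab, the app could then call st.markdown(line) for each returned line, which keeps the Streamlit code short and lets the formatting be checked without running the UI.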

Running the App

With our code in place, it's time to launch the app.

  • In your terminal, navigate to the project folder and run the following command:

streamlit run ai_startup_insight_fire1_agent.py
  • Streamlit will provide a local URL (typically http://localhost:8501).

  • To use the app:

    1. Enter your API keys in the sidebar:

      • Firecrawl API key

      • OpenAI API key

    2. In the main area, enter company website URLs (one per line). Try these examples:

    3. Click "🚀 Start Analysis" to begin the extraction and analysis process.

    4. The FIRE-1 agent will process each URL, extracting structured information according to our schema.

    5. The app will display the extracted data and AI-generated business analysis in the tabbed interface.

Conclusion

You've successfully built a powerful AI Startup Insight tool that combines advanced web extraction with business intelligence. This application demonstrates how AI agents can automate complex workflows that would typically require significant custom development.

To enhance this project further, consider:

  1. Scheduled Data Collection: Add functionality to periodically scan competitor websites for changes.

  2. Data Comparison Tools: Implement visual comparisons between multiple startups in the same space.

  3. Export Capabilities: Add options to export the structured data to CSV, Excel, or directly to a CRM.

  4. Historical Tracking: Store extracted data in a database to track how company information changes over time.
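
As a starting point for the export idea, here is a minimal sketch that serializes extracted records to CSV with the standard library. records_to_csv is a hypothetical helper, the column list follows the extraction schema defined earlier, and joining product features with "; " is one assumption about how to fit a list into a single cell.

```python
import csv
import io
from typing import Any

# Column order follows the fields of the tutorial's extraction schema.
FIELDS = ["company_name", "company_description", "company_mission", "product_features", "contact_phone"]


def records_to_csv(records: list[dict[str, Any]]) -> str:
    """Serialize extracted company records to CSV text, one row per company."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = {field: record.get(field) or "" for field in FIELDS}
        features = record.get("product_features") or []
        if isinstance(features, str):
            features = [features]  # tolerate a single string instead of a list
        # A list does not fit in one CSV cell, so join the features with "; ".
        row["product_features"] = "; ".join(features)
        writer.writerow(row)
    return buffer.getvalue()


csv_text = records_to_csv([
    {"company_name": "Acme", "company_description": "Builds widgets", "product_features": ["Fast", "Cheap"]},
])
print(csv_text)
```

In the Streamlit app, the resulting text could be offered through st.download_button("Download CSV", data=csv_text, file_name="startups.csv", mime="text/csv").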

Keep experimenting with different configurations and features to build more sophisticated AI applications.

We share hands-on tutorials like this 2-3 times a week, to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills and staying ahead of the curve, subscribe now and be the first to access our latest tutorials.
