Build an AI Startup Insight Agent with FIRE-1
Fully functional agent app with step-by-step instructions (100% open source)
When working with web data, we repeatedly face the challenge of extracting structured information from dynamic, modern websites. Traditional scraping methods often break on JavaScript-heavy interfaces, login requirements, and interactive elements, leading to brittle solutions that require constant maintenance.
In this tutorial, we're building an AI Startup Insight application that uses Firecrawl's FIRE-1 agent for robust web extraction. FIRE-1 is an AI agent that can autonomously perform browser actions - clicking buttons, filling forms, navigating pagination, and interacting with dynamic content - while understanding the semantic context of what it's extracting. We'll combine this with OpenAI's GPT-4o to create a complete pipeline from data extraction to analysis in a clean Streamlit interface, and we'll use the Agno framework to build our AI startup insight agent.
The FIRE-1 agent solves a key developer pain point: instead of writing custom selectors and JavaScript handlers for each website, you can simply define the data schema you want and provide natural language instructions. The agent handles the complexities of web navigation and extraction, dramatically reducing development time and maintenance overhead.
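To make that concrete before we build the full app, here's a minimal sketch of a FIRE-1 extraction call. It mirrors the extract call we'll write later in the tutorial; the API key, URL, and schema fields below are placeholders, not values from the final app:
from firecrawl import FirecrawlApp

# Placeholder key and URL - for illustration only
app = FirecrawlApp(api_key="fc-your-api-key")

schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "product_features": {"type": "array", "items": {"type": "string"}}
    },
    "required": ["company_name"]
}

result = app.extract(
    ["https://example.com"],
    params={
        "prompt": "Extract the company name and its key product features.",
        "schema": schema,
        "agent": {"model": "FIRE-1"}  # enables the FIRE-1 browsing agent
    }
)
print(result.get("data"))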
What We're Building
An advanced web extraction and analysis tool built using Firecrawl's FIRE-1 agent + extract v1 endpoint and the Agno Agent framework to get details of a new startup instantly! This application automatically extracts structured data from startup websites and provides AI-powered business analysis, making it easy to gather insights about companies without manual research.
Features
🔍 Intelligent Web Extraction:
Extract structured data from any company website
Automatically identify company information, mission, and product features
Process multiple websites in sequence
🌐 Advanced Web Navigation:
Interact with buttons, links, and dynamic elements
Handle pagination and multi-step processes
Access information across multiple pages
🧠 AI Business Analysis:
Generate insightful summaries of extracted company data
Identify unique value propositions and market opportunities
Provide actionable business intelligence
📊 Structured Data Output:
Organize information in a consistent JSON schema
Extract company name, description, mission, and product features
Standardize output for further processing
🎯 Interactive UI:
User-friendly Streamlit interface
Process multiple URLs in a single run
Clear presentation of extracted data and analysis
How The App Works
The application workflow follows these steps (a compact code sketch follows the list):
URL Collection: The user enters one or more company website URLs in the text area.
FIRE-1 Extraction: When the user clicks "Start Analysis," Firecrawl's FIRE-1 agent is deployed with:
A detailed prompt that guides the agent on what information to extract
A JSON schema that defines the structure of the output data
Parameters that configure the agent's behavior
Intelligent Navigation: Unlike traditional scrapers that only read static HTML, the FIRE-1 agent:
Actively clicks buttons and interacts with dynamic elements
Navigates through multiple pages when necessary
Uses AI reasoning to identify the most relevant content
Data Structuring: The agent automatically converts unstructured web content into a JSON object according to our schema.
AI Analysis: The Agno Agent (powered by GPT-4o) processes the structured data to generate business insights.
Data Presentation: Results are displayed in a tabbed interface for easy comparison of multiple companies.
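In code terms, the heart of the app boils down to a loop like the sketch below. The helpers fire1_extract, analyst_agent, and render are placeholders for the real calls shown in the walkthrough, not actual functions:
# Conceptual sketch only - placeholder helpers, not the final app code
for url in urls:
    company_data = fire1_extract(url, extraction_schema, prompt)   # FIRE-1 browses the site, returns structured JSON
    insights = analyst_agent.run(f"Analyze this company data: {company_data}")   # GPT-4o via Agno generates analysis
    render(url, company_data, insights)   # one Streamlit tab per URL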
Prerequisites
Before we begin, make sure you have the following:
Python installed on your machine (version 3.10 or higher is recommended)
Your OpenAI and Firecrawl API keys
A code editor of your choice (we recommend VS Code or PyCharm for their excellent Python support)
Basic familiarity with Python programming
Code Walkthrough
Setting Up the Environment
First, let's get our development environment ready:
Clone the GitHub repository:
git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
Go to the ai_startup_insight_fire1_agent folder:
cd advanced_ai_agents/single_agent_apps/ai_startup_insight_fire1_agent
Install the required dependencies:
pip install -r requirements.txt
Grab your API Keys:
Get Firecrawl API key from Firecrawl
Get OpenAI API key from OpenAI Platform
Creating the Streamlit App
Let's create our app. Create a new file ai_startup_insight_fire1_agent.py and add the following code:
Import necessary libraries:
from firecrawl import FirecrawlApp
import streamlit as st
import os
import json
from agno.agent import Agent
from agno.models.openai import OpenAIChat
Set up the Streamlit page configuration:
st.set_page_config(
    page_title="Startup Info Extraction",
    page_icon="🔍",
    layout="wide"
)
st.title("AI Startup Insight with Firecrawl's FIRE-1 Agent")
Create a sidebar for API key configuration:
with st.sidebar:
    st.header("API Configuration")
    firecrawl_api_key = st.text_input("Firecrawl API Key", type="password")
    openai_api_key = st.text_input("OpenAI API Key", type="password")
    st.caption("Your API keys are securely stored and not shared.")

    st.markdown("---")
    st.markdown("### About")
    st.markdown("This tool extracts company information from websites using Firecrawl's FIRE-1 agent and provides AI-powered business analysis.")
    st.markdown("### How It Works")
    st.markdown("1. 🔍 **FIRE-1 Agent** extracts structured data from websites")
    st.markdown("2. 🧠 **Agno Agent** analyzes the data for business insights")
    st.markdown("3. 📊 **Results** are presented in an organized format")
Create the main content area with information about Firecrawl capabilities:
st.markdown("## 🔥 Firecrawl FIRE-1 Agent Capabilities")
col1, col2 = st.columns(2)

with col1:
    st.info("**Advanced Web Extraction**\n\nFirecrawl's FIRE-1 agent combined with the extract endpoint can intelligently navigate websites to extract structured data, even from complex layouts and dynamic content.")
    st.success("**Interactive Navigation**\n\nThe agent can interact with buttons, links, input fields, and other dynamic elements to access hidden information.")

with col2:
    st.warning("**Multi-page Processing**\n\nFIRE-1 can handle pagination and multi-step processes, allowing it to gather comprehensive data across entire websites.")
    st.error("**Intelligent Data Structuring**\n\nThe agent automatically structures extracted information according to your specified schema, making it immediately usable.")

st.markdown("---")
st.markdown("### 🌐 Enter Website URLs")
st.markdown("Provide one or more company website URLs (one per line) to extract information.")
website_urls = st.text_area("Website URLs (one per line)", placeholder="https://example.com\nhttps://another-company.com")
Define the extraction schema for structured data:
extraction_schema = {
    "type": "object",
    "properties": {
        "company_name": {
            "type": "string",
            "description": "The official name of the company or startup"
        },
        "company_description": {
            "type": "string",
            "description": "A description of what the company does and its value proposition"
        },
        "company_mission": {
            "type": "string",
            "description": "The company's mission statement or purpose"
        },
        "product_features": {
            "type": "array",
            "items": {
                "type": "string"
            },
            "description": "Key features or capabilities of the company's products/services"
        },
        "contact_phone": {
            "type": "string",
            "description": "Company's contact phone number if available"
        }
    },
    "required": ["company_name", "company_description", "product_features"]
}
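To make the schema concrete, this is the kind of object the agent is expected to return for a single site; every value below is invented for illustration:
# Illustrative example only - all values are made up
example_company_data = {
    "company_name": "Acme Robotics",
    "company_description": "Builds affordable warehouse automation robots for mid-size retailers.",
    "company_mission": "Make warehouse automation accessible to every retailer.",
    "product_features": [
        "Autonomous shelf-picking robots",
        "Real-time fleet monitoring dashboard",
        "REST API for order system integration"
    ],
    "contact_phone": "+1-555-0100"
}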
Add custom CSS for a better UI:
st.markdown("""
<style>
    .stButton button {
        background-color: #FF4B4B;
        color: white;
        font-weight: bold;
        border-radius: 10px;
        padding: 0.5rem 1rem;
        border: none;
        box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
        transition: all 0.3s ease;
    }
    .stButton button:hover {
        background-color: #FF2B2B;
        box-shadow: 0 6px 8px rgba(0, 0, 0, 0.15);
        transform: translateY(-2px);
    }
    .css-1r6slb0 {
        border-radius: 10px;
        box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
    }
</style>
""", unsafe_allow_html=True)
Implement the core extraction and analysis logic:
if st.button("🚀 Start Analysis", type="primary"):
    if not website_urls.strip():
        st.error("Please enter at least one website URL")
    else:
        try:
            with st.spinner("Extracting information from website..."):
                # Initialize the FirecrawlApp with the API key
                app = FirecrawlApp(api_key=firecrawl_api_key)

                # Parse the input URLs more robustly
                urls = [url.strip() for url in website_urls.split('\n') if url.strip()]

                # Debug: Show the parsed URLs
                st.info(f"Attempting to process these URLs: {urls}")

                if not urls:
                    st.error("No valid URLs found after parsing. Please check your input.")
                elif not openai_api_key:
                    st.warning("Please provide an OpenAI API key in the sidebar to get AI analysis.")
                else:
                    # Create tabs for each URL
                    tabs = st.tabs([f"Website {i+1}: {url}" for i, url in enumerate(urls)])

                    # Initialize the Agno agent once (outside the loop)
                    if openai_api_key:
                        agno_agent = Agent(
                            model=OpenAIChat(id="gpt-4o", api_key=openai_api_key),
                            instructions="""You are an expert business analyst who provides concise, insightful summaries of companies.
                            You will be given structured data about a company including its name, description, mission, and product features.
                            Your task is to analyze this information and provide a brief, compelling summary that highlights:
                            1. What makes this company unique or innovative
                            2. The core value proposition for customers
                            3. The potential market impact or growth opportunities
                            Keep your response under 150 words, be specific, and focus on actionable insights.""",
                            markdown=True
                        )

                    # Process each URL one at a time
                    for i, (url, tab) in enumerate(zip(urls, tabs)):
                        with tab:
                            st.markdown(f"### 🔍 Analyzing: {url}")
                            st.markdown("<hr style='border: 2px solid #FF4B4B; border-radius: 5px;'>", unsafe_allow_html=True)

                            with st.spinner(f"FIRE agent is extracting information from {url}..."):
                                try:
                                    # Extract data for this single URL
                                    data = app.extract(
                                        [url],  # Pass as a list with a single URL
                                        params={
                                            'prompt': '''
                                            Analyze this company website thoroughly and extract comprehensive information.
                                            1. Company Information:
                                            - Identify the official company name
                                              Explain: This is the legal name the company operates under.
                                            - Extract a detailed yet concise description of what the company does
                                            - Find the company's mission statement or purpose
                                              Explain: What problem is the company trying to solve? How do they aim to make a difference?
                                            2. Product/Service Information:
                                            - Identify 3-5 specific product features or service offerings
                                              Explain: What are the key things their product or service can do? Describe as if explaining to a non-expert.
                                            - Focus on concrete capabilities rather than marketing claims
                                              Explain: What does the product actually do, in simple terms, rather than how it's advertised?
                                            - Be specific about what the product/service actually does
                                              Explain: Give examples of how a customer might use this product or service in their daily life.
                                            3. Contact Information:
                                            - Find direct contact methods (phone numbers)
                                              Explain: How can a potential customer reach out to speak with someone at the company?
                                            - Only extract contact information that is explicitly provided
                                              Explain: We're looking for official contact details, not inferring or guessing.
                                            Important guidelines:
                                            - Be thorough but concise in your descriptions
                                            - Extract factual information, not marketing language
                                            - If information is not available, do not make assumptions
                                            - For each piece of information, provide a brief, simple explanation of what it means and why it's important
                                            - Include a layman's explanation of what the company does, as if explaining to someone with no prior knowledge of the industry or technology involved
                                            ''',
                                            'schema': extraction_schema,
                                            'agent': {"model": "FIRE-1"}
                                        }
                                    )

                                    # Check if extraction was successful
                                    if data and data.get('data'):
                                        # Display extracted data
                                        st.subheader("📋 Extracted Information")
                                        company_data = data.get('data')

                                        # Display company name prominently
                                        if 'company_name' in company_data:
                                            st.markdown(f"## {company_data['company_name']}")

                                        # Display other extracted fields
                                        for key, value in company_data.items():
                                            if key == 'company_name':
                                                continue  # Already displayed above
                                            display_key = key.replace('_', ' ').capitalize()
                                            if value:  # Only display if there's a value
                                                if isinstance(value, list):
                                                    st.markdown(f"**{display_key}:**")
                                                    for item in value:
                                                        st.markdown(f"- {item}")
                                                elif isinstance(value, str):
                                                    st.markdown(f"**{display_key}:** {value}")
                                                elif isinstance(value, bool):
                                                    st.markdown(f"**{display_key}:** {str(value)}")
                                                else:
                                                    st.write(f"**{display_key}:**", value)

                                        # Process with Agno agent
                                        if openai_api_key:
                                            with st.spinner("Generating AI analysis..."):
                                                # Run the agent with the extracted data
                                                agent_response = agno_agent.run(f"Analyze this company data and provide insights: {json.dumps(company_data)}")

                                                # Display the agent's analysis in a highlighted box
                                                st.subheader("🧠 AI Business Analysis")
                                                st.markdown(agent_response.content)

                                        # Show raw data in expander
                                        with st.expander("🔍 View Raw API Response"):
                                            st.json(data)

                                        # Add processing details
                                        with st.expander("ℹ️ Processing Details"):
                                            st.markdown("**FIRE Agent Actions:**")
                                            st.markdown("- 🔍 Scanned website content and structure")
                                            st.markdown("- 🖱️ Interacted with necessary page elements")
                                            st.markdown("- 📊 Extracted and structured data according to schema")
                                            st.markdown("- 🧠 Applied AI reasoning to identify relevant information")
                                            if 'status' in data:
                                                st.markdown(f"**Status:** {data['status']}")
                                            if 'expiresAt' in data:
                                                st.markdown(f"**Data Expires:** {data['expiresAt']}")
                                    else:
                                        st.error(f"No data was extracted from {url}. The website might be inaccessible, or the content structure may not match the expected format.")

                                except Exception as e:
                                    st.error(f"Error processing {url}: {str(e)}")
        except Exception as e:
            st.error(f"Error during extraction: {str(e)}")
Running the App
With our code in place, it's time to launch the app.
In your terminal, navigate to the project folder and run the following command:
streamlit run ai_startup_insight_fire1_agent.py
Streamlit will provide a local URL (typically http://localhost:8501).
To use the app:
Enter your API keys in the sidebar:
Firecrawl API key
OpenAI API key
In the main area, enter company website URLs (one per line).
Click "🚀 Start Analysis" to begin the extraction and analysis process.
The FIRE-1 agent will process each URL, extracting structured information according to our schema.
The app will display the extracted data and AI-generated business analysis in the tabbed interface.
Working Application Demo
Conclusion
You've successfully built a powerful AI Startup Insight tool that combines advanced web extraction with business intelligence. This application demonstrates how AI agents can automate complex workflows that would typically require significant custom development.
To enhance this project further, consider:
Scheduled Data Collection: Add functionality to periodically scan competitor websites for changes.
Data Comparison Tools: Implement visual comparisons between multiple startups in the same space.
Export Capabilities: Add options to export the structured data to CSV, Excel, or directly to a CRM (a CSV export sketch follows this list).
Historical Tracking: Store extracted data in a database to track how company information changes over time.
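As a starting point for the export idea mentioned above, a sketch like this would flatten the extracted records into a CSV file. It assumes you collect each company_data dict into a list while processing; the function and variable names are illustrative, not part of the app above:
import csv

def export_to_csv(records, path="startup_insights.csv"):
    # 'records' is assumed to be a list of company_data dicts collected during extraction
    fields = ["company_name", "company_description", "company_mission",
              "product_features", "contact_phone"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        for rec in records:
            row = {key: rec.get(key, "") for key in fields}
            # Join the feature list into a single cell
            if isinstance(row["product_features"], list):
                row["product_features"] = "; ".join(row["product_features"])
            writer.writerow(row)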
Keep experimenting with different configurations and features to build more sophisticated AI applications.
We share hands-on tutorials like this two to three times a week to help you stay ahead in the world of AI. If you're serious about leveling up your AI skills, subscribe now and be the first to access our latest tutorials.