
Financial Intelligence Chatbot

A robust, scalable financial chatbot capable of processing structured and unstructured financial documents. The system extracts key insights, summarizes content, and answers user queries based on uploaded data.

Agentic AI Architecture

This implementation uses an agentic AI architecture powered by Google's Gemini models. The system consists of:


  1. Core Agent: A central coordinator that manages:

    • Intent detection
    • Tool selection
    • Response generation
    • Language detection and translation
  2. Tool Registry System: Modular tools that can be dynamically invoked:

    • Tools register capabilities and schemas
    • The agent selects appropriate tools based on user intent
    • Loose coupling allows for easy extension
  3. LLM Integration: Google Gemini integration with:

    • Intent detection
    • Structured output generation
    • Response generation
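
A minimal sketch of this coordination loop is shown below (illustrative names only, not the repository's actual FinancialAgent API; the llm object is assumed to expose a generate(prompt) method):

class AgentSketch:
    """Toy coordinator: detect intent, pick a registered tool, compose a reply."""

    def __init__(self, llm, tools):
        self.llm = llm        # hypothetical wrapper exposing .generate(prompt) -> str
        self.tools = tools    # mapping of intent name -> tool instance

    def handle(self, query: str) -> str:
        # 1. Intent detection via the LLM
        intent = self.llm.generate(f"Classify the intent of this query: {query}").strip()
        # 2. Tool selection based on the detected intent
        tool = self.tools.get(intent)
        # 3. Tool execution (skipped if no tool matches)
        tool_output = tool.execute(query) if tool else {}
        # 4. Response generation grounded in the tool output
        return self.llm.generate(
            f"Answer the query '{query}' using these results: {tool_output}"
        )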

Features

  • Process multiple file formats (CSV, Excel, PDF, DOCX)
  • Extract financial data and insights
  • Analyze trends and metrics
  • Support for multiple languages
  • Real-time chat interface
  • Document history management
  • Web search and online content analysis
  • Advanced data visualization capabilities
  • Export functionality for session data
  • Robust database with local fallback storage
  • Comprehensive history tracking
  • Third-party charting library integration

Project Structure

FinWise/
├── keys/              # API keys and credentials
├── src/
│   ├── agents/
│   │   └── financial_agent.py  # Core agent coordinator
│   ├── tools/
│   │   ├── file_processor.py   # File processing tool
│   │   ├── financial_analysis_tool.py  # Financial analysis tool
│   │   ├── text_summarization.py  # Text summarization tool
│   │   ├── language_tool.py   # Language detection/translation tool
│   │   ├── web_search_tool.py # Web content extraction tool
│   │   ├── search_api_tool.py # Search API integration
│   │   ├── data_visualization_tool.py # Visualization tool
│   │   ├── csv_analyzer_tool.py # CSV analysis tool
│   │   ├── dynamic_visualization_tool.py # Advanced visualization
│   │   └── tool_registry.py   # Tool registry system
│   ├── llm/
│   │   ├── llm_manager.py     # LLM provider manager
│   │   ├── gemini_integration.py  # Google Gemini integration
│   │   └── groq_integration.py  # Groq LLM integration
│   ├── db/
│   │   └── db_connection.py   # Database connection manager
│   ├── config/
│   │   └── config.py  # Configuration settings
│   ├── models/        # Data models
│   ├── utils/         # Utility functions
│   └── routes/        # API routes
├── docs/
│   ├── screenshots/           # Application screenshots
│   └── diagrams/             # System architecture diagrams
├── uploads/           # Directory for uploaded files
├── local_storage/     # Local storage for database fallback
├── main.py            # Streamlit application
├── system_architecture.md # Architecture diagrams 
├── visualization/ # Visualization capabilities documentation
├── evaluation_report.md # System performance and evaluation report
├── debug_tools.py     # Debugging and tool validation utilities
├── requirements.txt   # Dependencies
└── .env               # Environment variables

Technology Stack

  • Language: Python 3.8+
  • LLM: Google Gemini Pro, Groq
  • Frontend: Streamlit
  • Database: MongoDB with local storage fallback
  • Data Processing (see the dispatch sketch after this list):
    • Pandas for tabular data
    • PyPDF for PDF parsing
    • python-docx for DOCX processing
  • Data Visualization:
    • Matplotlib
    • Plotly
    • Altair
    • Seaborn
  • Web Search:
    • SerpAPI integration
    • BeautifulSoup for web scraping
  • Libraries:
    • langdetect for language detection
    • streamlit for web UI
    • langchain for LLM integrations
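
To illustrate how the data-processing libraries above fit together, a file-type dispatch might look like the following. This is a sketch assuming pandas, pypdf, and python-docx are installed; the repository's file_processor.py may differ:

from pathlib import Path

import pandas as pd
from docx import Document        # python-docx
from pypdf import PdfReader

def extract_content(path: str):
    """Return a DataFrame for tabular files or plain text for documents."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xls", ".xlsx"}:
        return pd.read_excel(path)
    if suffix == ".pdf":
        return "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)
    if suffix == ".docx":
        return "\n".join(p.text for p in Document(path).paragraphs)
    raise ValueError(f"Unsupported file type: {suffix}")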

Installation

  1. Clone the repository
git clone https://github.com/yourusername/financial-intelligence-chatbot.git
cd financial-intelligence-chatbot
  2. Create and activate a virtual environment (recommended)
python -m venv venv
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
  3. Install the required dependencies:
pip install -r requirements.txt
  4. Create a .env file with the following variables:
# LLM API Keys
GEMINI_API_KEY=your_gemini_api_key_here
GROQ_API_KEY=your_groq_api_key_here

# SerpAPI for web search
SERPAPI_KEY=your_serpapi_key_here

# MongoDB connection (optional)
MONGODB_CONNECTION_STRING=your_mongodb_connection_string
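
A config module could load these variables with python-dotenv, for example (a sketch; the repository's src/config/config.py may read them differently):

import os

from dotenv import load_dotenv  # python-dotenv

load_dotenv()  # reads the .env file from the project root

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
SERPAPI_KEY = os.getenv("SERPAPI_KEY")
MONGODB_CONNECTION_STRING = os.getenv("MONGODB_CONNECTION_STRING")  # optional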

✅ Steps to Get JSON Key from Google Cloud Console

  1. Go to Google Cloud Console
    Open: https://console.cloud.google.com/

  2. Select your project
    At the top, click the project dropdown and choose the project you're working with (finbot in your case).

  3. Enable the required API (if not already enabled):

    • For Gen Language models or Vertex AI:
      Search for and enable Vertex AI API or Generative Language API.
  4. Go to IAM & Admin > Service Accounts

    • Navigation Menu (☰) → IAM & Admin → Service Accounts
  5. Create a New Service Account (or select an existing one)

    • Click “Create Service Account”
    • Give it a name like gen-lang-client
    • Click “Create and Continue”
  6. Assign Role(s)

    • Choose appropriate roles. For Gen AI use, select:
      Vertex AI User or Generative Language API User
    • Click “Continue” and then “Done”
  7. Generate the Key File (JSON)

    • After creation, click the service account name
    • Go to the “Keys” tab
    • Click “Add Key” → “Create new key”
    • Select “JSON” → Click “Create”
    • ✅ The .json key file will automatically download.
  8. Move or Save the Key File to Your Path

    • Save it to:
      C:\your\path\keys\
      (create the folders if they don’t exist)

Once you’ve downloaded the JSON key file, you need to set an environment variable so your code can authenticate using that key.


✅ Terminal Command to Use the Key (Linux/macOS/WSL)

export GOOGLE_APPLICATION_CREDENTIALS="C:/path/to/project/finbot/keys/gen-lang-client-0049803850-db4864d6249d.json"

✅ Windows Command Prompt (cmd)

set GOOGLE_APPLICATION_CREDENTIALS=C:\path\to\project\finbot\keys\gen-lang-client-0049803850-db4864d6249d.json

✅ Windows PowerShell

$env:GOOGLE_APPLICATION_CREDENTIALS="C:\path\to\project\finbot\keys\gen-lang-client-0049803850-db4864d6249d.json"
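
If you prefer to set the variable from Python rather than the shell, you can do so at the top of your entry point, before any Google client is created (the filename below is the example key from this guide):

import os

# Must run before the Gemini/Vertex AI client is initialized.
os.environ.setdefault(
    "GOOGLE_APPLICATION_CREDENTIALS",
    r"C:\path\to\project\finbot\keys\gen-lang-client-0049803850-db4864d6249d.json",
)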

⚠️ Important Security Note:

Never share this key publicly. It gives access to your Google Cloud resources. Treat it like a password.

Database Setup

The application supports two database options:

Option 1: MongoDB (Recommended for Production)

  1. Set up a MongoDB database (local or cloud-based like MongoDB Atlas)
  2. Add your connection string to the .env file:
    MONGODB_CONNECTION_STRING=mongodb+srv://username:password@cluster.mongodb.net/financial_chatbot
    

Option 2: Local File Storage (Default)

  • If no MongoDB connection is provided, the system automatically falls back to local file storage
  • Local storage files are saved in the local_storage/ directory
  • This option works out of the box with no additional setup
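
The fallback decision can be as simple as a guarded connection attempt, roughly like this (a sketch assuming pymongo; the repository's db_connection.py may differ):

from pathlib import Path
from typing import Optional

from pymongo import MongoClient
from pymongo.errors import PyMongoError

def connect(connection_string: Optional[str]):
    """Return a MongoDB database handle, or None to signal local-file fallback."""
    if connection_string:
        try:
            client = MongoClient(connection_string, serverSelectionTimeoutMS=3000)
            client.admin.command("ping")  # fail fast if the cluster is unreachable
            return client["financial_chatbot"]
        except PyMongoError:
            pass
    Path("local_storage").mkdir(exist_ok=True)
    return None  # caller persists JSON files under local_storage/ instead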

API Key Management

Multiple API Keys

You can configure multiple API keys for the same LLM provider to enable key rotation and load balancing. To add multiple keys:

  1. In your .env file, add multiple keys using comma separation:

    OPENAI_API_KEY=key1,key2,key3
    GEMINI_API_KEY=key1,key2
    
  2. The system will automatically rotate between these keys to:

    • Distribute API calls across multiple keys
    • Avoid rate limiting issues
    • Provide failover if one key encounters errors

API Key Rotation

The application implements an intelligent key rotation strategy:

  • Keys are used in sequence, rotating to the next key after each API call
  • If a key encounters an error (like rate limiting), it's temporarily skipped
  • The system tracks usage metrics for each key to optimize distribution
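
In outline, such a rotation scheme can be a small round-robin helper (a sketch, not the exact logic used in the repository's llm_manager.py):

import itertools
import os

# Comma-separated keys from .env, e.g. GEMINI_API_KEY=key1,key2
KEYS = [k.strip() for k in os.getenv("GEMINI_API_KEY", "").split(",") if k.strip()]
_cycle = itertools.cycle(KEYS)

def next_key(skip=frozenset()):
    """Return the next key in sequence, skipping any marked as rate-limited."""
    for _ in range(len(KEYS)):
        key = next(_cycle)
        if key not in skip:
            return key
    raise RuntimeError("All configured API keys are currently unavailable")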

Running the Application

Start the Streamlit app:

streamlit run main.py

The application will be available at http://localhost:8501

Command-line Options

You can customize the application startup with these options:

streamlit run main.py -- --port 8502 --no-db-connection --mock-llm

Available options:

  • --port: Specify a custom port (default: 8501)
  • --no-db-connection: Run without attempting database connection
  • --mock-llm: Use mock LLM responses for testing without API costs
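
Streamlit forwards everything after the bare -- to the script's own sys.argv, so main.py can read these flags with argparse, roughly as follows (a sketch of how such options might be parsed):

import argparse

def parse_cli_args():
    """Parse options passed after `--` on the streamlit command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int, default=8501)
    parser.add_argument("--no-db-connection", action="store_true")
    parser.add_argument("--mock-llm", action="store_true")
    # parse_known_args tolerates any extra arguments Streamlit itself injects
    args, _unknown = parser.parse_known_args()
    return args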

How It Works

  1. User Query Processing:

    • The system analyzes the user's query to determine intent
    • It selects the appropriate tools based on the detected intent
    • The agent coordinates tool execution
  2. Document Processing:

    • Files are uploaded through the Streamlit interface
    • The file processor tool extracts content based on file type
    • Different processing strategies are applied for different formats
  3. Analysis & Insights:

    • The financial analysis tool extracts trends, metrics, and insights
    • Results are formatted for easy understanding
  4. Response Generation:

    • The agent generates a natural language response using Gemini
    • Responses incorporate insights from the tool executions
  5. Multilingual Support:

    • Language is detected automatically
    • Users can select their preferred language
    • Responses are translated to the selected language
  6. Web Research (see the extraction sketch after this list):

    • Extract content from URLs mentioned in queries
    • Perform web searches for relevant financial information
    • Summarize online content and provide source citations
  7. Data Visualization:

    • Analyze data to determine appropriate visualization types
    • Generate charts based on financial data patterns
    • Support multiple visualization libraries for different chart types
    • Save visualizations for future reference
  8. Export & History Management:

    • Track session history including chat, documents, and searches
    • Export data in multiple formats (JSON, CSV)
    • Generate specialized reports for different aspects of analysis
    • Access document processing history for audit purposes
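
As an illustration of the Web Research step above, URL content extraction with requests and BeautifulSoup can be sketched as follows (simplified relative to the repository's web_search_tool.py):

import requests
from bs4 import BeautifulSoup

def extract_page_text(url: str, max_chars: int = 5000) -> str:
    """Fetch a URL and return its visible text, truncated for the LLM context."""
    response = requests.get(url, timeout=10, headers={"User-Agent": "Mozilla/5.0"})
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()
    text = " ".join(soup.get_text(separator=" ").split())
    return text[:max_chars]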

UI Features and Navigation

The application interface is designed for intuitive navigation:

Main Areas

  • Chat Interface: Central area for conversation with the AI
  • Sidebar: Contains settings, document management, and history
  • Fixed Chat Input: Always accessible at the bottom of the screen

Sidebar Sections

  1. Chat Settings: Language selection and session management
  2. Documents: Upload and manage financial documents
  3. Visualizations: View saved charts and graphs
  4. History: Browse conversation history
  5. Document History: Track document processing actions
  6. Search History: View past web searches
  7. Export Data: Export session data in various formats
  8. About: Information about the application

Export Features

The application provides comprehensive export capabilities:

  1. Export Formats:

    • JSON: Complete data export with all details
    • CSV: Spreadsheet-friendly format for easy analysis
    • Full Archive: Complete session backup
  2. Report Types:

    • Chat History: Export conversations with timestamps
    • Document Summary: Summary of all processed documents
    • Visualization History: Export of charts and analysis
    • Activity Log: Complete audit trail of system usage
  3. Export Process:

    1. Navigate to the "Export Data" tab in the sidebar
    2. Select your preferred export format
    3. Choose whether to include visualizations
    4. Click "Export Current Session" or select a specific report
    5. Download the generated file

Advanced Visualization

The system supports multiple charting libraries for different visualization needs:

  1. Built-in Libraries:

    • Matplotlib: Standard static charts
    • Plotly: Interactive visualizations
    • Altair: Declarative charts
    • Seaborn: Statistical visualizations
  2. Chart Types:

    • Line charts for trend analysis
    • Bar charts for comparisons
    • Scatter plots for correlation analysis
    • Pie charts for composition analysis
    • Heatmaps for complex data patterns
    • Box plots for distribution analysis
  3. Dynamic Chart Selection:

    • The system automatically determines the most appropriate chart type based on data characteristics
    • Users can specify preferred chart types in their queries
    • Charts adapt to data size and dimensionality
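
A chart-selection heuristic of this kind can be sketched in a few lines (illustrative only, using Plotly from the stack above; the repository's dynamic_visualization_tool.py is more elaborate):

import pandas as pd
import plotly.express as px

def auto_chart(df: pd.DataFrame):
    """Pick a chart type from simple characteristics of the data."""
    numeric = list(df.select_dtypes("number").columns)
    categorical = list(df.select_dtypes("object").columns)
    if isinstance(df.index, pd.DatetimeIndex) or "date" in df.columns:
        return px.line(df, y=numeric[0], title="Trend over time")
    if len(numeric) >= 2:
        return px.scatter(df, x=numeric[0], y=numeric[1], title="Correlation")
    if categorical and numeric:
        return px.bar(df, x=categorical[0], y=numeric[0], title="Comparison")
    return px.histogram(df, x=numeric[0], title="Distribution")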

Extending the System

To add new tools:

  1. Create a new tool class that inherits from the Tool base class
  2. Implement the execute method with your tool's functionality
  3. Define input and output schemas
  4. Register the tool with the tool_registry

Example of a new tool implementation:

from src.tools.tool_registry import Tool, tool_registry

class NewFinancialTool(Tool):
    """Tool for specialized financial analysis."""
    
    name = "NewFinancialTool"
    description = "Performs specialized financial analysis"
    
    input_schema = {
        "type": "object",
        "properties": {
            "data": {
                "type": "string",
                "description": "Financial data to analyze"
            }
        },
        "required": ["data"]
    }
    
    output_schema = {
        "type": "object",
        "properties": {
            "result": {
                "type": "string",
                "description": "Analysis result"
            }
        }
    }
    
    def execute(self, data: str) -> dict:
        """Execute the tool with the provided data."""
        # Your implementation here
        result = self._analyze_data(data)
        return {"result": result}
    
    def _analyze_data(self, data: str) -> str:
        # Custom analysis logic
        return "Analysis results"

# Register the tool
tool_registry.register(NewFinancialTool)
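
Once registered, the agent can discover the tool through the registry; for a quick standalone check you can also invoke it directly:

tool = NewFinancialTool()
print(tool.execute(data="Q1 revenue: 1.2M, Q2 revenue: 1.5M"))
# -> {'result': 'Analysis results'}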

System Architecture

The system architecture is documented in detail in the system_architecture.md file, which contains Mermaid diagrams illustrating:

  1. System Architecture Diagram: High-level components and their relationships
  2. Block Diagram: Logical organization of the system's layers
  3. Workflow Diagram: Sequence of operations for typical user interactions
  4. Component Hierarchy: Organization and nesting of system components
  5. Data Flow Diagram: How data moves through the system
  6. Entity-Relationship Diagram: Data relationships
  7. State Diagram: System states during different operations
  8. Deployment Diagram: Physical deployment structure

To view these diagrams, open the file in a Markdown viewer that supports Mermaid, such as:

  • GitHub's web interface
  • VS Code with the Markdown Preview Mermaid Support extension
  • Various online Mermaid viewers

API Key Rotation

The application supports API key rotation for Google Gemini to help manage rate limits and ensure uninterrupted service:

  1. Multiple API Keys:

    • The system can use multiple Gemini API keys
    • Keys are automatically rotated when rate limits are reached
  2. Adding API Keys:

    • Open src/llm/gemini_integration.py
    • Add your additional API keys to the API_KEYS list
    • Example: API_KEYS = [GEMINI_API_KEY, "YOUR_SECOND_API_KEY_HERE", "YOUR_THIRD_API_KEY_HERE"]
  3. Benefits:

    • Increased resilience against rate limiting
    • Higher throughput for LLM requests
    • Seamless fallback when one key reaches its quota

Multi-Provider LLM Support

The chatbot supports multiple LLM providers with automatic fallback functionality:

  1. Available Providers:

    • Google Gemini: Primary provider for most queries
    • Groq: Used as a fallback when Gemini encounters errors or rate limits
  2. Setting Up Groq:

    • Create a Groq account at groq.com
    • Get your API key from the Groq console
    • Add it to your .env file: GROQ_API_KEY=your_groq_api_key_here
  3. Provider Management (see the fallback sketch after this list):

    • The system automatically switches between providers when errors occur
    • Error detection identifies rate limits and quota issues
    • Cooldown periods are managed for each provider
  4. Benefits:

    • Increased reliability through provider redundancy
    • Continued operation during provider outages
    • Ability to leverage the unique strengths of each model
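
The fallback behaviour described under Provider Management reduces to a loop over configured providers, roughly like this (a sketch; providers is assumed to be a list of objects exposing a generate(prompt) method, which is not necessarily the repository's llm_manager interface):

def generate_with_fallback(prompt: str, providers: list) -> str:
    """Try each provider in priority order, moving on when one fails."""
    last_error = None
    for provider in providers:
        try:
            return provider.generate(prompt)
        except Exception as exc:  # rate limits, quota errors, outages
            last_error = exc
    raise RuntimeError(f"All LLM providers failed: {last_error}")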

Troubleshooting

Database Connection Issues

  • Check your MongoDB connection string in the .env file
  • Ensure your IP address is whitelisted in MongoDB Atlas (if using Atlas)
  • The system will automatically fall back to local storage if the database is unavailable

LLM API Key Issues

  • Verify your API keys in the .env file
  • Check quota limits on your LLM provider accounts
  • The system will attempt to use alternative providers if available

File Processing Errors

  • Ensure uploaded files are in supported formats
  • Check file encoding (UTF-8 is recommended)
  • Large files may need to be split into smaller pieces

Web Search Issues

  • Verify your SerpAPI key
  • Check your internet connection
  • Some websites may block content extraction

Contributing

We welcome contributions to improve the Financial Intelligence Chatbot:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/your-feature-name
  3. Commit your changes: git commit -m 'Add some feature'
  4. Push to the branch: git push origin feature/your-feature-name
  5. Open a Pull Request

License

This project is licensed under the MIT License.
