
πŸš€ OpenDeepWiki: AI-Powered Multi-Repository Documentation & Chat

OpenDeepWiki is an advanced AI-powered tool that helps you understand and interact with multiple codebases simultaneously. It automatically analyzes repositories, generates comprehensive documentation, and provides an intelligent chat interface where you can ask questions about your code across multiple projects.

✨ Key Features

πŸ”„ Multi-Repository Support

  • Multiple Repository Management: Load and manage multiple repositories simultaneously
  • Unified Chat Interface: Ask questions across all your repositories in a single conversation
  • Optimized Pipeline: Context is retrieved per repository, while a single AI call generates the unified response
  • Repository Session Management: Thread-safe handling of multiple repository sessions
  • Smart Repository Toggling: Activate/deactivate repositories for targeted queries

🎨 Modern UI Experience

  • Glass-morphism Design: Beautiful modern interface with backdrop blur effects
  • Animated Interactions: Smooth hover effects, transitions, and loading animations
  • Smart Status System: Context-aware status messages with emoji indicators
  • Professional Repository Cards: Modern card design with gradient borders and hover effects
  • Intuitive Repository Manager: Easy-to-use interface for adding, removing, and managing repositories

🧠 Advanced AI Capabilities

  • πŸ” Intelligent Code Analysis: Automatically classifies and analyzes code files, documentation, and configuration files
  • πŸ’¬ Multi-Repository AI Chat: Ask questions about your codebase and get contextual answers from AI models that understand your specific code across multiple projects
  • πŸ“š Cross-Repository Documentation: Extracts and processes docstrings, README files, and documentation from all loaded repositories
  • πŸ€– Dynamic Model Selection: Type any model name from any provider - supports Gemini, Claude, and OpenAI models with automatic routing
  • ⚑ Optimized Context Caching: Gemini Context Caching with 30-minute TTL for cost-effective AI responses
  • 🎯 Universal Model Support: Use cutting-edge models like gpt-4.1, claude-4-sonnet, o3, gemini-2.5-pro and more

πŸ”§ Technical Excellence

  • 🌐 Modern Web UI: Clean, responsive React interface with conversation history, markdown rendering, and syntax highlighting
  • πŸ”— Flexible Input: Supports both GitHub repositories (via URL) and local repositories (via ZIP upload)
  • πŸ‹ Containerized: Fully containerized with Docker for easy deployment
  • πŸ“Š Advanced Conversation Management: Save, load, and manage multiple conversation threads with repository context

πŸ—οΈ Architecture

OpenDeepWiki uses an optimized microservice architecture designed for multi-repository processing:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Frontend      β”‚    β”‚   Controller    β”‚    β”‚   Indexer       β”‚
β”‚   (React)       │◄──►│   (Flask)       │◄──►│   (FastAPI)     β”‚
β”‚   Port: 7860    β”‚    β”‚   Port: 5050    β”‚    β”‚   Port: 8002    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                β”‚
                                β–Ό
                       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                       β”‚   Repo Chat     β”‚
                       β”‚   (FastAPI)     β”‚
                       β”‚   Port: 8001    β”‚
                       β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Optimized Multi-Repository Pipeline

The architecture implements an efficient multi-repository processing pipeline:

  1. Individual Repository Processing: Each repository runs through steps 1-8 (query rewriting, context caching, retrieval) independently
  2. Unified Response Generation: All retrieved contexts are combined for a single call to the Final Response Generator
  3. Cost Optimization: Reduces AI API calls while maintaining comprehensive multi-repository awareness
  4. Session Management: Thread-safe handling of multiple repository sessions with conflict resolution (a minimal sketch follows)
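
A minimal sketch of what the thread-safe session handling in step 4 could look like; the class and method names here are illustrative, not the project's actual API:

import threading

class RepositorySessionManager:
    """Illustrative thread-safe store for per-repository sessions."""

    def __init__(self):
        self._sessions = {}            # repo_id -> session state
        self._lock = threading.Lock()  # guards all access to _sessions

    def add(self, repo_id, session):
        with self._lock:
            # Naive conflict resolution: suffix duplicate identifiers
            while repo_id in self._sessions:
                repo_id += "_dup"
            self._sessions[repo_id] = session
            return repo_id

    def active_sessions(self):
        with self._lock:
            return [s for s in self._sessions.values() if s.get("active")]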

Services

  • Frontend (React + Vite): Modern web interface with TypeScript support and glass-morphism design
  • Controller (Flask): Enhanced API gateway with multi-repository session management
  • Indexer Service (FastAPI): Analyzes and classifies repository files, extracts documentation with conflict resolution
  • Repo Chat Service (FastAPI): Provides AI-powered responses using multi-repository context aggregation

πŸš€ Quick Start

Prerequisites

  • Docker and Docker Compose
  • Git (for cloning repositories)

1. Clone the Repository

git clone https://github.com/Flopsky/OpenDeepWiki.git
cd OpenDeepWiki

2. Configure Environment

Create a .env file with your API keys:

# Copy the example environment file
make env

# Edit .env with your API keys
GEMINI_API_KEY=your_gemini_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here  # Optional
OPENAI_API_KEY=your_openai_api_key_here        # Optional

# Optional: Langfuse tracing
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com

3. Build and Run

Using the Makefile (recommended):

# Setup everything (environment, build, and run)
make setup

# Or step by step:
make build  # Build Docker image
make run    # Run the container

Or using Docker directly:

# Build the Docker image
docker build -t opendeepwiki .

# Run the container
docker run -d --name opendeepwiki_app \
  -p 7860:7860 \
  -p 5050:5050 \
  -p 8001:8001 \
  -p 8002:8002 \
  --env-file .env \
  opendeepwiki

4. Access the Application

Open your browser and navigate to: http://localhost:7860

πŸ’‘ How to Use

Managing Multiple Repositories

  1. Adding Repositories:

    • GitHub Repository: Click "Add Repository", paste the GitHub URL (e.g., https://github.com/username/repo)
    • Local Repository: Click "Upload ZIP" and select your zipped repository
  2. Repository Management:

    • View All Repositories: See all loaded repositories in the modern repository manager
    • Toggle Active/Inactive: Use the toggle button to activate/deactivate repositories for queries
    • Remove Repositories: Hover over repository cards to reveal the delete button
    • Status Monitoring: Visual indicators show repository status (Ready, Loading, Error)

Dynamic AI Model Selection

  1. Choose Any Model: Type any model name directly in the model selector

    • OpenAI: gpt-4o, gpt-4-turbo, gpt-3.5-turbo, o1-preview, o1-mini, o3-mini
    • Anthropic: claude-3.5-sonnet-20241022, claude-3-haiku-20240307, claude-3-opus-20240229
    • Google: gemini-2.5-pro-preview-03-25, gemini-1.5-flash-8b-001, gemini-1.5-pro-002
  2. Smart Auto-Complete: Get suggestions for popular models while typing

  3. Automatic Routing: The system automatically detects which provider to use based on model name

  4. API Key Management: Configure API keys for each provider in the settings

Multi-Repository Chat Experience

  1. Repository Selection:

    • Activate the repositories you want to query by toggling them "on"
    • The active repository count is displayed in the header
    • Blue indicators show which repositories are active for queries
  2. Cross-Repository Queries:

    • Ask questions that span multiple repositories: "Compare the authentication systems in my projects"
    • Get unified responses that understand relationships between different codebases
    • Responses automatically indicate which repositories contributed to the answer
  3. Smart Context Management:

    • Each repository maintains its own optimized context cache
    • Queries intelligently combine context from all active repositories
    • Single AI call processes all repository contexts for cost efficiency

Example Multi-Repository Queries

  • "How do the authentication systems differ between my frontend and backend repositories?"
  • "What are the common patterns used across all my projects?"
  • "Show me how to integrate the API from repo A with the frontend from repo B"
  • "Compare the database schemas in my different microservices"
  • "What dependencies are shared across my repositories?"

Managing Conversations

  • New Chat: Click "New Chat" to start a fresh conversation
  • Switch Conversations: Click on any saved conversation in the sidebar
  • Delete Conversations: Use the trash icon next to conversations
  • Repository Context: Conversations remember which repositories were active
  • Persistent History: All conversations are automatically saved with repository context

πŸ€– Universal AI Model Support

OpenDeepWiki features Dynamic Model Selection that automatically routes requests to the appropriate AI provider based on the model name you type, so you can use any model from any supported provider without changing settings or configuration.

How It Works

  1. Type Any Model Name: Simply enter the model name in the model selector
  2. Automatic Detection: The system detects the provider based on naming patterns
  3. Smart Routing: Your request is automatically routed to the correct API
  4. Seamless Experience: All models work identically through the same interface

Supported Models

| Provider | Model Examples | Naming Pattern |
|-----------|----------------|----------------|
| OpenAI | gpt-4o, gpt-4-turbo, gpt-3.5-turbo, o1-preview, o1-mini, o3-mini | Contains gpt or starts with o |
| Anthropic | claude-3.5-sonnet-20241022, claude-3-haiku-20240307, claude-3-opus-20240229 | Contains claude |
| Google | gemini-2.5-pro-preview-03-25, gemini-1.5-flash-8b-001, gemini-1.5-pro-002 | Starts with gemini- |
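
A minimal sketch of the detection logic, based only on the naming patterns above and the documented fallback to Gemini for unknown names (the project's actual routing code may differ):

def detect_provider(model_name: str) -> str:
    name = model_name.lower()  # detection is case-insensitive
    if name.startswith("gemini-"):
        return "google"
    if "claude" in name:
        return "anthropic"
    if "gpt" in name or name.startswith("o"):
        return "openai"
    return "google"  # unknown names fall back to Gemini (see Configuration)

assert detect_provider("GPT-4o") == "openai"
assert detect_provider("o3-mini") == "openai"
assert detect_provider("claude-3-opus-20240229") == "anthropic"
assert detect_provider("gemini-1.5-pro-002") == "google"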

Key Benefits

  • 🎯 Zero Configuration: No need to change settings when switching models
  • πŸš€ Future-Proof: New models work automatically if they follow naming conventions
  • πŸ’‘ Intelligent: Case-insensitive detection with smart fallbacks
  • ⚑ Unified Interface: All models provide the same rich experience
  • πŸ”„ Easy Switching: Try different models instantly to compare results

🧠 Advanced Gemini Context Caching Technology

OpenDeepWiki leverages Gemini Context Caching with an optimized multi-repository architecture to provide efficient and cost-effective AI responses across multiple codebases.

Multi-Repository Context Caching

  1. Individual Repository Analysis: Each repository gets its own:

    • Comprehensive documentation extraction and analysis
    • Unique cached context with repository-specific display names
    • Conflict resolution for duplicate repository names
    • Independent cache lifecycle management
  2. Optimized Cache Strategy:

    • Unique Display Names: Repositories get unique identifiers using timestamps and content hashes
    • Cache Reuse: Identical repositories automatically reuse existing caches
    • Cleanup Management: Maintains the two most recent caches per repository
    • Conflict Resolution: Handles multiple repositories with similar names gracefully
  3. Unified Query Processing:

    • Individual Processing: Steps 1-8 (query rewriting, context retrieval) run separately for each active repository
    • Combined Context: All repository contexts are aggregated for final response generation
    • Single AI Call: Only one call to Final Response Generator, reducing costs while maintaining comprehensive awareness
    • Attribution: Responses indicate which repositories contributed to the answer

Technical Implementation

# Multi-repository pipeline optimization
def run_multi_repo_pipeline(query, repositories):
    contexts = []

    # Steps 1-8 (query rewriting, caching, context retrieval) run per repository
    for repo in repositories:
        context = run_pipeline_up_to_context_retrieval(query, repo)
        contexts.append(context)

    # Step 9: a single unified call to the Final Response Generator
    return generate_final_response(query, combined_contexts=contexts)

# Enhanced cache creation with unique naming (google-generativeai SDK)
import datetime

from google.generativeai import caching

cache = caching.CachedContent.create(
    model=CONTEXT_CACHING_RETRIVER,                  # project model constant
    display_name=f"{repo_name}_{timestamp}_{hash}",  # unique per repository snapshot
    contents=documentation_json,
    system_instruction=system_prompt,
    ttl=datetime.timedelta(minutes=30),              # 30-minute cache lifetime
)

Benefits for Multi-Repository Workflows

  • ⚑ Scalable Performance: Parallel processing of repositories with optimized caching
  • πŸ’° Cost Efficiency: Single AI call for final response while maintaining full multi-repo context
  • 🎯 Comprehensive Understanding: AI has complete awareness of all active repository structures
  • πŸ”„ Smart Reuse: Automatic cache detection and reuse across sessions
  • πŸ“Š Advanced Management: Sophisticated cache lifecycle with conflict resolution
  • πŸ”— Cross-Repository Intelligence: Understands relationships and patterns across multiple codebases

πŸ› οΈ Development

Requirements

  • Python 3.12+
  • Node.js 18+
  • Docker

Local Development Setup

  1. Backend Services:

    # Install Python dependencies
    pip install -r requirements.txt
    
    # Run indexer service
    python -m indexer.server
    
    # Run repo chat service
    python -m repo_chat.server
    
    # Run controller
    python frontend/src/controler.py
  2. Frontend:

    cd frontend
    npm install
    npm run dev

Testing Multi-Repository Features

# Test multi-repository API endpoints
curl -X POST http://localhost:5050/api/add_repo \
  -H "Content-Type: application/json" \
  -d '{"repo_url": "https://github.com/user/repo1"}'

curl -X GET http://localhost:5050/api/list_repos

# Test multi-repository chat
curl -X POST http://localhost:8001/multi_repo_score \
  -H "Content-Type: application/json" \
  -d '{"repositories": [...]}'

Makefile Commands

make help           # Show available commands
make env            # Create .env from template
make build          # Build Docker image
make run            # Run container
make stop           # Stop container
make restart        # Restart container
make logs           # View container logs
make clean          # Remove container and image
make prune-all      # Full cleanup including unused Docker objects

🧩 Technology Stack

Frontend

  • React 19 with TypeScript
  • Vite for build tooling
  • Modern CSS with glass-morphism effects
  • React Router for navigation
  • React Markdown for rendering
  • React Syntax Highlighter for code display
  • Advanced Animations with CSS keyframes

Backend

  • FastAPI for microservices (Indexer, Repo Chat)
  • Flask for API gateway (Controller) with session management
  • Pydantic for data validation
  • Python 3.12 runtime
  • Thread-safe multi-repository handling

AI & APIs

  • Google Gemini with context caching (gemini-*, automatic API routing)
  • Anthropic Claude (claude-*, automatic API routing)
  • OpenAI GPT & Reasoning Models (gpt-*, o*, automatic API routing)
  • Dynamic Model Selection with intelligent provider detection
  • Langfuse (optional tracing)
  • Optimized Pipeline for multi-repository processing

Infrastructure

  • Docker for containerization
  • Supervisord for process management
  • Nginx for static file serving (in container)

πŸ“ Project Structure

OpenDeepWiki/
β”œβ”€β”€ frontend/               # React frontend application
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ controler.py   # Flask API gateway with multi-repo support
β”‚   β”‚   β”œβ”€β”€ components/
β”‚   β”‚   β”‚   └── RepositoryManager.jsx  # Multi-repository management
β”‚   β”‚   β”œβ”€β”€ services/
β”‚   β”‚   β”‚   └── api.js     # Enhanced API with multi-repo endpoints
β”‚   β”‚   β”œβ”€β”€ styles/
β”‚   β”‚   β”‚   └── opendeepwiki-theme.css # Modern UI styles
β”‚   β”‚   └── ...            # React components and pages
β”‚   β”œβ”€β”€ package.json
β”‚   └── vite.config.js
β”œβ”€β”€ indexer/               # File classification service
β”‚   β”œβ”€β”€ server.py         # FastAPI server
β”‚   β”œβ”€β”€ service.py        # Classification logic
β”‚   └── schema.py         # Data models
β”œβ”€β”€ repo_chat/            # AI chat service
β”‚   β”œβ”€β”€ server.py         # FastAPI server with multi-repo endpoints
β”‚   β”œβ”€β”€ service.py        # Enhanced chat logic with multi-repo pipeline
β”‚   └── schema.py         # Data models
β”œβ”€β”€ src/                  # Core utilities and shared code
β”‚   β”œβ”€β”€ core/            # Core functionality with cache management
β”‚   β”œβ”€β”€ utils/           # Utility functions
β”‚   └── schemas/         # Shared data models
β”œβ”€β”€ MULTI_REPO_ARCHITECTURE.md  # Detailed architecture documentation
β”œβ”€β”€ Dockerfile            # Container definition
β”œβ”€β”€ supervisord.conf      # Process management
β”œβ”€β”€ Makefile             # Build and deployment commands
β”œβ”€β”€ requirements.txt      # Python dependencies
└── README.md            # This file

πŸ”§ Configuration

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| GEMINI_API_KEY | ✅ Yes | Google Gemini API key (required for gemini-* models) |
| ANTHROPIC_API_KEY | ❌ No | Anthropic Claude API key (required for claude-* models) |
| OPENAI_API_KEY | ❌ No | OpenAI API key (required for gpt-* and o* models) |
| LANGFUSE_PUBLIC_KEY | ❌ No | Langfuse public key for tracing |
| LANGFUSE_SECRET_KEY | ❌ No | Langfuse secret key for tracing |
| LANGFUSE_HOST | ❌ No | Langfuse host URL |

Supported Model Patterns

The system automatically routes to the correct provider based on model name:

  • Gemini Models: Any model starting with gemini- (e.g., gemini-2.5-pro-preview-03-25)
  • OpenAI Models: Any model containing gpt or starting with o (e.g., gpt-4o, o1-preview)
  • Claude Models: Any model containing claude (e.g., claude-3.5-sonnet-20241022)
  • Fallback: Unknown models default to Gemini for backward compatibility

Ports

| Service | Port | Description |
|---------|------|-------------|
| Frontend | 7860 | Main web interface |
| Controller | 5050 | API gateway with multi-repo support |
| Repo Chat | 8001 | AI chat service with multi-repo endpoints |
| Indexer | 8002 | File analysis service |

🀝 Contributing

We welcome contributions! Here's how you can help:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow Python PEP 8 style guidelines
  • Use TypeScript for frontend development
  • Maintain backward compatibility when possible
  • Add tests for new features
  • Update documentation as needed
  • Consider multi-repository implications for new features

πŸ“‹ Roadmap

  • βœ… Basic repository analysis and indexing
  • βœ… AI-powered chat interface
  • βœ… Multiple LLM support (Gemini, Claude, OpenAI)
  • βœ… Dynamic model selection with automatic provider routing
  • βœ… Universal model support (gpt-4o, claude-3.5-sonnet, o1-preview, etc.)
  • βœ… Conversation history management
  • βœ… Local repository upload via ZIP
  • βœ… Modern React UI with TypeScript
  • βœ… Docker containerization
  • βœ… Multi-repository support with optimized pipeline
  • βœ… Modern glass-morphism UI with animations
  • βœ… Enhanced Gemini Context Caching with conflict resolution
  • βœ… Thread-safe session management
  • 🔄 Add support for Anthropic extended context caching
  • 🔄 More advanced RAG techniques for better cross-repository context
  • πŸ”„ File browser for multi-repository exploration
  • πŸ”„ Code generation and modification capabilities across repositories
  • πŸ”„ Integration with IDEs and editors
  • 🔄 Team collaboration features with shared repository collections
  • πŸ”„ Repository dependency analysis and visualization
  • πŸ”„ Advanced repository comparison and diff features

πŸ› Troubleshooting

Common Issues

  1. Services not starting: Check that all required ports are available
  2. API errors: Verify your API keys are correctly set in .env
  3. Repository analysis fails: Ensure the repository URL is accessible
  4. Docker build fails: Make sure you have sufficient disk space
  5. Multi-repository conflicts: Check repository manager for status indicators
  6. Context caching errors: Verify Gemini API key and check cache management

Model-Specific Issues

  1. "API key required" errors:

    • For gpt-* or o* models: Configure OPENAI_API_KEY
    • For claude-* models: Configure ANTHROPIC_API_KEY
    • For gemini-* models: Configure GEMINI_API_KEY
  2. Model not recognized:

    • Check the model name spelling
    • Verify the model follows supported naming patterns
    • Unknown models automatically default to Gemini
  3. Model switching not working:

    • Clear your browser cache
    • Check the model selector shows your typed value
    • Verify the correct API key is configured for the model type

Multi-Repository Specific Issues

  1. Repository not appearing: Check the repository manager status and error messages
  2. Queries not working across repositories: Ensure repositories are toggled "active"
  3. Cache conflicts: Repository names are automatically made unique with timestamps
  4. Performance issues: Consider reducing the number of active repositories for large queries

Getting Help

  • Check the Issues page
  • Review the logs: make logs
  • Test services: python test_services.py
  • Review the detailed MULTI_REPO_ARCHITECTURE.md for technical details

πŸ“„ License

This project is licensed under the terms specified in the license file.

πŸ™ Acknowledgments

  • Built with ❀️ using modern web technologies and advanced AI capabilities
  • Powered by Google Gemini Context Caching for optimal performance
  • Inspired by the need for better multi-repository code documentation and understanding
  • Special thanks to the open-source community for the amazing tools and frameworks

Happy Multi-Repository Coding! πŸš€

Transform your development workflow with OpenDeepWiki's powerful multi-repository AI assistance. Whether you're working on microservices, multiple projects, or complex codebases, OpenDeepWiki helps you understand and navigate your code like never before.

For questions, issues, or contributions, please visit our GitHub repository.
