AI-powered code improvement made simple.
Evolv automatically improves your Python code using evolutionary algorithms. Just add two decorators and watch your code get better. No configuration files, no complex setup - it just works!
Transform any Python function into an evolving, self-improving program:
from evolve import evolve, main_entrypoint
from sklearn.ensemble import RandomForestClassifier

@evolve(goal="Maximize classification accuracy")
def classify(X, y):
    # Your initial implementation
    model = RandomForestClassifier(n_estimators=10)
    return model.fit(X, y)

@main_entrypoint
def main():
    # Load your data and evaluate
    accuracy = evaluate_model()
    return {"fitness": accuracy}

# Run evolution with: EVOLVE=1 python your_script.py
- Why Evolv?
- Installation
- Core Concepts
- Usage Guide
- Architecture
- API Reference
- Examples
- Development
- Troubleshooting
- Contributing
Imagine if your code could improve itself. With Evolv, it can!
We've taken cutting-edge research from evolutionary algorithms and made it as easy to use as @functools.cache. Whether you're optimizing ML models, algorithms, or any Python function, Evolv helps you find better solutions automatically. Under the hood, it improves your code by:
- Analyzing your code and its performance metrics
- Learning from successful variations in the population
- Suggesting intelligent improvements using LLMs
- Testing changes in sandboxed environments
- Selecting the best performers for the next generation
- Simple API: Just two decorators to get started
- Intelligent: Learns from high-performing variants
- Safe: All mutations run in isolated sandboxes
- Measurable: Track improvements across generations
- Flexible: Works with any Python code and metrics
Evolv builds on amazing work from the research community and complements existing tools:
- Research Foundation: Inspired by DeepMind's AlphaEvolve and evolutionary computation research
- Works Great With: DSPy for prompting, LangChain for chains, existing ML frameworks
- Focus: Making evolutionary algorithms accessible to every developer
- Philosophy: We believe great tools should be easy to use - complexity is the enemy of adoption
- Python 3.9 or higher
- uv (recommended) or pip
- OpenRouter API key (for LLM access)
- Modal account (for sandboxed execution)
# Clone the repository
git clone https://github.com/johan-gras/evolve
cd evolve
# Install with uv (recommended)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync --all-extras
# Or install with pip
pip install -e .
# Install Aider (AI code editor)
uv tool install aider-chat
# Install development tools (optional)
uv tool install pre-commit --with pre-commit-uv
pre-commit install
# Set up OpenRouter (required)
export OPEN_ROUTER_API_KEY="your-openrouter-api-key"
# Set up Modal (required for cloud execution)
pip install modal
modal token new
# For local execution only (no Modal needed)
export EVOLVE_EXECUTOR=local
# Run tests
make test
# Or manually
uv run pytest -v
The evolution process, at a glance:
graph LR
A[Initial Code] --> B[Evaluate Performance]
B --> C{Good Enough?}
C -->|No| D[Select Parents]
D --> E[Generate Improvements]
E --> F[Apply Modifications]
F --> G[Test Variants]
G --> B
C -->|Yes| H[Best Code]
- Population: Collection of code variants with their performance metrics
- Fitness: Measurable metric(s) that define success (accuracy, speed, etc.)
- Selection: Choosing parent programs for the next generation
- Mutation: LLM-guided code modifications
- Evaluation: Safe execution and metric collection
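To make these concepts concrete, here's a rough sketch of what one member of the population conceptually holds. The field names are illustrative only, not Evolv's actual internal schema:

from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class ProgramVariant:
    """Illustrative only: roughly what one member of the population holds."""
    code: str                                                  # Source code of this variant
    metrics: Dict[str, float] = field(default_factory=dict)   # e.g. {"fitness": 0.93}
    parent_id: Optional[str] = None                            # Genealogy: the variant it was mutated from
    generation: int = 0                                        # Which evolution iteration produced it

    @property
    def fitness(self) -> float:
        # The primary metric reported by @main_entrypoint
        return self.metrics.get("fitness", float("-inf"))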
Built-in selection strategies:
- Linear: Always evolve from the most recent successful variant
- Random: Select parents randomly from successful variants
- Tournament: Competition-based selection for diversity
- MAP-Elites: Maintain diversity across multiple dimensions
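To give a feel for how these work, here's a minimal, illustrative sketch of tournament selection (not Evolv's implementation; it assumes each variant carries a metrics dict like the sketch above):

import random

def tournament_select(population, k=3):
    """Pick k random variants and keep the fittest one as a parent (simplified)."""
    contenders = random.sample(population, min(k, len(population)))
    return max(contenders, key=lambda variant: variant.metrics["fitness"])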
Evolv shines when you want to:
- Optimize ML Models: Automatically improve accuracy, reduce training time
- Tune Algorithms: Find better parameters without grid search
- Improve Existing Code: Let AI suggest optimizations you might miss
- Experiment Quickly: Try many variations without manual coding
Perfect for hackathons, prototypes, and production code that needs that extra edge!
- Decorate your target function:
@evolve(goal="Improve model accuracy")
def train_classifier(n_estimators=10):
    model = RandomForestClassifier(n_estimators=n_estimators)
    # ... training code ...
    return model
- Create an evaluation function:
@main_entrypoint
def main():
    model = train_classifier()
    accuracy = evaluate_on_test_set(model)
    return {"fitness": accuracy}  # Must return a metrics dict
- Run evolution:
# Normal execution
python your_script.py
# Evolution mode (3 iterations by default)
EVOLVE=1 python your_script.py
# Custom iterations
EVOLVE=1 EVOLVE_ITERATIONS=10 python your_script.py
Create evolve.json in your project root:
{
  "model": "openrouter/anthropic/claude-3-opus",
  "temperature": 0.7,
  "default_iterations": 5,
  "primary_metric": "accuracy",
  "executor_type": "modal"
}
from evolve.config import EvolveConfig, set_config

config = EvolveConfig(
    api_key="your-key",
    model="gpt-4",
    temperature=0.8,
    default_iterations=10
)
set_config(config)
# Core settings
export OPEN_ROUTER_API_KEY="sk-..."
export OPENROUTER_MODEL="anthropic/claude-3-opus"
export EVOLVE_ITERATIONS="5"
export PRIMARY_METRIC="f1_score"
# Execution settings
export EVOLVE_EXECUTOR="local" # or "modal"
export LOCAL_EXECUTOR_TIMEOUT="60"
export EVOLVE_PARALLEL="4" # Parallel variants
# Strategy settings
export EVOLVE_STRATEGY="tournament" # or "linear", "random", "map_elites"
If your code needs access to data files:
import pandas as pd

@evolve(
    goal="Optimize data processing",
    mount_dir="./data",                 # Mount local directory
    extra_packages=["pandas", "numpy"]  # Additional dependencies
)
def process_data():
    # In the sandbox, ./data is mounted at /mnt/user_code
    df = pd.read_csv("dataset.csv")     # Reads from the mounted directory
    processed_df = df                    # ... transformations that evolve go here ...
    return processed_df
Optimize for multiple metrics simultaneously:
@main_entrypoint
def main():
    model = train_classifier()
    # Return multiple metrics
    return {
        "fitness": accuracy,     # Primary metric
        "f1_score": f1,
        "training_time": time,
        "model_size": size
    }
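The "fitness" key is the default optimization target. To evolve toward one of the other reported metrics instead, point primary_metric at it, via evolve.json, the PRIMARY_METRIC environment variable, or the programmatic config:

from evolve.config import EvolveConfig, set_config

# Optimize for f1_score instead of the default "fitness" key
set_config(EvolveConfig(api_key="your-key", primary_metric="f1_score"))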
At a high level, the system looks like this:

┌─────────────────────────────────────────────────┐
│                    User Code                    │
│                @evolve decorator                │
└────────────────────────┬────────────────────────┘
                         │
┌────────────────────────▼────────────────────────┐
│              EvolutionCoordinator               │
│  • Orchestrates the evolution loop              │
│  • Manages async variant generation             │
│  • Tracks progress and metrics                  │
└───┬───────────────┬───────────┬─────────────┬───┘
    │               │           │             │
┌───▼─────────┐ ┌───▼─────┐ ┌───▼───────┐ ┌───▼──────────┐
│ DSPyModule  │ │  Aider  │ │ Executor  │ │  Database    │
│ • LLM calls │ │ • Code  │ │ • Runs    │ │ • Stores     │
│ • Prompts   │ │   mods  │ │   code    │ │   variants   │
└─────────────┘ └─────────┘ └───────────┘ └──────────────┘
The main components:
- EvolutionCoordinator:
  - Manages the evolution lifecycle
  - Implements the decorators (@evolve, @main_entrypoint)
  - Coordinates async operations
  - Tracks evolution progress
- DSPyModule:
  - Interfaces with LLMs via DSPy
  - Generates improvement suggestions
  - Learns from inspiration programs
  - Uses Pydantic models for type safety
- Aider:
  - Applies code modifications
  - Integrates with the Aider tool
  - Handles AST parsing
  - Manages code extraction
- Executor:
  - Runs code in sandboxed environments
  - Supports Modal (cloud) and local execution
  - Captures metrics and errors
  - Manages timeouts and resources
- Database:
  - Stores program variants
  - Tracks genealogy and metrics
  - Implements sampling strategies
  - Maintains best performers
- Strategies:
  - Pluggable evolution algorithms
  - Base strategy interface
  - Built-in strategies (Linear, Random, Tournament, MAP-Elites)
  - Extensible for custom strategies
- Initialization: Capture the decorated function/class
- Evaluation: Run initial code, establish baseline
- Evolution Loop (sketched below):
  - Select parent(s) from the population
  - Sample high-scoring inspiration programs
  - Generate an improvement via the LLM
  - Apply changes with Aider
  - Execute the variant in a sandbox
  - Update the population database
- Completion: Return the best variant
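In pseudocode, the loop looks roughly like this. It's a simplified sketch; the helper names are placeholders, not the coordinator's actual internals:

def evolution_loop(initial_program, iterations):
    population = [evaluate(initial_program)]             # Establish the baseline
    for _ in range(iterations):
        parent = select_parent(population)               # Strategy-driven selection
        inspirations = top_performers(population)        # High-scoring examples for the LLM
        suggestion = propose_improvement(parent, inspirations)   # LLM call via DSPy
        variant = apply_with_aider(parent, suggestion)            # Code modification
        population.append(run_in_sandbox(variant))                # Modal or local execution
    return max(population, key=lambda p: p.metrics["fitness"])   # Best variant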
@evolve marks a function or class for evolution:
@evolve(
    goal: str,                          # Natural language optimization goal
    iterations: int = None,             # Override default iterations
    strategy: str = "linear",           # Evolution strategy
    mount_dir: str = None,              # Local directory to mount
    extra_packages: List[str] = None,   # Additional pip packages
    strategy_config: dict = None        # Strategy-specific settings
)
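For example (the goal text, paths, and values here are placeholders):

@evolve(
    goal="Reduce inference latency while keeping accuracy above 0.9",
    iterations=10,
    strategy="tournament",
    mount_dir="./data",
    extra_packages=["pandas"],
)
def build_pipeline():
    ...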
@main_entrypoint marks the evaluation function:
@main_entrypoint
def main() -> Dict[str, float]:
    # Must return a metrics dictionary
    return {"fitness": score, "other_metric": value}
class EvolveConfig:
    api_key: str                      # OpenRouter API key
    model: str = "gpt-4"              # LLM model
    temperature: float = 0.7          # LLM temperature
    default_iterations: int = 3       # Default evolution iterations
    primary_metric: str = "fitness"   # Metric to optimize
    executor_type: str = "modal"      # "modal" or "local"
"linear"
: Evolve from most recent success"random"
: Random parent selection"tournament"
: Tournament selection (size configurable)"map_elites"
: Quality-diversity optimization
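Strategy-specific options go through the strategy_config argument of @evolve. For instance, a tournament run might be tuned like this; note that "tournament_size" is an assumed option name used for illustration, so check the strategy's source for the exact key:

@evolve(
    goal="Maximize accuracy",
    strategy="tournament",
    strategy_config={"tournament_size": 3},  # assumed option name, for illustration
)
def train():
    ...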
To implement a custom strategy, subclass BaseStrategy:

from evolve.strategies import BaseStrategy

class MyStrategy(BaseStrategy):
    def select_parents(self, database, count=1):
        # Your selection logic
        return selected_programs
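As a slightly fuller sketch, an epsilon-greedy strategy could mostly exploit the best variant while occasionally exploring a random one. The database accessor and metrics fields used here are assumptions for illustration; adapt them to the real database API:

import random
from evolve.strategies import BaseStrategy

class EpsilonGreedyStrategy(BaseStrategy):
    """Mostly evolve from the best variant, occasionally from a random one."""

    def __init__(self, epsilon: float = 0.1):
        self.epsilon = epsilon

    def select_parents(self, database, count=1):
        programs = list(database.get_all_programs())  # assumed accessor
        parents = []
        for _ in range(count):
            if random.random() < self.epsilon:
                parents.append(random.choice(programs))  # explore
            else:
                parents.append(max(programs, key=lambda p: p.metrics["fitness"]))  # exploit
        return parents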
A complete example: evolving RandomForest hyperparameters on the Iris dataset.

from evolve import evolve, main_entrypoint
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

@evolve(goal="Find optimal RandomForest hyperparameters for Iris dataset")
def create_model(n_estimators=10, max_depth=3, min_samples_split=2):
    return RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_split=min_samples_split,
        random_state=42
    )

@main_entrypoint
def main():
    X, y = load_iris(return_X_y=True)
    model = create_model()
    score = cross_val_score(model, X, y, cv=5).mean()
    return {"fitness": score}

if __name__ == "__main__":
    main()
Classes can evolve too. This example uses the MAP-Elites strategy to balance accuracy and speed:

@evolve(
    goal="Balance accuracy and inference speed",
    strategy="map_elites",
    strategy_config={
        "features": [
            ("accuracy", (0.0, 1.0), 20),
            ("speed", (0.0, 10.0), 20)
        ],
        "initial_population": 50
    }
)
class TextClassifier:
    def __init__(self, vocab_size=1000, embedding_dim=50):
        self.vocab_size = vocab_size
        self.embedding_dim = embedding_dim
        self.model = self._build_model()

    def _build_model(self):
        # Model architecture that can evolve
        pass
Data pipelines can also evolve, with local data mounted into the sandbox:

@evolve(
    goal="Optimize data preprocessing for better model performance",
    mount_dir="./data",
    extra_packages=["pandas", "scikit-learn"]
)
def preprocess_data(df):
    # Feature engineering that evolves
    df['new_feature'] = df['col1'] * df['col2']
    df = df.dropna(subset=['target'])

    # Scaling approach that can change
    from sklearn.preprocessing import StandardScaler
    scaler = StandardScaler()
    numeric_cols = df.select_dtypes(include=['float64']).columns
    df[numeric_cols] = scaler.fit_transform(df[numeric_cols])

    return df
More examples in the examples/ directory:
- simple_example.py - Basic classification
- regression_example.py - Regression with custom metrics
- map_elites_example.py - Quality-diversity optimization
- inspiration_demo.py - Demonstrates learning from high-scorers
# Clone and install
git clone https://github.com/johan-gras/evolve
cd evolve
uv sync --all-extras
# Install dev tools
make install
# Run all checks before committing
make pr-ready
make help # Show all commands
make test # Run tests
make format # Format code
make lint # Check code style
make check # Run all checks
make e2e # Run end-to-end tests
make clean # Clean temporary files
evolve/
├── src/evolve/              # Core library code
│   ├── coordinator.py       # Main orchestrator
│   ├── strategies/          # Evolution algorithms
│   ├── prompting/           # LLM prompt models
│   └── ...
├── examples/                # Usage examples
├── docs/                    # Documentation
│   ├── adr/                 # Architecture decisions
│   ├── development-patterns.md
│   ├── error-log.md
│   └── codebase-map.md
├── scripts/                 # Utility scripts
└── tests/                   # Test suite
# Run all tests
make test
# Run with coverage
make test-cov
# Run specific test
uv run pytest src/tests/test_coordinator.py -v
# Run end-to-end tests (requires API keys)
make e2e
Please see CONTRIBUTING.md for guidelines.
Key points:
- Use make pr-ready before submitting PRs
- Add tests for new features
- Update documentation
- Create ADRs for significant decisions
# Set your OpenRouter API key
export OPEN_ROUTER_API_KEY="your-key-here"

# Authenticate with Modal
modal token new
# Or use local execution:
export EVOLVE_EXECUTOR=local
# Ensure you're in the project directory
uv sync --all-extras
# Or: pip install -e .
# Run before pushing
make format
# Or: uv run ruff format .
- Use async execution: Set EVOLVE_PARALLEL=4 for faster evolution
- Optimize sandbox startup: Use extra_packages sparingly
- Cache dependencies: Modal caches package installations
- Profile metrics collection: Ensure your metrics code is efficient
# Enable debug logging
export EVOLVE_DEBUG=1
# Verbose LLM interactions
export EVOLVE_LOG_LEVEL=DEBUG
- Architecture Decision Records: Why key design decisions were made
- Development Patterns: Best practices and anti-patterns
- Error Log: Learn from past mistakes
- Codebase Map: Navigate the codebase efficiently
- CLAUDE.md: AI assistant instructions and tips
We welcome contributions! See CONTRIBUTING.md for guidelines.
Areas where we'd love help:
- New evolution strategies
- Performance optimizations
- Additional examples
- Documentation improvements
- Bug fixes
This project is licensed under the MIT License - see LICENSE for details.
- Inspired by Google DeepMind's AlphaEvolve
- Built with DSPy, Aider, and Modal
- Thanks to all contributors and the open-source community
Ready to evolve your code?
EVOLVE=1 python your_script.py