Add Multi-Armed Bandit (MAB) algorithms and evaluation utilities #833

Open: zhenyuz0500 wants to merge 10 commits into master

Conversation

zhenyuz0500 (Collaborator) commented May 18, 2025

Proposed changes

This PR introduces a set of classical Multi-Armed Bandit (MAB) algorithms and supporting utilities to the CausalML codebase. The implementation includes:

  1. Core MAB algorithms in causalml/causalml/optimize/bandit.py:

    • Epsilon-Greedy
    • Upper Confidence Bound (UCB)
    • Thompson Sampling
    • LinUCB (Contextual Bandit)
    • BatchBandit wrapper for batch-mode processing
    • Each algorithm uses numpy arrays for efficient computation (see the illustrative sketch below)
  2. Simulation utilities in causalml/causalml/dataset/mab_simulation.py:

    • Functions for generating synthetic bandit data
    • Support for different reward distributions
    • Utilities for creating realistic bandit scenarios
    • Support for contextual features and arm-specific effects
  3. Evaluation metrics in causalml/causalml/metrics/mab_evaluation.py:

    • Cumulative regret calculation
    • Cumulative reward tracking
    • Arm selection frequency analysis
    • Visualization utilities for comparing algorithm performance
  4. Example notebooks:

    • mab_bandit_model_demo.ipynb: A comprehensive demonstration of all MAB algorithms, including:
      • Classic MAB models (EpsilonGreedy, UCB, ThompsonSampling)
      • Contextual MAB models (LinUCB)
      • Batch-mode Classic MAB models
      • Batch-mode Contextual MAB models
      • Performance comparison and visualization
    • mab_bandit_model_comparison.ipynb: A focused comparison of different MAB algorithms with:
      • Side-by-side performance metrics
      • Arm selection analysis
      • Cumulative reward plots
      • Exploration-exploitation trade-off analysis
  5. Unit tests in causalml/tests/test_bandit.py

This addition provides a foundation for experimentation and benchmarking of bandit algorithms within the CausalML ecosystem, supporting both research and practical applications.
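
To make the interface concrete, here is a minimal sketch of the epsilon-greedy pattern described in item 1 above, using numpy arrays for the per-arm statistics. The class and method names (`select_arm`, `update`) and the constructor signature are illustrative assumptions, not necessarily the exact API in this PR:

```python
import numpy as np


class EpsilonGreedy:
    """Illustrative epsilon-greedy bandit; not the PR's exact implementation."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.counts = np.zeros(n_arms, dtype=np.int64)    # pulls per arm
        self.values = np.zeros(n_arms, dtype=np.float64)  # running mean reward per arm
        self.rng = np.random.default_rng(seed)

    def select_arm(self):
        # Explore a random arm with probability epsilon; otherwise exploit
        # the arm with the highest estimated mean reward.
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.n_arms))
        return int(np.argmax(self.values))

    def update(self, arm, reward):
        # Incremental mean, v <- v + (r - v) / n, avoids storing reward histories.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

With this shape of interface, a run is just a loop of select_arm, observe reward, update, which is also what makes a common base class (see Further comments) natural.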

Types of changes

What types of changes does your code introduce to CausalML?
Put an x in the boxes that apply

  • New feature (non-breaking change which adds functionality)
  • Bugfix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Further comments

The implementation follows the existing CausalML codebase structure and style, with a focus on:

  1. Clean, well-documented code with type hints
  2. Comprehensive test coverage
  3. Efficient numpy-based computation
  4. Extensibility for future algorithm additions

The MAB algorithms are implemented as classes that inherit from a common base class, which makes it straightforward to add new algorithms in the future. The evaluation utilities provide both numerical metrics and visualization tools for comparing algorithm performance.
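
As a rough illustration of the cumulative-regret metric mentioned above (the actual function in causalml/metrics/mab_evaluation.py may differ in name and signature):

```python
import numpy as np


def cumulative_regret(chosen_arms, true_means):
    """Cumulative expected regret of a sequence of pulls (illustrative sketch)."""
    true_means = np.asarray(true_means, dtype=float)
    chosen_arms = np.asarray(chosen_arms, dtype=int)
    # Per-step regret: gap between the optimal arm's mean and the chosen arm's mean.
    per_step = true_means.max() - true_means[chosen_arms]
    return np.cumsum(per_step)
```

A flattening regret curve indicates the algorithm has locked onto the best arm; comparing these curves across algorithms is the kind of analysis the notebooks visualize.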

The implementation has been tested with various scenarios, including:

  • Different reward distributions
  • Varying numbers of arms (default: 4 arms)
  • Different exploration-exploitation trade-offs
  • Various time horizons (default: 10,000 iterations)
  • Batch processing with different batch sizes
  • Contextual features with different complexities

All algorithms have been benchmarked against each other to ensure they perform as expected in different scenarios. The example notebooks provide clear demonstrations of how to use each algorithm type and how to interpret their performance.
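
For the contextual case, the standard disjoint LinUCB update (Li et al., 2010) looks roughly like the following; the names and argument order here are illustrative, not necessarily those in the PR:

```python
import numpy as np


class LinUCB:
    """Illustrative disjoint LinUCB; one ridge-regression state per arm."""

    def __init__(self, n_arms, n_features, alpha=1.0):
        self.alpha = alpha  # width of the upper-confidence bonus
        self.A = np.stack([np.eye(n_features) for _ in range(n_arms)])  # X'X + I per arm
        self.b = np.zeros((n_arms, n_features))                         # X'r per arm

    def select_arm(self, context):
        scores = np.empty(len(self.A))
        for a in range(len(self.A)):
            A_inv = np.linalg.inv(self.A[a])
            theta = A_inv @ self.b[a]  # ridge estimate of the arm's reward model
            # Mean prediction plus an exploration bonus that shrinks as data accrues.
            scores[a] = theta @ context + self.alpha * np.sqrt(context @ A_inv @ context)
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context
```

Getting the update's argument order right matters here, which is presumably what the LinUCB fix in the list below addresses.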

Recent improvements include:

  • Fixed constructor issues with the n_arms parameter
  • Improved batch processing for contextual bandits
  • Updated the batch-mode implementation to use the BatchBandit wrapper correctly (a sketch of this wrapper pattern appears below)
  • Fixed argument order in LinUCB updates
  • Added comprehensive example notebooks with clear documentation
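
As referenced above, a batch wrapper typically buffers (arm, reward) observations and flushes them to the wrapped policy once a batch fills. A minimal sketch under that assumption (the PR's BatchBandit may differ):

```python
class BatchBandit:
    """Illustrative batch wrapper: defer updates until a full batch arrives."""

    def __init__(self, bandit, batch_size=100):
        self.bandit = bandit          # any policy exposing select_arm/update
        self.batch_size = batch_size
        self._buffer = []             # (arm, reward) pairs awaiting a flush

    def select_arm(self):
        return self.bandit.select_arm()

    def update(self, arm, reward):
        self._buffer.append((arm, reward))
        if len(self._buffer) >= self.batch_size:
            for a, r in self._buffer:  # flush: apply all buffered updates at once
                self.bandit.update(a, r)
            self._buffer.clear()
```

This mirrors how bandits are often deployed in practice: arms are served under a frozen policy, and model updates happen periodically rather than per event.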

jeongyoonlee (Collaborator) commented
Thanks for the contribution, @zhenyuz0500. Can you add an example notebook? It will help users understand when/how to use it.

zhenyuz0500 and others added 9 commits May 18, 2025 23:34
…hms to use numpy arrays instead of DataFrames - Add new notebook example for MAB algorithms - Remove old comparison script and test file - Update test cases to match new API
Add mab_bandit_model_demo.ipynb and update mab_bandit_model_compariso…
Remove mab_bandit_example.ipynb from the PR
jeongyoonlee added the enhancement (New feature or request) label May 30, 2025
jeongyoonlee (Collaborator) commented
@zhenyuz0500, can you fix the lint error by running the latest black on the changes?
