Add Multi-Armed Bandit (MAB) algorithms and evaluation utilities #833

Open: zhenyuz0500 wants to merge 10 commits into master

Conversation

zhenyuz0500 (Collaborator) commented May 18, 2025

Proposed changes

This PR introduces a set of classical Multi-Armed Bandit (MAB) algorithms and supporting utilities to the CausalML codebase. The implementation includes:

  1. Core MAB algorithms in causalml/causalml/optimize/bandit.py:

    • Epsilon-Greedy
    • Upper Confidence Bound (UCB)
    • Thompson Sampling
    • LinUCB (Contextual Bandit)
    • BatchBandit wrapper for batch-mode processing
    • Each algorithm uses numpy arrays for efficient computation (see the illustrative sketch below)
  2. Simulation utilities in causalml/causalml/dataset/mab_simulation.py:

    • Functions for generating synthetic bandit data
    • Support for different reward distributions
    • Utilities for creating realistic bandit scenarios
    • Support for contextual features and arm-specific effects
  3. Evaluation metrics in causalml/causalml/metrics/mab_evaluation.py:

    • Cumulative regret calculation
    • Cumulative reward tracking
    • Arm selection frequency analysis
    • Visualization utilities for comparing algorithm performance
  4. Example notebooks:

    • mab_bandit_model_demo.ipynb: A comprehensive demonstration of all MAB algorithms, including:
      • Classic MAB models (EpsilonGreedy, UCB, ThompsonSampling)
      • Contextual MAB models (LinUCB)
      • Batch-mode Classic MAB models
      • Batch-mode Contextual MAB models
      • Performance comparison and visualization
    • mab_bandit_model_comparison.ipynb: A focused comparison of different MAB algorithms with:
      • Side-by-side performance metrics
      • Arm selection analysis
      • Cumulative reward plots
      • Exploration-exploitation trade-off analysis
  5. Unit tests in causalml/tests/test_bandit.py

This addition provides a foundation for experimentation and benchmarking of bandit algorithms within the CausalML ecosystem, supporting both research and practical applications.
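
To make the interface concrete, here is a minimal sketch of the epsilon-greedy pattern described in item 1 above, using numpy arrays for the per-arm statistics. The class and method names (`select_arm`, `update`) and the constructor signature are illustrative assumptions, not necessarily the exact API in this PR:

```python
import numpy as np


class EpsilonGreedy:
    """Illustrative epsilon-greedy bandit; not the PR's exact implementation."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.counts = np.zeros(n_arms, dtype=np.int64)    # pulls per arm
        self.values = np.zeros(n_arms, dtype=np.float64)  # running mean reward per arm
        self.rng = np.random.default_rng(seed)

    def select_arm(self):
        # Explore a random arm with probability epsilon; otherwise exploit
        # the arm with the highest estimated mean reward.
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.n_arms))
        return int(np.argmax(self.values))

    def update(self, arm, reward):
        # Incremental mean, v <- v + (r - v) / n, avoids storing reward histories.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

With this shape of interface, a run is just a loop of select_arm, observe reward, update, which is also what makes a common base class (see Further comments) natural.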

Types of changes

What types of changes does your code introduce to CausalML?
Put an x in the boxes that apply

  • New feature (non-breaking change which adds functionality)
  • Bugfix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Further comments

The implementation follows the existing CausalML codebase structure and style, with a focus on:

  1. Clean, well-documented code with type hints
  2. Comprehensive test coverage
  3. Efficient numpy-based computation
  4. Extensibility for future algorithm additions

The MAB algorithms are implemented as classes that inherit from a common base class, which makes it straightforward to add new algorithms in the future. The evaluation utilities provide both numerical metrics and visualization tools for comparing algorithm performance.
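
As a rough illustration of the cumulative-regret metric mentioned above (the actual function in causalml/metrics/mab_evaluation.py may differ in name and signature):

```python
import numpy as np


def cumulative_regret(chosen_arms, true_means):
    """Cumulative expected regret of a sequence of pulls (illustrative sketch)."""
    true_means = np.asarray(true_means, dtype=float)
    chosen_arms = np.asarray(chosen_arms, dtype=int)
    # Per-step regret: gap between the optimal arm's mean and the chosen arm's mean.
    per_step = true_means.max() - true_means[chosen_arms]
    return np.cumsum(per_step)
```

A flattening regret curve indicates the algorithm has locked onto the best arm; comparing these curves across algorithms is the kind of analysis the notebooks visualize.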

The implementation has been tested with various scenarios, including:

  • Different reward distributions
  • Varying numbers of arms (default: 4 arms)
  • Different exploration-exploitation trade-offs
  • Various time horizons (default: 10,000 iterations)
  • Batch processing with different batch sizes
  • Contextual features with different complexities

All algorithms have been benchmarked against each other to ensure they perform as expected in different scenarios. The example notebooks provide clear demonstrations of how to use each algorithm type and how to interpret their performance.
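
For the contextual case, the standard disjoint LinUCB update (Li et al., 2010) looks roughly like the following; the names and argument order here are illustrative, not necessarily those in the PR:

```python
import numpy as np


class LinUCB:
    """Illustrative disjoint LinUCB; one ridge-regression state per arm."""

    def __init__(self, n_arms, n_features, alpha=1.0):
        self.alpha = alpha  # width of the upper-confidence bonus
        self.A = np.stack([np.eye(n_features) for _ in range(n_arms)])  # X'X + I per arm
        self.b = np.zeros((n_arms, n_features))                         # X'r per arm

    def select_arm(self, context):
        scores = np.empty(len(self.A))
        for a in range(len(self.A)):
            A_inv = np.linalg.inv(self.A[a])
            theta = A_inv @ self.b[a]  # ridge estimate of the arm's reward model
            # Mean prediction plus an exploration bonus that shrinks as data accrues.
            scores[a] = theta @ context + self.alpha * np.sqrt(context @ A_inv @ context)
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context
```

Getting the update's argument order right matters here, which is presumably what the LinUCB fix in the list below addresses.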

Recent improvements include:

  • Fixed constructor issues with the n_arms parameter
  • Improved batch processing for contextual bandits
  • Updated the batch-mode implementation to use the BatchBandit wrapper correctly (a sketch of this wrapper pattern appears below)
  • Fixed argument order in LinUCB updates
  • Added comprehensive example notebooks with clear documentation
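
As referenced above, a batch wrapper typically buffers (arm, reward) observations and flushes them to the wrapped policy once a batch fills. A minimal sketch under that assumption (the PR's BatchBandit may differ):

```python
class BatchBandit:
    """Illustrative batch wrapper: defer updates until a full batch arrives."""

    def __init__(self, bandit, batch_size=100):
        self.bandit = bandit          # any policy exposing select_arm/update
        self.batch_size = batch_size
        self._buffer = []             # (arm, reward) pairs awaiting a flush

    def select_arm(self):
        return self.bandit.select_arm()

    def update(self, arm, reward):
        self._buffer.append((arm, reward))
        if len(self._buffer) >= self.batch_size:
            for a, r in self._buffer:  # flush: apply all buffered updates at once
                self.bandit.update(a, r)
            self._buffer.clear()
```

This mirrors how bandits are often deployed in practice: arms are served under a frozen policy, and model updates happen periodically rather than per event.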

jeongyoonlee (Collaborator) commented
Thanks for the contribution, @zhenyuz0500. Can you add an example notebook? It will help users understand when/how to use it.

zhenyuz0500 and others added 9 commits May 18, 2025 23:34
…hms to use numpy arrays instead of DataFrames - Add new notebook example for MAB algorithms - Remove old comparison script and test file - Update test cases to match new API
Add mab_bandit_model_demo.ipynb and update mab_bandit_model_compariso…
Remove mab_bandit_example.ipynb from the PR
jeongyoonlee added the enhancement (New feature or request) label May 30, 2025
jeongyoonlee (Collaborator) commented
@zhenyuz0500, can you fix the lint error by running the latest black on the changes?
