This repository demonstrates how C3PO optimizes expert pathways to improve model performance.
Create a new conda environment named C3PO and install the required packages:
# Create conda environment
conda create -n C3PO python=3.10 -y
conda activate C3PO
# Install PyTorch (for CUDA 12.3)
conda install pytorch torchvision torchaudio pytorch-cuda=12.3 -c pytorch -c nvidia -y
# Install required packages
pip install torch numpy transformers fvcore tqdm
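Before continuing, it can help to confirm that the packages installed above are visible to the active environment. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and it assumes each package's import name matches its pip name as listed above.

```python
import importlib.util

# Import names for the packages installed above
# (assumption: import names match the pip package names).
REQUIRED = ["torch", "numpy", "transformers", "fvcore", "tqdm"]


def missing_packages(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only locates the modules without importing them, so it runs quickly and does not require a GPU.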
Download the reference cases from this anonymous link: Reference Cases
# Extract the downloaded reference.zip
unzip reference.zip -d reference_data
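If `unzip` is not available on your system, the same extraction can be done from Python's standard library. This is a sketch under assumptions: the function name `extract_reference` is hypothetical, and nothing is assumed about the contents of `reference.zip` beyond it being a valid zip archive.

```python
import zipfile
from pathlib import Path


def extract_reference(zip_path="reference.zip", out_dir="reference_data"):
    """Extract zip_path into out_dir and return the files now present under out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)
    # List every regular file under the output directory, sorted for stable output.
    return sorted(p for p in out.rglob("*") if p.is_file())


if __name__ == "__main__":
    files = extract_reference()
    print(f"Found {len(files)} files under reference_data")
```

The returned list gives a quick way to confirm that the archive was not empty before running later steps.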
Run the download.sh script to get the necessary datasets:
# Execute download script
bash download.sh
# Run the main script
python olmoe_optimizer.py