Artifact of the CENT Paper, ASPLOS 2025

This repository provides the artifact required for evaluating CENT, as described in the paper "PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference", published at ASPLOS 2025.

Dependencies

The AiM simulator is tested and verified with g++-11/12/13 and clang++-15. The Python infrastructure requires the pandas, matplotlib, torch, and scipy packages.
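
As a quick sanity check of the environment, a minimal Python sketch (not part of the artifact) can confirm these packages are importable:

import importlib

# Check that the Python packages required by the infrastructure are importable.
for pkg in ("pandas", "matplotlib", "torch", "scipy"):
    try:
        importlib.import_module(pkg)
        print(f"{pkg}: OK")
    except ImportError:
        print(f"{pkg}: missing (see requirements.txt)")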

Build

Clone the repository recursively:

git clone --recursive https://github.com/Yufeng98/CENT.git
cd CENT

Install the Python packages locally or create a conda environment:

conda create -n cent python=3.10 -y
conda activate cent
pip install -r requirements.txt

Build the AiM simulator:

# use g++-11, g++-12, or g++-13, e.g. export CXX=/usr/bin/g++-12
cd aim_simulator
mkdir build
cd build
cmake ..
make -j4
cd ../../

Artifact Scripts

Remove old results:

bash remove_old_results.sh

Run Simulation

Perform the end-to-end trace generation, performance simulation, and power modeling:

cd cent_simulation
# usage: bash simulation.sh <threads, set based on your platform> <sequence length gap>
bash simulation.sh 8 128
bash process_results.sh
cd ..

Note: Setting the sequence length gap to 1 runs the full simulation, generating tokens one by one from 1 to 4096. Setting the gap to 128 generates tokens only at indices 128, 256, ..., 4096. The final results are averaged over the simulated sequence lengths. We report results using a gap of 1, but the full simulation takes a long time and about 100 GB of disk space: with 8 threads on a desktop it needs roughly 8 GB of memory and about 24 hours, while with 96 threads on a server it needs roughly 64 GB of memory and about 12 hours. For a quick verification, use a gap of 128, which takes only a few hours and produces results with only minor differences.
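
For intuition, the following sketch (illustrative only; the actual sampling logic lives in simulation.sh) shows how many sequence lengths each gap setting covers:

# Illustrative only: enumerate the sequence lengths simulated for a given gap.
def sampled_lengths(gap, max_len=4096):
    return list(range(gap, max_len + 1, gap))

print(len(sampled_lengths(1)))    # 4096 points: full simulation
print(len(sampled_lengths(128)))  # 32 points: quick verification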

Generate all figures using the following script, or generate them one by one using the scripts in the sections below:

bash generate_figures.sh

After completing this step, CSV and PDF files for Figures 12–15 will be generated in the figures and figure_source_data directories. The figures in the published paper were created with the cent_simulation/Figures.xlsx Excel file. To reproduce the figures in the same format as the paper, copy the contents of the generated CSV files into the Data sheet of that Excel file; the figures are then generated in the Figures sheet.
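
This copy step can also be scripted. The sketch below is one possible convenience, not part of the artifact; it assumes the openpyxl package (an extra dependency, not in requirements.txt) and the directory layout described above:

import csv
import glob
from openpyxl import load_workbook

# Hypothetical convenience: append the generated CSV rows into the Data sheet
# of the Excel file mentioned above. Values are appended as raw strings.
wb = load_workbook("cent_simulation/Figures.xlsx")
ws = wb["Data"]
for path in sorted(glob.glob("figure_source_data/*.csv")):
    with open(path, newline="") as f:
        for row in csv.reader(f):
            ws.append(row)
wb.save("cent_simulation/Figures.xlsx")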

Figure 12

The CXL controller cost is broken down into die, packaging, and Non-Recurring Engineering (NRE) components. The die cost is derived from the wafer cost, taking into account the CXL controller die area and the yield rate. The 2D packaging cost is assumed to be 29% of the chip cost (die plus package). The NRE cost depends on chip production volume.

python figure_scripts/figure_12.py
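
As a back-of-the-envelope illustration of this breakdown (all numbers below are hypothetical placeholders, not the paper's inputs; the actual model is implemented in figure_scripts/figure_12.py):

import math

# All numbers are hypothetical placeholders for illustration only.
wafer_cost = 10_000.0   # $ per 300 mm wafer (placeholder)
die_area   = 100.0      # CXL controller die area, mm^2 (placeholder)
yield_rate = 0.8        # fraction of good dies (placeholder)
volume     = 100_000    # chip production volume (placeholder)
nre_total  = 5e6        # one-time NRE cost, $ (placeholder)

wafer_area = math.pi * (300 / 2) ** 2          # ignores edge loss for simplicity
die_cost   = wafer_cost / (wafer_area / die_area * yield_rate)
# Packaging is 29% of chip cost (die + package):
# pkg = 0.29 * (die + pkg)  =>  pkg = die * 0.29 / (1 - 0.29)
package_cost = die_cost * 0.29 / (1 - 0.29)
nre_cost     = nre_total / volume              # NRE amortized over volume

print(f"die ${die_cost:.2f}, package ${package_cost:.2f}, NRE ${nre_cost:.2f} per chip")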

Figure 13

CENT speedup over GPU baselines. (a) Latency comparison at batch size 1. (b) Throughput comparison at the highest batch size that CENT and the GPU can achieve. (c) TCO-normalized throughput comparison.

python figure_scripts/figure_13a.py
python figure_scripts/figure_13b.py
python figure_scripts/figure_13c.py

Figure 14

Analysis of Llama2-70B. (a) CENT achieves higher decoding throughput with long context windows and 3.5K decoding sizes. (b) QoS analysis: CENT shows lower query latency while achieving throughput similar to GPUs. (c) CENT latency breakdown with different parallelism strategies. (d) Prefill (In) and decoding (Out) latency comparison with different In/Out sizes, at the maximum supported batch sizes for both GPU and CENT.

python figure_scripts/figure_14a.py
python figure_scripts/figure_14b.py
python figure_scripts/figure_14c.py
python figure_scripts/figure_14d.py

Figure 15

Power and energy analysis. (a) Power consumption, (b) GPU SM frequency and board power, and (c) energy efficiency of CENT and GPU for different stages of Llama2 models, using the maximum batch size, 512 prefill tokens, and 3584 decoding tokens.

python figure_scripts/figure_15a.py
python figure_scripts/figure_15c.py

Citation

If you use CENT, please cite this paper:

Yufeng Gu, Alireza Khadem, Sumanth Umesh, Ning Liang, Xavier Servot, Onur Mutlu, Ravi Iyer, and Reetuparna Das. "PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference". In 2025 International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS).

@inproceedings{cent,
  title={PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference},
  author={Gu, Yufeng and Khadem, Alireza and Umesh, Sumanth and Liang, Ning and Servot, Xavier and Mutlu, Onur and Iyer, Ravi and Das, Reetuparna},
  booktitle={2025 International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)}, 
  year={2025}
}

Issues and bug reporting

We appreciate any feedback and suggestions from the community. Feel free to raise an issue or submit a pull request on GitHub. For assistance in using CENT, please contact Yufeng Gu (yufenggu@umich.edu) or Alireza Khadem (arkhadem@umich.edu).

Licensing

This repository is available under an MIT license.

Acknowledgement

This work was supported in part by NSF under the CAREER-1652294 and NSF-1908601 awards, and by a gift from Intel.
