A benchmarking framework for evaluating compression algorithms on scientific data arrays.
Latest benchmark results: https://magland.github.io/benchcompress/
Paper (WIP): https://magland.github.io/benchcompress/paper
Benchcompress follows an automated workflow:
- Defining Components
  - Algorithms are implemented in benchcompress/src/benchcompress/algorithms/
  - Datasets are defined in benchcompress/src/benchcompress/datasets/
  - Each component specifies metadata like version, tags, and compatibility requirements
- Automated Benchmarking
  - Benchmarks run automatically via GitHub Actions on pushes to the main branch
  - Each compatible algorithm-dataset pair is measured for:
    - Compression ratio
    - Encoding throughput (MB/s)
    - Decoding throughput (MB/s)
  - Results are verified by decompressing and comparing with the original data
- Result Storage
  - Results are committed to a dedicated benchmark-results branch
  - A local and remote caching system prevents redundant benchmark runs (only modified or added components are re-benchmarked)
  - Caching is based on algorithm and dataset versions
- Web Interface
  - Interactive visualization at https://magland.github.io/benchcompress/
  - Filter and sort results by dataset or algorithm
  - Visual charts for comparing performance metrics
  - Export results to CSV for further analysis
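
The measurement, verification, and caching steps of the workflow above can be illustrated with a minimal sketch. This is not the benchcompress implementation: the function names (`cache_key`, `run_benchmark`, `cached_benchmark`), the in-memory cache, and the synthetic dataset are all hypothetical, and zlib stands in for an arbitrary algorithm. It also assumes that both throughput figures are reported relative to the uncompressed size.

```python
import time
import zlib
from array import array

# Hypothetical in-memory cache, used here only for illustration.
_cache = {}


def cache_key(algorithm, algorithm_version, dataset, dataset_version):
    """Build a key that changes whenever either component's version changes."""
    return (algorithm, algorithm_version, dataset, dataset_version)


def run_benchmark(encode, decode, data):
    """Measure compression ratio and throughput for one algorithm-dataset pair."""
    start = time.perf_counter()
    encoded = encode(data)
    encode_seconds = time.perf_counter() - start

    start = time.perf_counter()
    decoded = decode(encoded)
    decode_seconds = time.perf_counter() - start

    # Verify the round trip before trusting any of the numbers.
    if decoded != data:
        raise ValueError("decoded data does not match the original")

    # Assumption: throughput is measured against the uncompressed size.
    megabytes = len(data) / 1e6
    return {
        "compression_ratio": len(data) / len(encoded),
        "encode_mb_per_s": megabytes / max(encode_seconds, 1e-9),
        "decode_mb_per_s": megabytes / max(decode_seconds, 1e-9),
    }


def cached_benchmark(key, encode, decode, data):
    """Run the benchmark only if no result exists yet for this key."""
    if key not in _cache:
        _cache[key] = run_benchmark(encode, decode, data)
    return _cache[key]


# Synthetic example: a slowly varying signal stored as 16-bit integers.
samples = array("h", [(i // 10) % 100 for i in range(100_000)])
raw = samples.tobytes()

key = cache_key("zlib", "1", "synthetic-ramp", "1")
result = cached_benchmark(key, zlib.compress, zlib.decompress, raw)
print(result)
```

Because the key includes both versions, bumping either the algorithm or the dataset version produces a new key and triggers a fresh run, while unchanged pairs reuse the stored result.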
The project consists of two main components:
- benchcompress/: Python package containing the core benchmarking framework, algorithms, and datasets
- web-ui/: React-based web interface for visualizing benchmark results
- Install Python dependencies:
    cd benchcompress
    pip install -e .
- Explore the command-line interface:
    benchcompress --help
    benchcompress list
    benchcompress run --help
- Install web UI dependencies:
    cd web-ui
    npm install
- Run the web UI locally:
    cd web-ui
    npm run dev
This project uses pre-commit hooks to automatically check code formatting before each commit. The formatting includes:
- Python code formatting using black
- TypeScript/JavaScript code formatting using npm scripts
- C++ code formatting using clang-format
To set up the pre-commit hooks after cloning the repository:
- Install pre-commit:
    pip install pre-commit
- Install the git hook scripts:
    pre-commit install
After this setup, code will be automatically checked for formatting when you make a commit.
To format all code in the repository, run ./devel/format_code.sh.