This repository provides a trained aesthetic evaluation toolkit based on SongEval, the first large-scale, open-source dataset for human-perceived song aesthetics. The toolkit automatically scores generated songs along five perceptual aesthetic dimensions aligned with professional musicians' judgments.
- 🧠 Pretrained neural models for perceptual aesthetic evaluation
- 🎼 Predicts five aesthetic dimensions:
- Overall Coherence
- Memorability
- Naturalness of Vocal Breathing and Phrasing
- Clarity of Song Structure
- Overall Musicality
- 🎧 Accepts full-length songs (vocals + accompaniment) as input
- ⚙️ Simple inference interface
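The toolkit reports one score per aesthetic dimension for each input song; the exact output format is defined by `eval.py`. The sketch below is only an illustrative way to hold the five scores in Python. The class name, field names, and averaging helper are assumptions for illustration, not the toolkit's actual output schema.

```python
from dataclasses import dataclass


@dataclass
class AestheticScores:
    """Hypothetical container for the five SongEval dimensions.

    The field names mirror the dimensions listed above; the class itself
    is illustrative and not part of the toolkit's API.
    """
    overall_coherence: float
    memorability: float
    naturalness_of_vocal_breathing_and_phrasing: float
    clarity_of_song_structure: float
    overall_musicality: float

    def mean(self) -> float:
        # Simple unweighted average over the five dimensions.
        return (
            self.overall_coherence
            + self.memorability
            + self.naturalness_of_vocal_breathing_and_phrasing
            + self.clarity_of_song_structure
            + self.overall_musicality
        ) / 5.0
```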
Clone the repository and install dependencies:
```bash
git clone https://github.com/ASLP-lab/SongEval.git
cd SongEval
pip install -r requirements.txt
```
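Before running inference, it can be useful to confirm that the deep learning backend installed from `requirements.txt` can see a GPU. The check below assumes the pretrained models run on PyTorch, which is an assumption about the dependency list rather than something stated here.

```python
# quick_check.py -- minimal environment sanity check.
# Assumes PyTorch is among the dependencies installed from requirements.txt.
import torch

print("PyTorch version:", torch.__version__)
if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))
else:
    print("No GPU detected; eval.py can still be run with --use_cpu (slower).")
```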
- Evaluate a single audio file:
```bash
python eval.py -i /path/to/audio.mp3 -o /path/to/output
```
- Evaluate a list of audio files (see the batch sketch after these examples):
```bash
python eval.py -i /path/to/audio_list.txt -o /path/to/output
```
- Evaluate all audio files in a directory:
```bash
python eval.py -i /path/to/audio_directory -o /path/to/output
```
- Force evaluation on CPU (⚠️ CPU evaluation may be significantly slower):
```bash
python eval.py -i /path/to/audio.wav -o /path/to/output --use_cpu
```
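For larger batches, the file-list mode can be scripted. The sketch below collects audio files, writes an `audio_list.txt`, and invokes `eval.py` through `subprocess`. The one-path-per-line list format, the accepted file extensions, and the directory names are assumptions made for illustration; adjust them to the actual interface.

```python
# batch_eval.py -- hedged sketch for scripted batch evaluation.
# Assumes audio_list.txt holds one audio file path per line, which is an
# assumption about eval.py's expected list format, not documented here.
import subprocess
from pathlib import Path

AUDIO_DIR = Path("/path/to/audio_directory")  # hypothetical input directory
OUTPUT_DIR = Path("/path/to/output")          # hypothetical output directory
LIST_FILE = Path("audio_list.txt")

# Collect audio files and write one path per line.
paths = sorted(
    str(p) for p in AUDIO_DIR.glob("*") if p.suffix.lower() in {".wav", ".mp3"}
)
LIST_FILE.write_text("\n".join(paths) + "\n")

OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

# Invoke the evaluation script exactly as in the CLI examples above.
subprocess.run(
    ["python", "eval.py", "-i", str(LIST_FILE), "-o", str(OUTPUT_DIR)],
    check=True,
)
```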
This project is mainly organized by the Audio, Speech and Language Processing Lab (ASLP@NPU).
We sincerely thank the Shanghai Conservatory of Music for their expert guidance on music theory, aesthetics, and annotation design. We also thank AISHELL for their help in organizing the song annotations.
This project is released under the CC BY-NC-SA 4.0 license.
You are free to use, modify, and build upon it for non-commercial purposes, provided you give attribution and share derivatives under the same license.
If you use this toolkit or the SongEval dataset, please cite the following:
```bibtex
@inproceedings{SongEval,
  title     = {SongEval: A Large-Scale Benchmark Dataset for Aesthetic Evaluation of Complete Songs},
  author    = {...},
  booktitle = {...},
  year      = {2025}
}
```