SpokeN-100 is a novel, entirely artificially generated benchmarking dataset tailored for speech recognition, a core challenge in the field of tiny deep learning. SpokeN-100 consists of the numbers from 0 to 99, spoken by 32 different speakers in four languages (English, Mandarin, German, and French), for a total of 12,800 audio samples (100 numbers × 32 speakers × 4 languages). The dataset can be downloaded from https://zenodo.org/records/10810044.
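As a quick sanity check on the sample count, the sketch below enumerates every (language, speaker, number) combination. It assumes each number is recorded exactly once per speaker and language, with 32 speakers per language. The lowercase language labels and integer speaker indices are illustrative placeholders, not the identifiers used in the dataset files.

```python
from itertools import product

# Illustrative labels only; the dataset's own naming may differ.
LANGUAGES = ["english", "mandarin", "german", "french"]
SPEAKERS_PER_LANGUAGE = 32  # assumption: 32 speakers for each language
NUMBERS = range(100)        # spoken numbers 0 to 99

# One entry per (language, speaker index, number) combination.
samples = list(product(LANGUAGES, range(SPEAKERS_PER_LANGUAGE), NUMBERS))

print(len(samples))  # 12800
```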
This repository contains all code for basic data analysis of SpokeN-100. Data splits for cross-validation can be found in 'cross_validation_splits.csv'.
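A minimal sketch of how the split file might be consumed is shown below. The layout of 'cross_validation_splits.csv' is not described here, so the column names `filename` and `fold` are assumptions, as are the helper names `load_folds` and `split_for_fold`. Check the header row of the actual file and adjust the arguments accordingly.

```python
import csv
from collections import defaultdict


def load_folds(path, file_column="filename", fold_column="fold"):
    """Group sample identifiers by fold.

    The default column names are assumptions, not the documented
    layout of cross_validation_splits.csv.
    """
    folds = defaultdict(list)
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            folds[row[fold_column]].append(row[file_column])
    return dict(folds)


def split_for_fold(folds, test_fold):
    """Return (train, test) lists, holding out one fold for testing."""
    test = list(folds[test_fold])
    train = [
        name
        for fold, names in folds.items()
        if fold != test_fold
        for name in names
    ]
    return train, test
```

With the real file, usage might look like `folds = load_folds("cross_validation_splits.csv")` followed by `train, test = split_for_fold(folds, "0")`. Fold labels are read as strings because the CSV reader does not convert types.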
More details can be found in the publication: https://arxiv.org/abs/2403.09753
Authors:
- René Groh (rene.groh@fau.de)
- Nina Goes
- Andreas M. Kist
If you use SpokeN-100 in your research, please cite our paper:
@inproceedings{2403.09753,
title = {SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages},
author = {Groh, Ren{\'e} and Goes, Nina and Kist, Andreas M},
booktitle = {Proceedings of the TinyML Research Symposium 2024},
year = {2024},
month = {April},
doi = {10.48550/ARXIV.2403.09753},
url = {https://arxiv.org/abs/2403.09753},
keywords = {datasets, neural networks, speech processing, tiny machine learning}
}