
ankilab/SpokeN-100

SpokeN-100

SpokeN-100 is a novel, entirely artificially generated benchmark dataset for speech recognition, a core challenge in the field of tiny deep learning. It consists of the numbers 0 to 99 spoken by 32 different speakers in four languages (English, Mandarin, German and French), resulting in 12,800 audio samples (100 numbers × 32 speakers × 4 languages). The dataset can be downloaded from https://zenodo.org/records/10810044.

Overview of data set generation and analysis.
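
As a quick sanity check on the dataset size, the sketch below enumerates every expected (language, speaker, number) combination and counts them. The speaker indices 0 to 31 and the tuple layout are hypothetical and chosen only for illustration; they are not the naming scheme used in the actual download.

```python
from itertools import product

# Languages and counts as stated in the dataset description.
LANGUAGES = ("English", "Mandarin", "German", "French")
NUM_SPEAKERS = 32      # speaker indices 0..31 are a hypothetical numbering
NUMBERS = range(100)   # spoken numbers 0..99


def enumerate_samples():
    """Yield one (language, speaker_index, number) tuple per expected sample."""
    yield from product(LANGUAGES, range(NUM_SPEAKERS), NUMBERS)


samples = list(enumerate_samples())
print(len(samples))  # 4 * 32 * 100 = 12800
```

Comparing this count against the number of audio files found after extraction is a simple way to confirm that a download is complete.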

Repository

This repository contains all code for basic data analysis of SpokeN-100. Data splits for cross-validation can be found in 'cross_validation_splits.csv'.
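
A minimal sketch of how the cross-validation splits might be consumed is shown below. The layout of 'cross_validation_splits.csv' is not described here, so the column names `fold` and `file` are assumptions; check the CSV header and pass the real names. The helper names `load_splits` and `train_test_for_fold` are likewise illustrative and not part of this repository.

```python
import csv
from collections import defaultdict


def load_splits(csv_path, fold_column="fold", file_column="file"):
    """Group file identifiers by fold.

    The default column names are assumptions, not the documented layout.
    """
    folds = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Fail early with a clear message if the assumed columns are absent.
        missing = {fold_column, file_column} - set(reader.fieldnames or [])
        if missing:
            raise KeyError(f"columns not found in {csv_path}: {sorted(missing)}")
        for row in reader:
            folds[row[fold_column]].append(row[file_column])
    return dict(folds)


def train_test_for_fold(folds, test_fold):
    """Use one fold as the test set and all remaining folds as training data."""
    test = list(folds[test_fold])
    train = [name for key, files in folds.items() if key != test_fold
             for name in files]
    return train, test
```

With the real column names filled in, `load_splits("cross_validation_splits.csv")` returns a dictionary keyed by fold label (as strings, since `csv` does not convert types), and `train_test_for_fold` can be called once per key to iterate over the cross-validation rounds.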

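More details can be found in the publication: "SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages" (https://arxiv.org/abs/2403.09753).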

Authors

René Groh, Nina Goes, Andreas M. Kist

Citation

If you use SpokeN-100 in your research, please cite our paper:

@inproceedings{2403.09753,
    title = {SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages},
    author = {Groh, Ren{\'e} and Goes, Nina and Kist, Andreas M},
    booktitle = {Proceedings of the TinyML Research Symposium 2024},
    year = {2024},
    month = {April},
    doi = {10.48550/ARXIV.2403.09753},
    url = {https://arxiv.org/abs/2403.09753},
    keywords = {datasets, neural networks, speech processing, tiny machine learning}
}