ASVtorch is a toolkit for automatic speaker recognition.
- Complete pipelines from audio files to speaker recognition scores
- Multi-GPU training of deep embedding extractors
- Fast GPU training of i-vector extractors
- GPU with at least 4 GB of memory (> 8 GB recommended)
- Preferably a computing server with many CPU cores and an ample amount of RAM
- A recent Kaldi installation
  - Needed for feature extraction and data augmentation
  - Also used for UBM training in i-vector systems
- ffmpeg
- Python environment (installation instructions below)
- Install ffmpeg if not yet installed
- Install Kaldi if not yet installed
  - http://kaldi-asr.org/doc/install.html
  - Note: the augmentation scripts in Kaldi have changed over time (for example, in 2019). If you encounter problems with data augmentation, try updating your Kaldi installation.
- Install a Python environment (the instructions below are for conda):

```
conda create -n asvtorch python=3.7
conda activate asvtorch
conda install -c pykaldi pykaldi-cpu
conda install pytorch=1.4 cudatoolkit -c pytorch
```

  If you do not have CUDA 10, try instead:

```
conda install pytorch=1.4 cudatoolkit=9.2 -c pytorch
```

  Finally, install the remaining dependencies:

```
conda install scipy matplotlib
pip install wget
```
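Which of the two PyTorch install commands above you should run depends on the CUDA version available on your machine. The selection logic can be sketched as a small helper (the function name and the version-string format are illustrative, not part of ASVtorch):

```python
# Hypothetical helper: pick the conda install command for PyTorch 1.4
# based on the CUDA version reported by your driver (e.g. "10.1" or "9.2").
def pick_cudatoolkit(cuda_version: str) -> str:
    """Return the conda command matching the detected CUDA major version."""
    major = int(cuda_version.split(".")[0])
    if major >= 10:
        # CUDA 10 or newer: use the default cudatoolkit build.
        return "conda install pytorch=1.4 cudatoolkit -c pytorch"
    # Older CUDA: fall back to the cudatoolkit 9.2 build.
    return "conda install pytorch=1.4 cudatoolkit=9.2 -c pytorch"
```

For example, `pick_cudatoolkit("9.2")` returns the fallback command with `cudatoolkit=9.2`.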
- Clone the ASVtorch repository:
  - Navigate to the folder where you want the `asvtorch` folder to be placed:

```
git clone https://gitlab.com/ville.vestman/asvtorch.git
cd asvtorch
```

- To install updates later on, run `git pull` in the `asvtorch` folder.
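After installation, a quick way to confirm the environment is complete is to check that each dependency can be imported. This is a minimal sketch, not part of the toolkit; it assumes pykaldi is importable as `kaldi`:

```python
import importlib.util

def available(module: str) -> bool:
    """Return True if the named module can be imported in this environment."""
    return importlib.util.find_spec(module) is not None

# Dependencies installed in the steps above (import names, not package names).
deps = ["torch", "kaldi", "scipy", "matplotlib", "wget"]
missing = [m for m in deps if not available(m)]
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All dependencies found.")
```

If anything is reported missing, re-run the corresponding conda or pip command from the steps above.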
- See the instructions in asvtorch/recipes/voxceleb/xvector/README.md
- For more information on how to execute and configure experiments, see asvtorch/src/settings/README.md
- To train neural networks by using multiple GPUs in parallel, see multigpu_readme.md
- To prepare custom datasets, see data_preparation_readme.md
- To create custom network architectures, see custom_architectures_readme.md
The ASVtorch toolkit is licensed under the MIT license; see LICENSE.txt. A small portion of the toolkit's code is adapted from the Kaldi toolkit. Such code is marked with comments and remains licensed under its original Apache 2.0 License.