Stars
This is the GitHub page for publicly available emotional speech data.
A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.
Faster Whisper transcription with CTranslate2
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark
PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
A PyTorch implementation of "Neural Speech Synthesis with Transformer Network"
Official PyTorch implementation of BigVGAN (ICLR 2023)
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
Course slides and assignments for Hung-yi Lee's Spring 2021/2022/2023 machine learning courses