Stars
Multilingual Voice Understanding Model
Flexible audio loudness meter in Python with implementation of ITU-R BS.1770-4 loudness algorithm
Take notes with your voice and send them to Notion
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
GUI for a Vocal Remover that uses Deep Neural Networks.
Speech To Speech: an effort for an open-sourced and modular GPT-4o
Arabic speech recognition, classification and text-to-speech.
Noise suppression using deep filtering
Look Who’s Talking: Active Speaker Detection in the Wild
Tools for handling speech data in machine learning projects.
Easy-to-use implementation of the ISMIR 2013 Audio Degradation Toolbox
Controllable and fast Text-to-Speech for over 7000 languages!
A high-quality, varied ~30hr voice dataset suitable for training a TTS model
String-to-String Algorithms for Natural Language Processing
Conformer-based Metric GAN for speech enhancement
A JAX-based library for building transformers, includes implementations of GPT, Gemma, LLaMA, Mixtral, Whisper, Swin, ViT and more.
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
YSDA course in Speech Processing.
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.