Stars
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
Data manipulation and transformation for audio signal processing, powered by PyTorch
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
Simple Python package for breaking Russian words into syllables
The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".
Speech recognition dataset based on a Russian audiobook, sentence-level split
Simple example of how to use TensorFlow's CTC loss with Voxforge speech data
Recognition of 38 speech commands in Russian. Based on Yandex Cup 2021 ML Challenge: ASR
This repository contains the scripts for the models of deep unsupervised learning of vocal entrainment
luiszeni / yolact_onnx
Forked from dbolya/yolact
A simple, fully convolutional model for real-time instance segmentation.