A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).

Python 708 155 Updated Apr 6, 2023

Ryuk17 / SpeechAlgorithms

You can find the speech algorithms you want here

C 806 249 Updated Jan 1, 2025

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 7,596 883 Updated May 21, 2025

ZitengWang / python_kaldi_features

Python code to extract MFCC and FBANK speech features for Kaldi

Python 65 18 Updated Nov 28, 2018

SIP-Lab / CNN-VAD

A Convolutional Neural Network based Voice Activity Detector for Smartphones

Jupyter Notebook 71 23 Updated Apr 30, 2019

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 11,936 1,913 Updated May 26, 2025

manojpamk / pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

Python 316 64 Updated Nov 11, 2020

xiangxyq / 3gpp_vad

Port of the VAD (voice activity detector) from 3GPP specification 26.073

C 14 8 Updated Feb 14, 2019

Embedding / Chinese-Word-Vectors

100+ Chinese Word Vectors: over a hundred sets of pre-trained Chinese word vectors

Python 12,019 2,329 Updated Oct 30, 2023

shiyuzh2007 / ASR

Python 55 27 Updated Jun 15, 2020

SeanNaren / deepspeech.pytorch

Speech Recognition using DeepSpeech2.

Python 2,118 624 Updated Dec 13, 2022

sid0710 / audio_data_augmentation

Python 26 12 Updated Sep 14, 2017

PaddlePaddle / ERNIE

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

Python 6,382 1,284 Updated Aug 31, 2024

awni / speech

A PyTorch Implementation of End-to-End Models for Speech-to-Text

Python 759 177 Updated Jul 6, 2023

imcaspar / gpt2-ml

GPT2 for Multiple Languages, including pretrained models. Multilingual GPT2 support, with a 1.5-billion-parameter pretrained Chinese model

Python 1,714 333 Updated May 22, 2023

yang123qwe / Chinese_phonemes

Python 6 Updated May 14, 2020

nobody132 / masr

Mandarin Automatic Speech Recognition

Python 1,943 483 Updated Jul 25, 2024

XiaoMi / kaldi-onnx

Kaldi model converter to ONNX

Python 244 59 Updated Jan 27, 2023

jcsilva / docker-kaldi-android

Dockerfile for compiling Kaldi for Android.

Shell 66 24 Updated Feb 4, 2019

andyweiqiu / asr-ios-local

Kaldi-based on-device speech recognition for iOS (local real-time streaming)

Objective-C 72 29 Updated Sep 13, 2021

jitsi / jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 728 102 Updated Feb 15, 2025

mravanelli / pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding a…

Python 2,387 445 Updated Mar 14, 2022

kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell 14,876 5,359 Updated Apr 28, 2025

GoogleCloudPlatform / python-docs-samples

Code samples used on cloud.google.com

Jupyter Notebook 7,708 6,549 Updated May 29, 2025

flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

C++ 6,427 1,012 Updated Nov 23, 2024

nangongmu

Stars

Danbinabo / insighrface

alumae / kaldi-gstreamer-server

wiseman / py-webrtcvad

nangongmu / deep-speaker

KrishnaDN / speech-music-noise-classification-using-pytorch

kaituoxu / Conv-TasNet