An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,921 229 Updated Jun 10, 2025

backspacetg / simul_whisper

Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection

Python 62 6 Updated Mar 30, 2025

HelloFish-2016 / NoiseDetection

Custom dashboard (gauge) view with noise detection

Java 23 9 Updated Sep 12, 2019

NVIDIA / NeMo-text-processing

NeMo text processing for ASR and TTS

Python 339 114 Updated Jun 10, 2025

NanmiCoder / MediaCrawler

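Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments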

Python 23,385 6,478 Updated Jun 8, 2025

pengzhendong / pysilero

Python Wrapper of Silero VAD

C++ 54 22 Updated May 8, 2025

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 6,038 574 Updated Jun 11, 2025

X-LANCE / SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Python 829 81 Updated Apr 24, 2025

suno-ai / bark

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 38,000 4,513 Updated Aug 19, 2024

pengzhendong / g2p-mix

Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.

Python 101 12 Updated Mar 20, 2025

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 47,511 5,230 Updated Jun 12, 2025

nltk / nltk_data

NLTK Data

Python 1,683 1,088 Updated Mar 10, 2025

MingLunHan / CIF-ColDec

[ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection

25 3 Updated May 18, 2023

tbright17 / kaldi-dnn-ali-gop

Forced alignment and Goodness of Pronunciation (GOP) with DNN support. Based on Kaldi.

C++ 231 87 Updated Apr 3, 2019

wenet-e2e / WeTextProcessing

Text Normalization & Inverse Text Normalization

Python 594 82 Updated Nov 11, 2024

FFmpeg / FFmpeg

Mirror of https://git.ffmpeg.org/ffmpeg.git

C 50,563 12,719 Updated Jun 12, 2025

ggml-org / whisper.cpp

Port of OpenAI's Whisper model in C/C++

C++ 40,721 4,337 Updated Jun 11, 2025

SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2

Python 16,516 1,362 Updated Jun 2, 2025

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 16,210 1,738 Updated Jun 8, 2025

Sharrnah / whispering

Whispering Tiger - OpenAI's whisper (and other models) with OSC and Websocket support. Allowing live transcription / translation in VRChat and Overlays in most Streaming Applications

Python 449 32 Updated Apr 20, 2025

HLTCHKUST / cantonese-asr

Python 85 12 Updated Feb 1, 2024

Adagio112

Stars

zuster / EconometricsResources

maitrix-org / Voila

SanDiegoMachineLearning / bookclub

philschmid / deep-learning-pytorch-huggingface

NVIDIA / cutlass

deepseek-ai / DeepGEMM

kvcache-ai / ktransformers

chriskohlhoff / asio

jobbole / awesome-cpp-cn

modelscope / ClearerVoice-Studio