An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 3,018 243 Updated Jul 2, 2025

Stability-AI / stable-codec

A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.

Python 375 23 Updated May 30, 2025

ga642381 / speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

1,058 63 Updated Jun 27, 2025

pytorch-labs / attention-gym

Helpful tools and examples for working with flex-attention

Python 854 53 Updated Jun 23, 2025

facebookresearch / lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,634 263 Updated Jun 18, 2025

jingzhunxue / flow_mirror

flow mirror models from JZX AI Labs

Python 44 2 Updated Sep 30, 2024

kyutai-labs / moshi

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 8,521 717 Updated Jul 2, 2025

dmort27 / epitran

A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)

Python 729 147 Updated Apr 18, 2025

alxndrTL / mamba.py

A simple and efficient Mamba implementation in pure PyTorch and MLX.

Python 1,270 110 Updated Dec 4, 2024

huggingface / parler-tts

Inference and training library for high-quality TTS models.

Python 5,330 569 Updated Dec 10, 2024

AudiogenAI / agc

Audiogen Codec

Python 140 12 Updated Jul 9, 2024

hubertsiuzdak / snac

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

Python 620 32 Updated Nov 19, 2024

huggingface / llm-swarm

Manage scalable open LLM inference endpoints in Slurm clusters

Python 262 26 Updated Jul 11, 2024

huggingface / lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 1,675 301 Updated Jul 2, 2025

LAION-AI / natural_voice_assistant

Python 489 42 Updated May 27, 2024

ZhangXInFD / SpeechTokenizer

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Python 576 53 Updated Jun 9, 2024

Plachtaa / VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,888 789 Updated Feb 11, 2024

Yoach Lacombe ylacombe

Stars

tadeephuy / GradientReversal

pytorch / torchtitan

lucadellalib / focalcodec

pipecat-ai / smart-turn

deepseek-ai / smallpond

deepseek-ai / 3FS

MekkCyber / CutlassAcademy

huggingface / smolagents

HomebrewML / HeavyBall

kyutai-labs / yomikomi

kyutai-labs / sphn

zhenye234 / X-Codec-2.0

thuhcsi / SpeechCraft

modelscope / ClearerVoice-Studio