Rust re-implementation of OpenFst, a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.

Rust 161 17 Updated Apr 9, 2025

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 52,892 6,477 Updated Jun 24, 2025

MoonshotAI / Kimi-Audio

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 3,862 258 Updated Jun 21, 2025

sw005320 / espnet-1

Forked from espnet/espnet

End-to-End Speech Processing Toolkit

Python 3 Updated Jul 23, 2023

Llamacha / IWSLT2025_Quechua_data

JavaScript 3 Updated Apr 23, 2025

tencent-ailab / pika

A lightweight speech processing toolkit based on PyTorch and (Py)Kaldi

Python 341 56 Updated Dec 25, 2020

tencent-ailab / SongGeneration

Python 352 20 Updated Jun 24, 2025

Llamacha / QuBERT

Jupyter Notebook 6 3 Updated Mar 29, 2022

ta012 / PAL-AudioLLM

6 Updated Jun 13, 2025

Jackiexiao / MTTS

A Demo of Mandarin/Chinese TTS frontend

Python 279 122 Updated Apr 18, 2022

TsinghuaDatabaseGroup / AIDB

ai4db and db4ai work

786 90 Updated Dec 26, 2024

qiuqiangkong / music_llm

Python 51 4 Updated Jan 16, 2025

thu-ml / GFT

Python 35 Updated Jun 13, 2025

haidog-yaqub / MeanFlow

Unofficial PyTorch implementation of the paper "Mean Flows for One-step Generative Modeling" by Geng et al.

Python 468 29 Updated Jun 14, 2025

ByteDance-Seed / SeedVR

Repo for SeedVR2 & SeedVR (CVPR2025 Highlight)

Python 234 14 Updated Jun 22, 2025

01Zhangbw / Speech-and-audio-papers-Top-Conference

77 2 Updated May 25, 2025

apple / container

A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It's written in Swift, and optimized for Apple silicon.

Swift 15,572 301 Updated Jun 24, 2025

lucidrains / e2-tts-pytorch

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch

Python 481 47 Updated Mar 12, 2025

FunAudioLLM / CV3-Eval

Python 47 1 Updated Jun 13, 2025

zzw922cn / awesome-speech-recognition-speech-synthesis-papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

3,050 514 Updated Oct 19, 2023

shawnricecake / draft-attention

Code for Draft Attention

Python 74 1 Updated May 22, 2025

SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2

Python 16,720 1,383 Updated Jun 2, 2025

ggml-org / whisper.cpp

Port of OpenAI's Whisper model in C/C++

C++ 40,995 4,379 Updated Jun 24, 2025

snap-research / GenAU

Jupyter Notebook 38 Updated Apr 13, 2025

fishaudio / Bert-VITS2

vits2 backbone with multilingual-bert

Python 8,479 1,207 Updated Jun 24, 2025

fishaudio / fish-speech

SOTA Open Source TTS

Python 22,030 1,801 Updated Jun 12, 2025

wenet-e2e / WeTextProcessing

Text Normalization & Inverse Text Normalization

Python 602 83 Updated Nov 11, 2024

Shylock shylockasr

Stars

wenet-e2e / wetts

k2-fsa / ZipVoice

k2-fsa / sherpa

garvys-org / rustfst