xiaoyangnihao (Mr.young) / Starred · GitHub

xiaoyangnihao

💭

On the way~

Mainly focused on end-to-end TTS and NLP! V:WorldSeal

8 followers · 43 following

Starred repositories

rishikksh20 / MiniMax-TTS-pytorch

An attempt to replicate the architecture of MiniMaxTTS described in its technical report

31 Updated May 14, 2025

thomasgauthier / csm-hf

Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers

Python 56 11 Updated May 17, 2025

nari-labs / dia

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 16,638 1,328 Updated May 28, 2025

yangdongchao / ALMTokenizer

The demo page for ALMTokenizer

Python 48 3 Updated Apr 14, 2025

bytedance / MegaTTS3

Python 5,448 385 Updated May 11, 2025

zhenye234 / LLaSA_inference

40 Updated Feb 8, 2025

SparkAudio / Spark-TTS

Spark-TTS Inference Code

Python 9,640 1,010 Updated Apr 9, 2025

THUDM / GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,936 250 Updated Dec 5, 2024

mst272 / LLM-Dojo

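Welcome to LLM-Dojo, an open-source place for learning about large models. It uses concise, easy-to-read code to build a model training framework (supporting mainstream models such as Qwen, Llama, and GLM), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩‍🎓👨‍🎓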

Python 757 65 Updated May 19, 2025

Plachtaa / seed-vc

zero-shot voice conversion & singing voice conversion, with real-time support

Python 2,555 292 Updated Apr 20, 2025

yangdongchao / SimpleSpeech

The open source code for SimpleSpeech series

Python 138 8 Updated Oct 8, 2024

gpt-omni / mini-omni

An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,330 283 Updated Nov 5, 2024

OpenT2S / LlamaVoice

LlamaVoice is a Llama-based large voice generation model that supports both inference and training.

Python 233 13 Updated Aug 26, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 36,447 3,942 Updated May 23, 2025

balisujohn / tortoise.cpp

A ggml (C++) re-implementation of tortoise-tts

C++ 183 17 Updated Aug 20, 2024

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,117 719 Updated May 27, 2025

EmulationAI / awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

676 42 Updated Aug 3, 2024

archinetai / audio-ai-timeline

A timeline of the latest AI models for audio generation, starting in 2023!

1,901 70 Updated Jan 4, 2024

adelacvg / ttts

Train the next generation of TTS systems.

Python 165 17 Updated Sep 13, 2024

rhasspy / piper

A fast, local neural text to speech system

C++ 9,180 718 Updated May 18, 2025

neonbjb / DL-Art-School

DLAS - A configuration-driven trainer for generative models

Python 140 168 Updated Oct 11, 2022

XinyuZhou2000 / Spoken-Dialogue

Jupyter Notebook 18 1 Updated Dec 7, 2023

zilliztech / GPTCache

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

Python 7,571 534 Updated Sep 18, 2024

mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 6,899 383 Updated Jul 11, 2024

yangdongchao / UniAudio

The Open Source Code of UniAudio

Python 563 35 Updated Jul 22, 2024

huawei-noah / Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Jupyter Notebook 585 127 Updated Sep 18, 2023

haoheliu / versatile_audio_super_resolution

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Python 1,453 153 Updated May 10, 2025

yl4579 / HiFTNet

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

Python 176 16 Updated Jan 14, 2025

shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 1,016 134 Updated May 26, 2025

sony / bigvsan

PyTorch implementation of BigVSAN

Python 204 18 Updated Mar 23, 2024

Starred topics

speech-synthesis
