xiaoyangnihao (Mr.young) / Starred · GitHub
💭
On the way~

Starred repositories

An attempt to replicate the architecture of MiniMaxTTS described in its technical report

31 Updated May 14, 2025

Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers

Python 56 11 Updated May 17, 2025

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 16,638 1,328 Updated May 28, 2025

The demo page for ALMTokenizer

Python 48 3 Updated Apr 14, 2025
Python 5,448 385 Updated May 11, 2025

Spark-TTS Inference Code

Python 9,640 1,010 Updated Apr 9, 2025

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,936 250 Updated Dec 5, 2024

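Welcome to LLM-Dojo, an open-source place for learning about large language models. It uses concise, easy-to-read code to build a model training framework (supporting mainstream models such as Qwen, Llama, and GLM), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩‍🎓👨‍🎓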

Python 757 65 Updated May 19, 2025

zero-shot voice conversion & singing voice conversion, with real-time support

Python 2,555 292 Updated Apr 20, 2025

The open source code for SimpleSpeech series

Python 138 8 Updated Oct 8, 2024

open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.

Python 3,330 283 Updated Nov 5, 2024

LlamaVoice is a llama-based large voice generation model, providing inference and training ability.

Python 233 13 Updated Aug 26, 2024

A generative speech model for daily dialogue.

Python 36,447 3,942 Updated May 23, 2025

A ggml (C++) re-implementation of tortoise-tts

C++ 183 17 Updated Aug 20, 2024

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,117 719 Updated May 27, 2025

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

676 42 Updated Aug 3, 2024

A timeline of the latest AI models for audio generation, starting in 2023!

1,901 70 Updated Jan 4, 2024

Train the next generation of TTS systems.

Python 165 17 Updated Sep 13, 2024

A fast, local neural text to speech system

C++ 9,180 718 Updated May 18, 2025

DLAS - A configuration-driven trainer for generative models

Python 140 168 Updated Oct 11, 2022
Jupyter Notebook 18 1 Updated Dec 7, 2023

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

Python 7,571 534 Updated Sep 18, 2024

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 6,899 383 Updated Jul 11, 2024

The Open Source Code of UniAudio

Python 563 35 Updated Jul 22, 2024

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Jupyter Notebook 585 127 Updated Sep 18, 2023

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Python 1,453 153 Updated May 10, 2025

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

Python 176 16 Updated Jan 14, 2025

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 1,016 134 Updated May 26, 2025

Pytorch implementation of BigVSAN

Python 204 18 Updated Mar 23, 2024