Stars
The C++ Core Guidelines are a set of tried-and-true guidelines, rules, and best practices about coding in C++.
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
Self-contained, minimalistic implementation of diffusion models with PyTorch.
This repository contains demos I made with the Transformers library by HuggingFace.
A video translation and dubbing tool powered by LLMs, offering professional-grade translations and one-click full-process deployment. It can generate content optimized for platforms like YouTube,T…
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
A Python package that makes it easy for developers to create AI apps powered by various AI providers.
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Easily fine-tune, evaluate and deploy Qwen3, DeepSeek-R1, Llama 4 or any open source LLM / VLM!
152334H / tortoise-tts-fast
Forked from neonbjb/tortoise-tts
Fast TorToiSe inference (5x or your money back!)