gerzz.inc
- Shanghai
- dubbing-ai.com dubbingai.io

TTS.cpp Public
Forked from mmwillet/TTS.cpp
TTS support with GGML
C++ · MIT License · Updated Jun 19, 2025

finally_based_speech_enhancement Public
Forked from markunya/finally_based_speech_enhancement
Jupyter Notebook · Updated Jun 18, 2025

linearvc Public
Forked from kamperh/linearvc
Voice conversion with just linear regression.
Jupyter Notebook · MIT License · Updated Jun 18, 2025

Stream-Omni Public
Forked from ictnlp/Stream-Omni
Stream-Omni is an end-to-end language-vision-speech chatbot that simultaneously supports interaction across various modality combinations.
Python · GNU General Public License v3.0 · Updated Jun 18, 2025

tts_impl Public
Forked from uthree/tts_impl
implementation of text to speech models
Python · MIT License · Updated Jun 17, 2025

FBK-fairseq1 Public
Forked from hlt-mt/FBK-fairseq
Repository containing the open source code of works published at the FBK MT unit.
Python · Other · Updated Jun 16, 2025

dasheng-glap Public
Forked from xiaomi-research/dasheng-glap
Official Implementation of GLAP - General Language Audio Pretraining
Python · Apache License 2.0 · Updated Jun 16, 2025

TS-ASR-Whisper Public
Forked from BUTSpeechFIT/TS-ASR-Whisper
Python · Apache License 2.0 · Updated Jun 16, 2025

X-Codec-2.0 Public
Forked from zhenye234/X-Codec-2.0
Codec for paper: LLaSA: Scaling Train Time and Test Time Compute for LLaMA based Speech Synthesis.
Python · MIT License · Updated Jun 16, 2025

A-DMA Public
Forked from ZhikangNiu/A-DMA
[INTERSPEECH 2025] Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"
Python · MIT License · Updated Jun 16, 2025

noise-robust-asr Public
Forked from debanjan06/noise-robust-asr
🔊 Advanced Noise-Robust ASR System with Dynamic Adaptation. Cutting-edge speech recognition system achieving 47% WER improvement in noisy conditions through novel noise-aware attention mechanisms an…
Python · Updated Jun 14, 2025

mkl-vc Public
Forked from alobashev/mkl-vc
[Interspeech 2025] Official implementation of "Training-Free Voice Conversion with Factorized Optimal Transport"
Jupyter Notebook · Updated Jun 13, 2025

chatterbox Public
Forked from resemble-ai/chatterbox
SoTA open-source TTS
Python · MIT License · Updated Jun 13, 2025

MonkeyOCR Public
Forked from Yuliang-Liu/MonkeyOCR
A lightweight LMM-based Document Parsing Model
Python · Apache License 2.0 · Updated Jun 13, 2025

CosyVoice Public
Forked from FunAudioLLM/CosyVoice
LLM based TTS model, providing inference/training/deployment full-stack ability.
Python · Apache License 2.0 · Updated Jun 13, 2025

LatentSync Public
Forked from bytedance/LatentSync
Taming Stable Diffusion for Lip Sync!
Python · Apache License 2.0 · Updated Jun 13, 2025

EzAudio Public
Forked from haidog-yaqub/EzAudio
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
Python · MIT License · Updated Jun 12, 2025

Bert-VITS2 Public
Forked from fishaudio/Bert-VITS2
vits2 backbone with bert
Python · GNU Affero General Public License v3.0 · Updated Jun 11, 2025

ClearerVoice-Studio Public
Forked from modelscope/ClearerVoice-Studio
ClearVoice

ComfyUI_MegaTTS3 Public
Forked from billwuhao/ComfyUI_MegaTTS3
Lightweight and Efficient, 🎧 Ultra High-Quality Voice Cloning, Chinese and English.
Python · Apache License 2.0 · Updated Jun 11, 2025

MSenC Public
Forked from kimtaesu24/MSenC
[INTERSPEECH'25] Official repository for "Towards Human-like Multimodal Conversational Agent by Generating Engaging Speech"
Python · Updated Jun 10, 2025

CMSP-ST Public
Forked from Akito-Go/CMSP-ST
CMSP-ST: Cross-modal Mixup with Speech Purification for End-to-End Speech Translation
Python · Apache License 2.0 · Updated Jun 10, 2025

mbrs Public
Forked from naist-nlp/mbrs
A library for minimum Bayes risk (MBR) decoding
Python · MIT License · Updated Jun 10, 2025

ASR-TTS-paper-daily Public
Forked from halsay/ASR-TTS-paper-daily
Update ASR paper every day
Python · Apache License 2.0 · Updated Jun 10, 2025