-
ctc_forced_aligner Public
Forked from deskpai/ctc_forced_aligner
We are open-sourcing the CTC forced aligner used in Deskpai. With focus on production-ready model inference, it supports 18 different alignment models, including multilingual models (German, English…
Python Updated Feb 9, 2025 -
pyannote-audio Public
Forked from pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Jupyter Notebook MIT License Updated Nov 21, 2024 -
whisper-diarization Public
Forked from MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Jupyter Notebook BSD 2-Clause "Simplified" License Updated Nov 14, 2024 -
wespeaker Public
Forked from wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Python Apache License 2.0 Updated Nov 14, 2024 -
NCSSD Public
Forked from walker-hyf/NCSSD
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
Python Updated Nov 1, 2024 -
GLM-4-Voice Public
Forked from THUDM/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
Python Apache License 2.0 Updated Oct 25, 2024 -
awesome-diarization Public
Forked from wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Apache License 2.0 Updated Oct 16, 2024 -
tiktoken Public
Forked from openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Python MIT License Updated Oct 3, 2024 -
audio Public
Forked from huangruizhe/audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
Python BSD 2-Clause "Simplified" License Updated Sep 30, 2024 -
wekws Public
Forked from wenet-e2e/wekws
Production First and Production Ready End-to-End Keyword Spotting Toolkit
Python Apache License 2.0 Updated Aug 3, 2024 -
valle-audiodec Public
Forked from dukGuo/valle-audiodec
Inference code for Audiodec-Valle-Wenetspeech4TTS
Python MIT License Updated Jun 12, 2024 -
ctc-segmentation Public
Forked from lumaku/ctc-segmentation
Segment an audio file and obtain utterance alignments. (Python package)
Python Apache License 2.0 Updated May 15, 2024 -
Bark-Voice-Cloning Public
Forked from KevinWang676/Bark-Voice-Cloning
Bark Voice Cloning and Voice Cloning for Chinese Speech
Jupyter Notebook MIT License Updated May 11, 2024 -
Wav2Vec-TTS Public
Forked from MaxMax2016/Wav2Vec-TTS
FS2+FreeVC = TTS Clone
Python Updated Mar 15, 2024 -
MeloTTS Public
Forked from myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
Python MIT License Updated Mar 13, 2024 -
polyphone Public
Forked from NewZsh/polyphone
Chinese polyphone disambiguation for Text-to-Speech application
Python Updated Mar 2, 2024 -
tts-frontend-dataset Public
Forked from Jackiexiao/tts-frontend-dataset
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
Python Apache License 2.0 Updated Feb 5, 2024 -
GPT-SoVITS Public
Forked from RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Python MIT License Updated Jan 22, 2024 -
voicebox-pytorch Public
Forked from lucidrains/voicebox-pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
Python MIT License Updated Dec 1, 2023 -
Bert-VITS2_V210 Public
Forked from v3ucn/Bert-VITS2_V210
Bert-VITS2_V210 training and inference
Python Updated Nov 29, 2023 -
vits_chinese Public
Forked from UEhQZXI/vits_chinese
vits chinese, tts chinese, tts mandarin: the easiest-to-train, best-sounding speech synthesis system ever
Python Updated Oct 31, 2023 -
VITS-fast-fine-tuning Public
Forked from Plachtaa/VITS-fast-fine-tuning
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
Python Apache License 2.0 Updated Oct 21, 2023 -
VALL-E-X Public
Forked from Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io
Python MIT License Updated Oct 13, 2023 -
MoeGoe Public
Forked from CjangCjengh/MoeGoe
Executable file for VITS inference
Python MIT License Updated Aug 22, 2023 -
AudioGPT Public
Forked from AIGC-Audio/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Python Other Updated Jul 26, 2023 -
soft-vc Public
Forked from bshall/soft-vc
Soft speech units for voice conversion
Jupyter Notebook MIT License Updated Jul 12, 2023 -
speechbrain Public
Forked from speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Python Apache License 2.0 Updated Jul 12, 2023 -
whisper Public
Forked from openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Python MIT License Updated Jul 10, 2023 -
g2pW Public
Forked from GitYCC/g2pW
Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)
Python Apache License 2.0 Updated Jul 8, 2023