liroda

4 followers · 42 following

tts_env Public

Python Updated Mar 16, 2025
ctc_forced_aligner Public
Forked from deskpai/ctc_forced_aligner

We are open-sourcing the CTC forced aligner used in Deskpai. With a focus on production-ready model inference, it supports 18 different alignment models, including multilingual models (German, English…

Python Updated Feb 9, 2025
pyannote-audio Public
Forked from pyannote/pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook MIT License Updated Nov 21, 2024
whisper-diarization Public
Forked from MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook BSD 2-Clause "Simplified" License Updated Nov 14, 2024
wespeaker Public
Forked from wenet-e2e/wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Python Apache License 2.0 Updated Nov 14, 2024
NCSSD Public
Forked from walker-hyf/NCSSD

Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)

Python Updated Nov 1, 2024
GLM-4-Voice Public
Forked from THUDM/GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python Apache License 2.0 Updated Oct 25, 2024
awesome-diarization Public
Forked from wq2012/awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

Apache License 2.0 Updated Oct 16, 2024
tiktoken Public
Forked from openai/tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python MIT License Updated Oct 3, 2024
audio Public
Forked from huangruizhe/audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

Python BSD 2-Clause "Simplified" License Updated Sep 30, 2024
wekws Public
Forked from wenet-e2e/wekws

Production First and Production Ready End-to-End Keyword Spotting Toolkit

Python Apache License 2.0 Updated Aug 3, 2024
valle-audiodec Public
Forked from dukGuo/valle-audiodec

Inference code for Audiodec-Valle-Wenetspeech4TTS

Python MIT License Updated Jun 12, 2024
ctc-segmentation Public
Forked from lumaku/ctc-segmentation

Segment an audio file and obtain utterance alignments. (Python package)

Python Apache License 2.0 Updated May 15, 2024
Bark-Voice-Cloning Public
Forked from KevinWang676/Bark-Voice-Cloning

Bark Voice Cloning and Voice Cloning for Chinese Speech

Jupyter Notebook MIT License Updated May 11, 2024
Wav2Vec-TTS Public
Forked from MaxMax2016/Wav2Vec-TTS

FS2+FreeVC = TTS Clone

Python Updated Mar 15, 2024
MeloTTS Public
Forked from myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.

Python MIT License Updated Mar 13, 2024
polyphone Public
Forked from NewZsh/polyphone

Chinese polyphone disambiguation for Text-to-Speech application

Python Updated Mar 2, 2024
tts-frontend-dataset Public
Forked from Jackiexiao/tts-frontend-dataset

TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization

Python Apache License 2.0 Updated Feb 5, 2024
GPT-SoVITS Public
Forked from RVC-Boss/GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python MIT License Updated Jan 22, 2024
voicebox-pytorch Public
Forked from lucidrains/voicebox-pytorch

Implementation of Voicebox, new SOTA text-to-speech network from MetaAI, in PyTorch

Python MIT License Updated Dec 1, 2023
Bert-VITS2_V210 Public
Forked from v3ucn/Bert-VITS2_V210

Bert-VITS2_V210 training and inference

Python Updated Nov 29, 2023
vits_chinese Public
Forked from UEhQZXI/vits_chinese

vits chinese, tts chinese, tts mandarin: the easiest-to-train, best-sounding speech synthesis system to date

Python Updated Oct 31, 2023
VITS-fast-fine-tuning Public
Forked from Plachtaa/VITS-fast-fine-tuning

This repo is a VITS fine-tuning pipeline for fast speaker adaptation TTS and many-to-many voice conversion

Python Apache License 2.0 Updated Oct 21, 2023
VALL-E-X Public
Forked from Plachtaa/VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io

Python MIT License Updated Oct 13, 2023
MoeGoe Public
Forked from CjangCjengh/MoeGoe

Executable file for VITS inference

Python MIT License Updated Aug 22, 2023
AudioGPT Public
Forked from AIGC-Audio/AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Python Other Updated Jul 26, 2023
soft-vc Public
Forked from bshall/soft-vc

Soft speech units for voice conversion

Jupyter Notebook MIT License Updated Jul 12, 2023
speechbrain Public
Forked from speechbrain/speechbrain

A PyTorch-based Speech Toolkit

Python Apache License 2.0 Updated Jul 12, 2023
whisper Public
Forked from openai/whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python MIT License Updated Jul 10, 2023
g2pW Public
Forked from GitYCC/g2pW

Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin (Bopomofo) or Pinyin (INTERSPEECH 2022)

Python Apache License 2.0 Updated Jul 8, 2023
