dongwon00kim / Starred · GitHub

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 11,776 1,661 Updated May 5, 2025

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,127 91 Updated Mar 2, 2025

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Python 603 67 Updated Aug 15, 2024

Transformer (Attention Is All You Need) implementation in PyTorch

Python 71 16 Updated Dec 2, 2022

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 13,690 1,392 Updated May 6, 2025

Korean Streaming ASR (with Denoiser and Conformer CTC)

Python 26 6 Updated Apr 28, 2024

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io

Python 67 6 Updated Sep 21, 2023

Inference code for Llama models

Python 58,207 9,761 Updated Jan 26, 2025

Helios Distribution

C# 217 37 Updated Apr 21, 2025

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/

Python 7,865 787 Updated Feb 11, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,955 2,323 Updated Mar 13, 2025

Provides an improved web interface for use with the ADS-B decoders readsb / dump1090-fa

JavaScript 1,421 260 Updated May 8, 2025

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Jupyter Notebook 1,282 227 Updated May 21, 2023

Korean Grammar Correction Model based on LLM

Jupyter Notebook 4 3 Updated Jun 7, 2023

A latent text-to-image diffusion model

Jupyter Notebook 70,583 10,425 Updated Jun 18, 2024

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Python 1,864 202 Updated Mar 26, 2025

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 15,621 1,675 Updated May 3, 2025

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Python 2,124 324 Updated Nov 14, 2023

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 8,905 1,184 Updated Apr 24, 2024

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Jupyter Notebook 969 84 Updated Nov 4, 2024

Python 99 38 Updated Mar 24, 2023

Korean text normalization and language preparation package for LM in Kaldi-based ASR system

Python 60 20 Updated Apr 23, 2020

g2p: English Grapheme To Phoneme Conversion

Python 849 129 Updated Jan 5, 2023

An unofficial PyTorch implementation of the audio LM VALL-E

Python 2,990 416 Updated May 10, 2023

Easy-to-use implementation of the ISMIR 2013 Audio Degradation Toolbox

Python 49 10 Updated Nov 19, 2019

Conformer-based Metric GAN for speech enhancement

Python 354 63 Updated May 3, 2024

Yin pitch estimator in PyTorch

Python 114 7 Updated Nov 7, 2022

Unofficial implementation of HiFi-GAN+ from the paper "Bandwidth Extension is All You Need" by Su, et al.

Python 214 26 Updated Oct 20, 2023

Original transformer paper: Implementation of Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems. 2017.

Jupyter Notebook 237 50 Updated Apr 29, 2024

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 5,746 549 Updated Mar 24, 2025