jaehyun-ko

🎯

Focusing

Jaehyun jaehyun-ko

Ph.D. student at Sogang Graduate School, Korea. Interests: multimodal (audio-visual) speech recognition, synthesis, and enhancement

49 followers · 154 following

Seoul, Korea
https://velog.io/@jhko

Lists (13)

ASR

4 repositories

AudioEdit

1 repository

AVSR

Starred repositories

rishikksh20 / iSTFTNet-pytorch

iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

Python 254 48 Updated Mar 14, 2023

NVIDIA / elucidated-text-to-audio

Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with synthetic captions.

Python 15 2 Updated Jun 30, 2025

LAION-AI / emotion-annotations

Python 34 4 Updated Jun 28, 2025

NVlabs / DDO

[ICML 2025 Spotlight] Direct Discriminative Optimization: Supercharging Diffusion/Autoregressive with GAN-type Discrimination

Python 66 Updated Jun 22, 2025

rishikksh20 / MiniMax-TTS-pytorch

An attempt to replicate the architecture of MiniMaxTTS described in its technical report

34 Updated May 14, 2025

GraphiteEditor / Graphite

An open source graphics editor for 2025: comprehensive 2D content creation tool suite for graphic design, digital art, and interactive real-time motion graphics — featuring node-based procedural ed…

Rust 17,342 752 Updated Jul 2, 2025

google-gemini / gemini-cli

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 48,679 4,044 Updated Jul 2, 2025

ExtensityAI / symbolicai

Compositional Differentiable Programming Library

Python 1,440 67 Updated Jun 28, 2025

s3prl / s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Python 2,422 504 Updated Jun 13, 2025

stanford-oval / storm

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

Python 25,830 2,330 Updated Jun 27, 2025

ChenglongMa / zoplicate

A plugin that does one thing only: Detect and manage duplicate items in Zotero.

TypeScript 586 6 Updated Mar 24, 2025

kyegomez / VLM-Mamba

We introduce VLM-Mamba, the first Vision-Language Model built entirely on State Space Models (SSMs), specifically leveraging the Mamba architecture.

Python 5 Updated Jun 30, 2025

Yu-Fangxu / FoR

[ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples

PDDL 97 7 Updated Jun 8, 2025

OpenCut-app / OpenCut

The open-source CapCut alternative

TypeScript 6,373 458 Updated Jul 1, 2025

VectorSpaceLab / OmniGen2

OmniGen2: Exploration to Advanced Multimodal Generation.

Jupyter Notebook 2,740 213 Updated Jul 1, 2025

stevenhillis / awesome-asr-contextualization

A curated list of awesome papers on contextualizing E2E ASR outputs

78 9 Updated May 10, 2023

metame-ai / awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation

391 17 Updated Jun 23, 2025

zhouweilian1904 / Mamba-in-Mamba

Python 49 3 Updated Apr 28, 2025

k2-fsa / ZipVoice

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 209 20 Updated Jul 2, 2025

arnetheduck / nim-fftr

The fastest Fourier transform in the Rhein (so far). Pure Nim.

Nim 38 1 Updated Jan 13, 2024

SUNGBEOMCHOI / Korean-Streaming-ASR

Korean Streaming ASR (with Denoiser and Conformer CTC)

Python 25 6 Updated Apr 28, 2024

GeeeekExplorer / nano-vllm

Nano vLLM

Python 4,743 546 Updated Jun 27, 2025

lucidrains / neat

Explorations into NEAT and some of its derivative research

Python 19 Updated Jun 30, 2025

ajaybati / miipher2.0

Reimplementation of Miipher

Jupyter Notebook 22 3 Updated Aug 16, 2023

haiciyang / Genhancer

Official repo of INTERSPEECH 2024 paper Genhancer: High-Fidelity Speech Enhancement via Generative Modeling on Discrete Codec Tokens. This repo provides additional audio samples.

3 Updated Jan 7, 2025

yaoxunji / gen-se

GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling

Python 142 21 Updated Feb 28, 2025

facebook / pyrefly

A fast type checker and IDE for Python

Rust 3,157 116 Updated Jul 2, 2025

fgnt / nara_wpe

Different implementations of "Weighted Prediction Error" for speech dereverberation

Python 524 165 Updated Mar 19, 2025

apple / container

A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written in Swift, and optimized for Apple silicon.

Swift 16,282 317 Updated Jul 2, 2025

apple / containerization

Containerization is a Swift package for running Linux containers on macOS.

Swift 7,403 161 Updated Jul 2, 2025

Lists (13)

ASR

AudioEdit

AVSR

fonts

Full-Duplex

Generative Models

KWS

medi

Neural Audio Codec

SE

Services

Survey

TTS
