fengshi-cherish / Starred · GitHub
  • Hong Kong University of Science and Technology
  • Hong Kong

Highlights

  • Pro

Jupyter Notebook 54 3 Updated Jun 20, 2025

Official Implementation for the paper: A Variational Framework for Improving Naturalness in Generative Spoken Language Models

Python 14 3 Updated Jun 18, 2025

GitHub repository for the ACL 2025 paper: Recent Advances in Speech Language Models: A Survey.

44 Updated Jun 17, 2025
Python 44 1 Updated Jun 13, 2025

Generative models for conditional audio generation

Jupyter Notebook 3 Updated Jun 18, 2025

Encode and decode audio samples to/from compressed latent representations!

Python 1 Updated Jun 18, 2025

LLM4MA: Large Language Models for Music & Audio (ISMIR 2025 Satellite Workshop)

HTML 1 Updated Jun 9, 2025

[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Python 947 28 Updated Jun 12, 2025

Text-To-Speech for NotebookLM

32 Updated Dec 21, 2024

Discogs-VI dataset and code

Python 12 Updated Dec 13, 2024

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 335 44 Updated Jun 16, 2025
Python 11 Updated Jun 9, 2025

PiCoGen (Piano Cover Generation) is an academic project aimed at developing an automatic piano cover generation system.

32 2 Updated May 31, 2025

A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline

Python 155 3 Updated Dec 13, 2024

Curated list for papers, codes and resources related to Text-to-Audio (TTA) Generation

51 1 Updated Jun 2, 2025

Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.

Python 45 4 Updated Jun 5, 2025

Towards Fine-grained Audio Captioning with Multimodal Contextual Cues

Python 67 3 Updated Jun 8, 2025

FMA: A Dataset For Music Analysis

Jupyter Notebook 2,418 452 Updated Jan 5, 2023

Collection of scripts from mHuBERT-147.

Python 27 1 Updated Nov 19, 2024
JavaScript 3 1 Updated Jun 5, 2025

Official Repository for "Music Source Restoration"

Python 25 1 Updated Jun 1, 2025

A Neural Audio Codec (NAC) for Universal Audio

Python 37 2 Updated May 30, 2025

SoTA open-source TTS

Python 8,558 899 Updated Jun 13, 2025

Modifies the training-set format and the training process on top of the original Apollo code; improves the training-set production process and the training process

Python 9 1 Updated May 30, 2025

Fork of ACE-Step for LoRA training with < 10 GB VRAM

Python 20 4 Updated Jun 13, 2025

SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline

Python 213 27 Updated Jun 12, 2025

Simple reimplementation of Flow Matching for Generative Modeling (https://arxiv.org/abs/2210.02747) paper in PyTorch

Python 12 2 Updated Aug 10, 2024