PussyCat0700 (Goodman) / Starred · GitHub

This is the official repo for the paper DiVISe: Direct Visual-Input Speech Synthesis Preserving Speaker Characteristics And Intelligibility.

Python 8 Updated Apr 29, 2025

Awesome RL Reasoning Recipes ("Triple R")

693 40 Updated Jun 16, 2025
Python 4 1 Updated Sep 21, 2024
Python 26 5 Updated Nov 7, 2023
Python 9 5 Updated Aug 3, 2021
Python 18 2 Updated Mar 2, 2024

Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models

Python 170 13 Updated Jun 18, 2025

Large Concept Models: Language modeling in a sentence representation space

Python 2,229 201 Updated Jan 29, 2025

Vector (and Scalar) Quantization, in Pytorch

Python 3,323 268 Updated Jun 16, 2025

🏆🏅 Repository for the GEB team's winning solutions in the IEEE Hybrid Energy Forecasting and Trading Competition (HEFTCom).

Python 8 Updated Jun 5, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 12,342 1,775 Updated Jun 11, 2025

A curated list of speaker-embedding, speaker-verification, and speaker-identification resources.

49 5 Updated Aug 12, 2021
Python 9 1 Updated May 27, 2024

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Python 628 77 Updated Dec 27, 2023

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,158 99 Updated Mar 2, 2025

This is a Python package for NISQA.

Python 8 2 Updated Apr 9, 2024

The MOS system combines components from DNSMOS, NISQA, MOSSSL, and SIGMOS, using the librosa library to process audio waveforms.

Jupyter Notebook 25 5 Updated Feb 16, 2024

A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based on the method proposed by Robert F. Kubichek in "Mel-Cepstra…

Jupyter Notebook 53 10 Updated May 15, 2025

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Python 942 110 Updated Aug 7, 2024

An open source implementation of CLIP.

Python 11,972 1,115 Updated Jun 10, 2025

Comparative Analysis of Deep Learning Approaches for Facial Age Estimation. Accepted to CVPR 2024

Python 58 3 Updated Oct 22, 2024
Python 726 140 Updated Aug 16, 2023

Collection of self-supervised models for speaker and language recognition tasks.

Jupyter Notebook 19 2 Updated Jan 18, 2022

Code for ACL 2021 paper "ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information"

Python 558 93 Updated Jul 26, 2023

ChineseBert for Chinese spelling correction

Python 41 2 Updated Mar 14, 2023

Official Code implementation for the ICLR paper "LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading"

Python 68 8 Updated Sep 19, 2024

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python 54,517 9,007 Updated May 30, 2025

Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)

Python 47,781 5,258 Updated Jun 18, 2025

Matrix theory homework

TeX 3 Updated Dec 21, 2023