zds-potato

4 followers · 88 following

Lists (4)

Automatic Speaker Verification

3 repositories

Automatic Speech Recognition

Keyword Spotting

1 repository

Speech Toolkit

Stars

wengwanjiang / USDRL

This repository contains the implementation of our AAAI 2025 paper, USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation.

Python 22 1 Updated Apr 13, 2025

PatrickHua / FeatureDecorrelationSSL

Python 17 1 Updated Mar 19, 2022

dojeon-ai / SimTPR

Code for the paper "On the Importance of Feature Decorrelation for Unsupervised Representation Learning for RL" (ICML 2023)

Python 12 2 Updated Jun 13, 2023

sunjunee / A-word-templet-for-response-letter

This is a Microsoft Word template for response letters. It will make your responses clear and formal.

6 1 Updated Mar 31, 2018

h-hg / latex-response-template

A simple LaTeX template for response letters

TeX 21 6 Updated May 27, 2024

BUPTLdy / A_LaTeX_Template_For_Response_Letter

TeX 28 10 Updated Nov 17, 2017

TTN-YKK / Clustering_friendly_representation_learning

Python 58 9 Updated Apr 4, 2021

AberHu / Knowledge-Distillation-Zoo

PyTorch implementation of various Knowledge Distillation (KD) methods.

Python 1,707 269 Updated Nov 25, 2021

youzhitu / confusionformer

Python 2 Updated Jun 30, 2025

nikvaessen / disjoint-mtl

Research code for "Towards multi-task learning of speech and speaker recognition" at https://arxiv.org/pdf/2302.12773.pdf

Python 12 Updated Dec 2, 2024

tiantiaf0627 / vox-profile-release

Vox-Profile Benchmark

Python 31 7 Updated Jun 11, 2025

lifeiteng / naturalspeech3_facodec

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 206 17 Updated Apr 20, 2024

vTAD2025-Challenge / vTAD

Python 11 4 Updated May 15, 2025

ddlBoJack / MMAR

Benchmark data and code for MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

Python 141 4 Updated Jun 6, 2025

ixxan / ug-speech

Jupyter Notebook 10 2 Updated Dec 29, 2024

ZXHY-82 / SSRL

License: CC BY-NC-SA 4.0

Python 2 2 Updated Apr 28, 2025

huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Python 3,900 333 Updated Jan 8, 2025

MoonshotAI / Kimi-Audio

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 3,903 261 Updated Jun 21, 2025

mubingshen / MLC-SLM-Baseline

The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-SLM) to provide participants with baseline systems for speec…

Python 39 5 Updated May 14, 2025

lcpmgh / colors

Color scheme recommender for academic journals

R 428 29 Updated Jan 27, 2025

mnabihali / FL-WavLM-with-Adapters

This repo contains our implementation of Federated Learning with PEFT methods (e.g., Adapters) integrated with a frozen WavLM

Python 3 1 Updated Apr 30, 2025

AudioLLMs / Awesome-Audio-LLM

Audio Large Language Models

Python 592 33 Updated Jul 1, 2025

NVlabs / DoRA

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

Python 808 59 Updated Oct 1, 2024

fengredrum / finetune-whisper-lora

Fine-Tune Whisper with Transformers and PEFT

Python 57 Updated Nov 4, 2023

umbertocappellazzo / Llama-AVSR

[ICASSP 2025] Official PyTorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners".

Python 26 2 Updated Jun 26, 2025

rithiksachdev / PostASR-Correction-SLT2024

Python 14 Updated Jul 22, 2024

microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 12,179 775 Updated Dec 17, 2024

ga642381 / Speech-Prompts-Adapters

This repository surveys papers focusing on prompting and adapters for speech processing.

110 6 Updated Aug 4, 2023

ga642381 / speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

1,058 63 Updated Jun 27, 2025

fgnt / speaker_reassignment

Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment

Python 12 1 Updated Feb 5, 2025