Stars
This repository contains the implementation of our AAAI 2025 paper USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation.
Code for the paper "On the Importance of Feature Decorrelation for Unsupervised Representation Learning for RL" (ICML 2023)
This is a Microsoft Word template for response letters. It will make your responses clear and formal.
A simple LaTeX template for response letters
PyTorch implementation of various Knowledge Distillation (KD) methods.
Research code for "Towards multi-task learning of speech and speaker recognition" at https://arxiv.org/pdf/2302.12773.pdf
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
Benchmark data and code for MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-SLM) to provide participants with baseline systems for speec…
This repo contains our implementation of Federated Learning with PEFT methods (e.g., Adapters) integrated with a frozen WavLM
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
Fine-Tune Whisper with Transformers and PEFT
[ICASSP 2025] Official PyTorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners".
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
This repository surveys papers focusing on Prompting and Adapters for Speech Processing.
Awesome speech/audio LLMs, representation learning, and codec models
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment