superYangwenwen (yangwenwen) / Starred · GitHub

superYangwenwen

2 followers · 3 following

Stars

MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi

Python 1,469 254 Updated Mar 25, 2025

AdolfVonKleist / Phonetisaurus

Phonetisaurus G2P

Shell 473 123 Updated Jun 1, 2024

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 39,972 5,100 Updated Aug 16, 2024

espnet / espnet

End-to-End Speech Processing Toolkit

Python 9,087 2,258 Updated May 6, 2025

jishengpeng / WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,129 92 Updated Mar 2, 2025

stevenhillis / awesome-asr-contextualization

A curated list of awesome papers on contextualizing E2E ASR outputs

77 9 Updated May 10, 2023

declare-lab / speech-adapters

Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech understanding

Python 43 8 Updated Mar 12, 2023

ga642381 / Speech-Prompts-Adapters

This repository surveys papers focusing on prompting and adapters for speech processing.

108 6 Updated Aug 4, 2023

TigerResearch / TigerBot

TigerBot: A multi-language multi-task LLM

Python 2,258 191 Updated Dec 28, 2024

ymcui / Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment

Python 18,827 1,891 Updated Apr 30, 2024

km1994 / LLMsNineStoryDemonTower

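[LLMs Nine-Story Demon Tower] Shares hands-on practice and experience with LLMs in natural language processing (ChatGLM, Chinese-LLaMA-Alpaca, Vicuna, LLaMA, GPT4ALL, etc.), information retrieval (langchain), speech synthesis, speech recognition, and multimodal areas (Stable Diffusion, MiniGPT-4, VisualGLM-6B, Ziya-Visual, etc.).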

2,040 200 Updated Mar 30, 2024

lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 38,550 4,702 Updated Apr 12, 2025

mosaicml / llm-foundry

LLM training code for Databricks foundation models

Python 4,242 559 Updated May 13, 2025

FreedomIntelligence / LLMZoo

⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡

Python 2,943 198 Updated Nov 26, 2023

yuanzhoulvpi2017 / zero_nlp

Chinese NLP solutions (large models, data, models, training, inference)

Jupyter Notebook 3,439 405 Updated Feb 12, 2025

facebookresearch / av_hubert

A self-supervised learning framework for audio-visual speech

Python 902 141 Updated Dec 7, 2023

danmic / av-se

Deep-Learning-Based Audio-Visual Speech Enhancement and Separation

208 22 Updated Apr 16, 2023

SpeechColab / Leaderboard

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

Python 500 65 Updated Mar 29, 2025

syhw / wer_are_we

Attempt at tracking the state of the art and recent results (bibliography) on speech recognition.

1,862 225 Updated Jun 27, 2022

kimiyoung / transformer-xl

Python 3,646 764 Updated Sep 21, 2022

joongbo / tta

Repository for the paper "Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning"

Python 109 20 Updated Nov 9, 2020

microsoft / MPNet

MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf

Python 294 33 Updated Sep 11, 2021

zihangdai / xlnet

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Python 6,188 1,174 Updated May 28, 2023

hirofumi0810 / neural_sp

End-to-end ASR/LM implementation with PyTorch

Python 596 139 Updated Aug 30, 2021

dr-pato / audio_visual_speech_enhancement

Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments

Python 107 25 Updated Mar 19, 2024

DemisEom / SpecAugment

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

Python 648 135 Updated Apr 5, 2022

wiseman / py-webrtcvad

Python interface to the WebRTC Voice Activity Detector

C 2,231 417 Updated Jul 4, 2024

okdalto / VisualizeMNIST

This project is a real-time visualization of a network recognizing digits from the user's input.

Processing 584 67 Updated Dec 30, 2019

VIPL-Audio-Visual-Speech-Understanding / Lipreading-DenseNet3D

DenseNet3D Model In "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild", https://arxiv.org/abs/1810.06990

Python 118 21 Updated Dec 10, 2020

NirHeaven / D3D

The proposed method in LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild

Python 25 11 Updated Nov 23, 2018
