robin1001 (Binbin Zhang) / Starred · GitHub

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…

Python 20,625 2,160 Updated May 9, 2025

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,613 410 Updated Mar 5, 2025

Spark-TTS Inference Code

Python 9,188 957 Updated Apr 9, 2025

OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.

Python 362 24 Updated Apr 16, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,721 1,483 Updated Apr 24, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,331 154 Updated Mar 20, 2025

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…

Python 952 73 Updated Mar 27, 2025

Instruction Tuning with GPT-4

HTML 4,301 306 Updated Jun 11, 2023

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 292 37 Updated Jan 15, 2025

A Survey of Spoken Dialogue Models (60 pages)

294 16 Updated Nov 28, 2024

ncnn is a high-performance neural network inference framework optimized for the mobile platform

C++ 21,432 4,248 Updated May 9, 2025

Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.

Python 96 12 Updated Mar 20, 2025

Convert Chinese characters to pinyin (pypinyin)

Python 5,058 624 Updated Mar 30, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 46,991 7,328 Updated May 10, 2025

Target Speaker Extraction Toolkit

Python 165 16 Updated Apr 7, 2025

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 5,742 549 Updated Mar 24, 2025

Awesome speech/audio LLMs, representation learning, and codec models

983 59 Updated Apr 25, 2025

Speech, Language, Audio, Music Processing with Large Language Model

Python 802 76 Updated Apr 24, 2025

Multilingual Voice Understanding Model

Python 5,573 496 Updated Mar 23, 2025

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 13,681 1,391 Updated May 6, 2025

A generative speech model for daily dialogue.

Python 36,125 3,915 Updated May 6, 2025
Python 692 63 Updated Jun 7, 2024

Phonetisaurus G2P

Shell 473 123 Updated Jun 1, 2024

A very simple and easy to understand RISC-V core.

C 1,227 212 Updated Nov 9, 2023

Modeling, training, eval, and inference code for OLMo

Python 5,583 604 Updated May 6, 2025

One minute of voice data can be enough to train a good TTS model! (few-shot voice cloning)

Python 46,075 5,070 Updated Apr 25, 2025

Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Python 163 17 Updated Apr 10, 2024

vits2 backbone with multilingual-bert

Python 8,407 1,191 Updated May 5, 2025

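MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.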

3,846 272 Updated Apr 13, 2025

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 10,344 1,039 Updated May 8, 2025