misaka23 / Starred · GitHub

Pseudo Streaming SenseVoice with Hotwords

Python 283 31 Updated Mar 13, 2025

TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loudness normalization operations.

Python 99 16 Updated Dec 20, 2024

Target Speaker Extraction Toolkit

Python 169 17 Updated Apr 7, 2025

Deep Neural Network for Speaker Count Estimation

Python 151 34 Updated Sep 5, 2020

Whisper based Japanese subtitle generator

Jupyter Notebook 1,678 144 Updated Feb 23, 2025

Unofficial PyTorch implementation of Google AI's VoiceFilter system

Python 1,136 229 Updated Jul 25, 2024

The implementation of "X-TF-GridNet: A Time-Frequency Domain Target Speaker Extraction Network with Adaptive Speaker Embedding Fusion", which has been accepted by Information Fusion.

Python 58 9 Updated Oct 17, 2024

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,844 224 Updated May 23, 2025

The official implementation of GTCRN, an ultra-lightweight SE model.

Python 347 59 Updated May 28, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Async Agent RL)

Python 6,885 671 Updated May 29, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 8,699 1,084 Updated May 29, 2025

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Python 179 6 Updated Mar 20, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,497 176 Updated May 29, 2025

Train transformer language models with reinforcement learning.

Python 13,964 1,922 Updated May 29, 2025

Chinese NLP solutions (large models, data, models, training, inference)

Jupyter Notebook 3,463 407 Updated Feb 12, 2025

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Python 1,584 450 Updated May 26, 2025

Simple RL training for reasoning

Python 3,594 267 Updated Apr 10, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,834 1,488 Updated Apr 24, 2025

Fully open reproduction of DeepSeek-R1

Python 24,610 2,271 Updated May 28, 2025

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time

Python 36,303 5,254 Updated Nov 15, 2024

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…

Python 2,692 440 Updated Feb 24, 2025
Python 546 49 Updated Apr 15, 2025

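百聆 (Bailing) is a GPT-4o-style voice conversation bot built from ASR + LLM + TTS. It integrates strong large models such as DeepSeek R1, has latency as low as 800 ms, runs on low-spec machines such as a Mac, and supports interruption.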

Python 1,247 217 Updated Mar 15, 2025

[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation

Python 3,812 454 Updated Feb 27, 2025

Real time interactive streaming digital human

Python 5,676 862 Updated May 18, 2025

Sky-T1: Train your own O1 preview model within $450

Python 3,256 324 Updated May 18, 2025

A C++ implementation of GPT-SoVITS

C++ 18 2 Updated Jan 6, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,582 91 Updated Mar 18, 2025

Deep Reasoning Translation via Reinforcement Learning (arXiv preprint 2025); DRT: Deep Reasoning Translation via Long Chain-of-Thought (arXiv preprint 2024)

222 9 Updated May 27, 2025