Stars
MMU software driver for Klipper (ERCF, Tradrack, Box Turtle, Night Owl, Angry Beaver, 3MS, ...)
A deck tracker and deck manager for Hearthstone on Windows
Python module to parse Hearthstone Power.log files
Turning Mobile Phones into High Performance Klipper Host for 3D Printers Based on Native Linux
A series of large language models developed by Baichuan Intelligent Technology
Multilingual large voice generation model, providing full-stack inference, training, and deployment capability.
Ikaros-521 / AI-Vtuber
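Forked from sandboxdream/AI-Vtuber. AI Vtuber is a virtual streamer [Live2D/UE/xuniren] driven by [ChatterBot/ChatGPT/claude/langchain/chatglm/text-gen-webui/Wenda/Qwen/kimi/ollama] that can interact with viewers in real time during live streams on [Bilibili/Douyin/Kuaishou/WeChat Channels/Pinduoduo/Douyu/YouTube/twitch/TikTok], or chat directly on the local machine…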
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
A Python library for working with Praat, TextGrids, time-aligned audio transcripts, and audio files. It is primarily used for extracting features from and making manipulations on audio files given …
Command line utility for forced alignment using Kaldi
explosion / spacy-pkuseg
Forked from lancopku/pkuseg-python. The pkuseg toolkit for multi-domain Chinese word segmentation
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
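Qt: GUI widget usage / networking / architecture principles / understanding runtime mechanisms; framework analysis of how DTK repaints widgets; IDE tips for Visual Studio / Qt Creator; this is a tutorial article series
Deep-learning-assisted comic translation tool with one-click machine translation and simple image/text editing | Yet another computer-aided comic/manga translation tool powered by deep learning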
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
🔊 Text-Prompted Generative Audio Model
💻 🤖 A summary on our attempts at using Deep Learning approaches for Emotional Text to Speech 🔈
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
All-in-One Version: YouTube WAV Download, Vocal Separation, Audio Splitting, Training, and Inference Using Google Colab
A WebUI to create song covers with any RVC v2 trained AI voice from YouTube videos or audio files.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
ChatGLM3 series: Open Bilingual Chat LLMs