Starred repositories
This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.
A generative speech model for daily dialogue.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Instant voice cloning by MIT and MyShell. Audio foundation model.
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
A one stop repository for generative AI research updates, interview resources, notebooks and much more!
Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>
Simple, free, and open-source web-based visual novel editor that runs in a web browser. It is written in JavaScript without using any third-party libraries and thus does not require additiona…
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
A multi-voice TTS system trained with an emphasis on quality
Code and dataset for photorealistic Codec Avatars driven from audio
R2-Talker: Realistic Real-Time Talking Head Synthesis with Hash Grid Landmarks Encoding and Progressive Multilayer Conditioning
[CVPR 2024] Official repository for "MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model"
ChatGLM3 series: Open Bilingual Chat LLMs
Awesome multilingual OCR and Document Parsing toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools,…
The most scalable and reliable MQTT broker for AI, IoT, IIoT and connected vehicles
[CVPR'24 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
There was no free Dictionary API on the web when I wanted one for my friend, so I created one.