Starred repositories
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.
BlenderLLM: An LLM specifically designed to generate CAD scripts based on user instructions. These scripts are then executed in Blender to render 3D models.
In the `TRAE` or `Cursor` editor, enter your requirements to have `AI` automatically generate `CAD` drawings, then download them for free as `DWG` files.
Make SkyReels-A2 available in ComfyUI.
[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Awesome multilingual OCR and Document Parsing toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools,…
[SIGGRAPH 2025] LAM: Large Avatar Model for One-shot Animatable Gaussian Head
Pretrained Chinese speech models
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
🔥 [ICCV 2025] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
A Docker-free offline version of HeyGem; Python and Linux are all you need!
Official implementation of "FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on"
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
AudioDVP: Photorealistic Audio-driven Video Portraits
Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis
Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait animation!
[ACM MM 2025] Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis
[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + Personali…
GUI for a Vocal Remover that uses Deep Neural Networks.
MFCC-based LipSync plug-in for Unity using Job System and Burst Compiler
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
Real time interactive streaming digital human