Stars
No fortress, purely open ground. OpenManus is Coming.
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…
Fully open reproduction of DeepSeek-R1
[SIGGRAPH Asia 2024, Best Paper Honorable Mention] This is the official implementation of our SIGGRAPH Asia journal article: TEXGen: a Generative Diffusion Model for Mesh Textures
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation
Depth Any Video with Scalable Synthetic Data (ICLR 2025)
Official PyTorch implementation of "Expressive Whole-Body 3D Gaussian Avatar", ECCV 2024.
Code for paper "Gaussian Garments: Reconstructing Simulation-Ready Clothing with Photo-Realistic Appearance from Multi-View Video"
[SIGGRAPH Asia 2024] PuzzleAvatar: Assembling 3D Avatars from Personal Albums
Simulating the Real World: Survey & Resources, which contains our survey "Simulating the Real World: A Unified Survey of Multimodal Generative Models" and Awesome-Text2X-Resources. Watch this repos…
Vuer is a 3D visualization tool for robotics and VR applications.
[CoRL 2024] Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
This repository implements teleoperation of the Unitree humanoid robot using XR Devices.
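"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese users on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLM) and multimodal large language models (MLLM) in a Linux environment
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance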
Speech To Speech: an effort for an open-sourced and modular GPT-4o
An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.
Instant voice cloning by MIT and MyShell. Audio foundation model.
[NeurIPS 2023] Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
[SIGGRAPH Asia'24 & TOG] Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes
A Modular Framework for 3D Gaussian Splatting and Beyond
[CVPR 2024 Highlight] Official repository for paper "SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction"