Stars
Out of time: automated lip sync in the wild
Generate ARKit expressions from audio in real time
Goliath Dataset and Official PyTorch Implementation of RelightableHands, Relightable Gaussian Codec Avatars, and Driving-Signal Aware Full-Body Avatars.
Metric depth estimation from a single image
High-resolution models for human tasks.
Official code for CVPR 2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
[CVPR'25] DepthSplat: Connecting Gaussian Splatting and Depth
Learning to Estimate Hidden Motions with Global Motion Aggregation (ICCV 2021)
PyTorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
This project implements a wrinkle detection application using YOLOv8 for segmentation. The application is built with Streamlit and allows users to upload images for wrinkle detection of human face…
GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes
DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your personalized ChatTTS voice.
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)
Real-time interactive streaming digital human