A summary of 3D human pose estimation
- Recovering 3D Human Mesh from Monocular Images: A Survey
Yating Tian, Hongwen Zhang, Yebin Liu, Limin Wang
-
[SCAPE] SCAPE: Shape Completion and Animation of People. D. Anguelov, P. Srinivasan, D. Koller, S. Thrun, J. Rodgers, and J. Davis. ACM Trans. Graphics, 2005
One Sentence Summary: The first body model to disentangle the human body into rigid pose transformations, identity-related shape, and pose-related shape. -
[SMPL] SMPL: A Skinned Multi-Person Linear Model. Loper, Matthew and Mahmood, Naureen and Romero, Javier and Pons-Moll, Gerard and Black, Michael J. ACM Trans. Graphics, 2015
One Sentence Summary: The most widely used body model; because it is skinned (with bones), it is easy to use in rendering engines for animation. -
[SMPL-X] Expressive Body Capture: 3D Hands, Face, and Body from a Single Image Pavlakos, Georgios and Choutas, Vasileios and Ghorbani, Nima and Bolkart, Timo and Osman, Ahmed A. A. and Tzionas, Dimitrios and Black, Michael J. CVPR 2019
One Sentence Summary: SMPL + MANO (hand model) + FLAME (head model) -
[STAR] STAR: A Sparse Trained Articulated Human Body Regressor Osman, Ahmed A A and Bolkart, Timo and Black, Michael J. ECCV 2020
One Sentence Summary: Decomposing the pose-related blend shapes of SMPL into per-joint pose-related blend shapes. -
[DeepDaz] UltraPose: Synthesizing Dense Pose with 1 Billion Points by Human-body Decoupling 3D Model Haonan Yan, Jiaqi Chen, Xujie Zhang, Shengkai Zhang, Nianhong Jiao, Xiaodan Liang, Tianxiang Zheng. ICCV 2021
One Sentence Summary: A human body model whose parameters each have a specific physical meaning and are decoupled from one another (based on the Daz model). -
[GHUM] GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models Hongyi Xu, Eduard Gabriel Bazavan, Andrei Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu. CVPR 2020
One Sentence Summary: A human body model with non-linear (VAE-based) embedding spaces for identity-related shape and facial expression.
-
[SMPLify] Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image. Bogo, Federica and Kanazawa, Angjoo and Lassner, Christoph and Gehler, Peter and Romero, Javier and Black, Michael J. ECCV 2016
One Sentence Summary: An optimization-based method that uses the keypoint reprojection loss together with several regularization terms. -
[HMR] End-to-end Recovery of Human Shape and Pose. Angjoo Kanazawa, Michael J. Black, David W. Jacobs, Jitendra Malik. CVPR 2018
One Sentence Summary: Human mesh recovery using the keypoint reprojection loss and adversarial training to avoid implausible poses. -
[] Learning to Estimate 3D Human Pose and Shape from a Single Color Image. Georgios Pavlakos, Luyang Zhu, Xiaowei Zhou, Kostas Daniilidis CVPR 2018
One Sentence Summary: Training the network with keypoint heatmaps and masks as supervision for SMPL parameter regression. -
[SPIN] Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop. Kolotouros, Nikos and Pavlakos, Georgios and Black, Michael J and Daniilidis, Kostas. ICCV 2019
One Sentence Summary: HMR + SMPLify (HMR initializes the body model parameters and SMPLify refines them; the refined parameters are then used as supervision for the network). -
[] On the Continuity of Rotation Representations in Neural Networks. Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, and Hao Li. CVPR 2019
One Sentence Summary: New continuous representations for joint rotations. -
[GraphCMR] Convolutional Mesh Regression for Single-Image Human Shape Reconstruction. Nikos Kolotouros, Georgios Pavlakos, Kostas Daniilidis. CVPR 2019
One Sentence Summary: Directly regressing the human body mesh with graph convolutions, then regressing SMPL parameters from the predicted mesh. -
[] Delving Deep into Hybrid Annotations for 3D Human Recovery in the Wild. Yu Rong, Ziwei Liu, Cheng Li, Kaidi Cao, Chen Change Loy. ICCV 2019
One Sentence Summary: A comprehensive study on the cost and effectiveness of different annotations for in-the-wild images. (Dense correspondence is effective.) -
[HoloPose] HoloPose: Holistic 3D Human Reconstruction In-The-Wild. Rıza Alp Güler and Iasonas Kokkinos. CVPR 2019
One Sentence Summary: Regressing the body model parameters from body-part features, with DensePose and keypoint reprojection losses. -
[DecoMR] 3D Human Mesh Regression with Dense Correspondence. Wang Zeng, Wanli Ouyang, Ping Luo, Wentao Liu, and Xiaogang Wang. CVPR 2020
One Sentence Summary: Recovering human mesh using the aligned features in UV space. -
[HKMR] Hierarchical Kinematic Human Mesh Recovery. Georgios Georgakis, Ren Li, Srikrishna Karanam, Terrence Chen, Jana Košecká, and Ziyan Wu. ECCV 2020
One Sentence Summary: Optimizing the SMPL body pose parameters separately for different parts of the body. -
[PARE] PARE: Part Attention Regressor for 3D Human Body Estimation. Muhammed Kocabas, Chun-Hao P. Huang, Otmar Hilliges, and Michael J. Black. ICCV 2021
One Sentence Summary: Handling the occlusion problem using part attention. (Attention maps are initialized from segmentation masks and trained jointly with the 3D branch.) -
[DSR] Learning to Regress Bodies from Images using Differentiable Semantic Rendering. Sai Kumar Dwivedi, Nikos Athanasiou, Muhammed Kocabas, Michael J. Black. ICCV 2021
One Sentence Summary: Using differentiable rendering to supervise the training of HMR with the semantic prior of clothes (calculated from AGORA). -
[Skeleton2Mesh] Skeleton2Mesh: Kinematics Prior Injected Unsupervised Human Mesh Recovery. Zhenbo Yu, Junjie Wang, Jingwei Xu, Bingbing Ni, Chenglong Zhao, Minsi Wang, Wenjun Zhang. ICCV 2021
One Sentence Summary: 3D human pose estimation using differentiable IK. -
[PyMAF] PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop. Zhang, Hongwen and Tian, Yating and Zhou, Xinchi and Ouyang, Wanli and Liu, Yebin and Wang, Limin and Sun, Zhenan. ICCV 2021
One Sentence Summary: HMR (body model parameter regression network) using mesh-aligned multi-scale features and DensePose supervision. -
[METRO] End-to-End Human Pose and Mesh Reconstruction with Transformers. Kevin Lin, Lijuan Wang, Zicheng Liu. CVPR 2021
One Sentence Summary: 3D human pose estimation using a transformer. (3D joint/vertex locations are used as position embeddings.)
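The continuous rotation representation from "On the Continuity of Rotation Representations in Neural Networks" (listed above) is often used in its 6D form: the network outputs two 3D vectors, and a Gram-Schmidt step turns them into a valid rotation matrix. The following is a minimal pure-Python sketch of that mapping, assuming the first three numbers are the first vector and the last three are the second; the function and helper names are illustrative and do not come from any particular library.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    n = math.sqrt(_dot(v, v))
    if n < 1e-12:
        raise ValueError("cannot normalize a near-zero vector")
    return [x / n for x in v]


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rot6d_to_matrix(d6):
    """Map a 6D vector (a1, a2) to a 3x3 rotation matrix via Gram-Schmidt.

    Hypothetical helper: returns the matrix as a list of rows whose
    columns are the orthonormal vectors b1, b2, b3.
    """
    a1, a2 = list(d6[:3]), list(d6[3:6])
    b1 = _normalize(a1)                       # first column: normalized a1
    proj = _dot(b1, a2)                       # component of a2 along b1
    b2 = _normalize([y - proj * x for x, y in zip(b1, a2)])
    b3 = _cross(b1, b2)                       # third column completes a right-handed frame
    return [[b1[i], b2[i], b3[i]] for i in range(3)]
```

Any input whose two halves are not parallel (and not zero) yields an orthonormal, right-handed matrix, which is why this output layer can be regressed without an explicit orthogonality constraint.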
-
[HMMR] Learning 3D Human Dynamics from Video. Angjoo Kanazawa, Jason Y. Zhang, Panna Felsen, Jitendra Malik CVPR 2019
One Sentence Summary: A temporal encoder with sliding windows and a hallucinator for the current time step to predict the pose of current and adjacent frames. -
[] Exploiting temporal context for 3D human pose estimation in the wild. Anurag Arnab, Carl Doersch, and Andrew Zisserman CVPR 2019
One Sentence Summary: Using bundle adjustment to enforce temporal smoothness for pose estimation from video. -
[] Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation. Yu Sun, Yun Ye, Wu Liu, Wenpeng Gao, YiLi Fu, and Tao Mei ICCV 2019
One Sentence Summary: Extracting the skeleton features and the remaining detail features separately, then using self-attention, temporal shuffling, and adversarial training on these features to train the temporal human pose estimation model. -
[] Occlusion-aware networks for 3d human pose estimation in video. Yu Cheng, Bo Yang, Bo Wang, Wending Yan, and Robby T. Tan. ICCV 2019
One Sentence Summary: 3D human pose estimation using 2D keypoint confidence heatmaps and optical flow. -
[VIBE] VIBE: Video Inference for Human Body Pose and Shape Estimation. Muhammed Kocabas, Nikos Athanasiou, Michael J. Black. CVPR 2020
One Sentence Summary: Temporal HMR (CNN + GRU for parameter regression, plus adversarial training to avoid implausible motion). -
[iMoCap] Motion Capture from Internet Videos. Junting Dong, Qing Shuai, Yuanqing Zhang, Xian Liu, Xiaowei Zhou, Hujun Bao. ECCV 2020
One Sentence Summary: Using multi-view videos of the same celebrity performing a specific action to reconstruct the 3D human mesh with an optimization-based method. -
[TCMR] Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video. Choi, Hongsuk and Moon, Gyeongsik and Chang, Ju Yong and Lee, Kyoung Mu. CVPR 2021
One Sentence Summary: VIBE + supervising the future and past SMPL body parameters with the overall predicted body model parameters. -
[] Uncertainty-aware human mesh recovery from video by learning part-based 3D dynamics. Gun-Hee Lee, Seong-Whan Lee. ICCV 2021
One Sentence Summary: Using part-based features with an uncertainty-aware mechanism, together with optical flow features, for body model parameter prediction. -
[SmoothNet] SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos. Zeng, Ailing and Yang, Lei and Ju, Xuan and Li, Jiefeng and Wang, Jianyi and Xu, Qiang. ECCV 2022
One Sentence Summary: A plug-in module that reduces temporal jitter. -
[GLAMR] GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras. Ye Yuan, Umar Iqbal, Pavlo Molchanov, Kris Kitani, Jan Kautz. CVPR 2022
One Sentence Summary: Recovering human meshes in a global coordinate system from videos with occlusions and dynamic cameras. -
[] Human Mesh Recovery from Multiple Shots. Georgios Pavlakos, Jitendra Malik, Angjoo Kanazawa. CVPR 2022
One Sentence Summary: Using SPIN with a smoothness term in the canonical frame to obtain pseudo ground-truth poses, then using these poses to train a temporal HMR with a transformer.
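Several of the video entries above add a temporal smoothness term on top of per-frame estimates. As a rough illustration only, the sketch below computes one generic form of such a term: a squared-acceleration (second finite difference) penalty over a sequence of per-frame parameter vectors. It is not the exact loss of any listed paper, and the function name and the flat-list input format are assumptions made for this example.

```python
def acceleration_penalty(frames):
    """Sum of squared second finite differences over a sequence.

    `frames` is a list of equal-length lists of floats, one per time step
    (for example flattened 3D joint positions). Hypothetical helper.
    """
    total = 0.0
    # Each interior frame t contributes ||x[t+1] - 2*x[t] + x[t-1]||^2.
    for t in range(1, len(frames) - 1):
        prev, cur, nxt = frames[t - 1], frames[t], frames[t + 1]
        for p, c, n in zip(prev, cur, nxt):
            acc = n - 2.0 * c + p
            total += acc * acc
    return total
```

Motion with constant velocity has zero penalty, while a value that jumps up and back down between consecutive frames is penalized, which is the behavior a jitter-reducing term is meant to have.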
-
[] Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes - The Importance of Multiple Scene Constraints. Zanfir A, Marinoiu E, Sminchisescu C. CVPR 2018
One Sentence Summary: Optimizing the poses of multiple people under a ground-plane assumption, occupancy avoidance, and temporal smoothness. -
[HMOR] HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation. Can Wang, Jiefeng Li, Wentao Liu, Chen Qian, and Cewu Lu. ECCV 2020
One Sentence Summary: Improving multi-person 3D pose estimation with multi-person interaction relations: instance-level and joint-level depth relations and body-part angle relations. -
[] Coherent Reconstruction of Multiple Humans from a Single Image. Wen Jiang, Nikos Kolotouros, Georgios Pavlakos, Xiaowei Zhou, and Kostas Daniilidis. CVPR 2020
One Sentence Summary: Training the network to estimate the SMPL parameters of all people in the image with an interpenetration loss (based on SDF) and a depth-aware loss (based on instance segmentation). -
[BMP] Body Meshes as Points. Zhang, Jianfeng and Yu, Dongdong and Liew, Jun Hao and Nie, Xuecheng and Feng, Jiashi CVPR 2021
One Sentence Summary: A one-stage model that estimates the 3D bodies of multiple people using an idea similar to one-stage object detection. -
[ROMP] Monocular, One-stage, Regression of Multiple 3D People. Sun, Yu and Bao, Qian and Liu, Wu and Fu, Yili and Black, Michael J. and Mei, Tao. ICCV 2021
One Sentence Summary: A one-stage model that estimates the 3D bodies of multiple people using an idea similar to CenterNet. -
[BEV] Putting People in their Place: Monocular Regression of 3D People in Depth. Sun, Yu and Liu, Wu and Bao, Qian and Fu, Yili and Mei, Tao and Black, Michael J. CVPR 2022
One Sentence Summary: An age-aware SMPL model (adding age-related offsets) + estimating bird's-eye-view features to help refine the depth relationships among multiple subjects.
-
[] Resolving 3D Human Pose Ambiguities with 3D Scene Constraints. Mohamed Hassan, Vasileios Choutas, Dimitrios Tzionas and Michael J. Black ICCV 2019
One Sentence Summary: Improving the recovery of 3D humans in a given 3D scene by considering the interaction between the human and the scene layout. -
[] The One Where They Reconstructed 3D Humans and Environments in TV Shows. Georgios Pavlakos and Ethan Weber and Matthew Tancik and Angjoo Kanazawa. ECCV 2022
One Sentence Summary: Improving the recovery of 3D humans in TV shows by reconstructing the environment and estimating the camera and body scale information.
-
[BodyNet] BodyNet: Volumetric Inference of 3D Human Body Shapes. Gül Varol, Duygu Ceylan, Bryan Russell, Jimei Yang, Ersin Yumer, Ivan Laptev and Cordelia Schmid. ECCV 2018
One Sentence Summary: Volumetric inference supervised by 2D and 3D keypoints, segmentations, and a voxelized SMPL model. -
[I2L-MeshNet] I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image. Gyeongsik Moon and Kyoung Mu Lee. ECCV 2020
One Sentence Summary: Regressing 2D and 3D keypoints first, and then regressing the mesh directly.
-
[HMAR] Tracking People with 3D Representations. Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, and Jitendra Malik NeurIPS 2021
One Sentence Summary: HMR + texture recovery (using appearance flow) for human tracking. -
[PHALP] Tracking People by Predicting 3D Appearance, Location and Pose. Rajasegaran, Jathushan and Pavlakos, Georgios and Kanazawa, Angjoo and Malik, Jitendra CVPR 2022
One Sentence Summary: Predicting and matching HMAR estimation for tracking in videos.
- [PHD] Predicting 3D Human Dynamics from Video. Jason Y. Zhang, Panna Felsen, Angjoo Kanazawa, Jitendra Malik ICCV 2019
One Sentence Summary: Extending "Learning 3D Human Dynamics from Video" (HMMR) to 3D motion prediction.
Michael Black (Max Planck Institute for Intelligent Systems)
Yebin Liu (Tsinghua University)
Kyoung Mu Lee (Seoul National University)
Yaser Sheikh (Carnegie Mellon University, Facebook Reality Labs)
Angjoo Kanazawa (University of California, Berkeley)
Kostas Daniilidis (University of Pennsylvania)
Xiaowei Zhou (Zhejiang University)
Siyu Tang (ETH Zürich)