
AnchorCrafter

AnchorCrafter: Animate Cyber-Anchors Selling Your Products via Human-Object Interacting Video Generation

Abstract

The generation of anchor-style product promotion videos presents promising opportunities in e-commerce, advertising, and consumer engagement. Despite advancements in pose-guided human video generation, creating product promotion videos remains challenging. To address this challenge, we identify the integration of human-object interactions (HOI) into pose-guided human video generation as a core issue. To this end, we introduce AnchorCrafter, a novel diffusion-based system designed to generate 2D videos featuring a target human and a customized object, achieving high visual fidelity and controllable interactions. Specifically, we propose two key innovations: the HOI-appearance perception, which enhances object appearance recognition from arbitrary multi-view perspectives and disentangles object and human appearance, and the HOI-motion injection, which enables complex human-object interactions by overcoming challenges in object trajectory conditioning and inter-occlusion management. Extensive experiments show that our system improves object appearance preservation by 7.5% and doubles the object localization accuracy compared to existing state-of-the-art approaches. It also outperforms existing approaches in maintaining human motion consistency and high-quality video generation.

Pipeline

[Pipeline overview figures: Step 1 and Step 2]

News

[2025.06.17] We have open-sourced the training and inference code, along with the test dataset. The training dataset is available upon request.
[2025.04.17] We have released the Gradio demo.

Getting Started

Environment setup

conda create -n anchorcrafter python=3.11
conda activate anchorcrafter
pip install -r requirements.txt
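
After installing the requirements, a quick sanity check confirms that PyTorch can see your GPU. This is a minimal sketch (sanity_check.py is a hypothetical helper, not part of this repository), assuming PyTorch is pulled in via requirements.txt, which the SVD-based pipeline needs:

# sanity_check.py -- hypothetical helper, not shipped with this repo.
# Confirms PyTorch is importable and a CUDA device is visible.
import torch

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU 0: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")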

Checkpoints

  1. Download the DWPose models and place them at ./models/DWPose.
wget https://huggingface.co/yzd-v/DWPose/resolve/main/yolox_l.onnx?download=true -O models/DWPose/yolox_l.onnx
wget https://huggingface.co/yzd-v/DWPose/resolve/main/dw-ll_ucoco_384.onnx?download=true -O models/DWPose/dw-ll_ucoco_384.onnx
  2. Download the DINOv2-large model and place it at ./models/dinov2_large.
  3. Download the SVD model and place it at ./models/stable-video-diffusion-img2vid-xt-1-1.
  • You need to modify the "in_channels" parameter in its unet/config.json file (see the sketch after this list).
in_channels: 8 => in_channels: 12
  4. Download AnchorCrafter_1.pth and place it at ./models/. This model has been fine-tuned on the fine-tuning dataset (five test objects).
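
Editing unet/config.json by hand is easy to get wrong, so a small script can apply the change from step 3 instead. A minimal sketch (patch_unet_config.py is a hypothetical helper, not part of this repository), assuming the SVD weights sit at ./models/stable-video-diffusion-img2vid-xt-1-1 as above:

# patch_unet_config.py -- hypothetical helper, not shipped with this repo.
# Applies the in_channels change from step 3 to the SVD UNet config.
import json
from pathlib import Path

config_path = Path("models/stable-video-diffusion-img2vid-xt-1-1/unet/config.json")
config = json.loads(config_path.read_text())

# Guard against double-patching or an unexpected checkpoint layout.
assert config.get("in_channels") == 8, f"unexpected in_channels: {config.get('in_channels')}"
config["in_channels"] = 12
config_path.write_text(json.dumps(config, indent=2))
print(f"Patched {config_path}: in_channels 8 -> 12")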

Finally, all the weights should be organized under models/ as follows:

models/
├── DWPose
│   ├── dw-ll_ucoco_384.onnx
│   └── yolox_l.onnx
├── dinov2_large
│   ├── pytorch_model.bin
│   ├── config.json
│   └── preprocessor_config.json
├── stable-video-diffusion-img2vid-xt-1-1  
└── AnchorCrafter_1.pth
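
Before moving on, a short script can confirm this layout is in place. This is a sketch based only on the tree above (check_models.py is hypothetical; files inside the SVD directory are not enumerated):

# check_models.py -- hypothetical helper, not shipped with this repo.
# Verifies the checkpoint layout shown in the tree above.
from pathlib import Path

expected = [
    "models/DWPose/dw-ll_ucoco_384.onnx",
    "models/DWPose/yolox_l.onnx",
    "models/dinov2_large/pytorch_model.bin",
    "models/dinov2_large/config.json",
    "models/dinov2_large/preprocessor_config.json",
    "models/stable-video-diffusion-img2vid-xt-1-1",  # directory
    "models/AnchorCrafter_1.pth",
]

missing = [p for p in expected if not Path(p).exists()]
if missing:
    print("Missing:\n  " + "\n  ".join(missing))
else:
    print("All expected model files are in place.")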

Inference

A sample configuration for testing is provided at ./config. You can modify it to suit your needs.

sh inference.sh

Fine-tuning

We provide training scripts. Please download the fine-tuning dataset AnchorCrafter-finutune and place it at ./dataset/tune/.

dataset/tune/
├── depth_cut
├── hand_cut
├── masked_object_cut
├── people_cut
├── video_pose
└── video_cut
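
The same kind of check works for the dataset layout before launching training (a sketch assuming only the directory names above; check_dataset.py is hypothetical):

# check_dataset.py -- hypothetical helper, not shipped with this repo.
# Verifies the fine-tuning dataset subdirectories listed above exist.
from pathlib import Path

root = Path("dataset/tune")
for name in ["depth_cut", "hand_cut", "masked_object_cut",
             "people_cut", "video_pose", "video_cut"]:
    path = root / name
    print(f"{path}: {'ok' if path.is_dir() else 'MISSING'}")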

Download the non-finetuned weights and place them at ./models/. The training code can be executed as:

sh train.sh

We use DeepSpeed to enable multi-GPU training, which requires at least 5 GPUs with 40 GB of VRAM each. Some parameters in train.sh should be filled in with your own configuration.

Dataset

AnchorCrafter-test

We have released the test dataset AnchorCrafter-test, which includes five objects and eight human images, with each object featuring two different poses.

AnchorCrafter-400

We have collected AnchorCrafter-400, a foundational HOI training dataset comprising 400 videos, and made it available upon application. It is designed for academic research. To apply for access, please fill out the questionnaire.

Citation

@article{xu2024anchorcrafter,
  title={AnchorCrafter: Animate CyberAnchors Selling Your Products via Human-Object Interacting Video Generation},
  author={Xu, Ziyi and Huang, Ziyao and Cao, Juan and Zhang, Yong and Cun, Xiaodong and Shuai, Qing and Wang, Yuchen and Bao, Linchao and Li, Jintao and Tang, Fan},
  journal={arXiv preprint arXiv:2411.17383},
  year={2024}
}

Acknowledgements

Here are some great resources we benefit from: Diffusers, Stability-AI, MimicMotion, SVD_Xtend.
