# Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation
2025/04/29: We released the initial version of the inference code and models. Stay tuned for continuous updates!
| Input | Neutral | Happy | Angry | Surprised |
|---|---|---|---|---|
| | 1_ne.mp4 | 1_ha.mp4 | 1_an.mp4 | 1_su.mp4 |
| | 2_ne.mp4 | 2_ha.mp4 | 2_an.mp4 | 2_su.mp4 |
For more visual demos, please visit our project page.
- It is recommended to use a GPU with `20GB` or more VRAM and an independent `Python 3.10` environment.
- Tested operating system: `Linux`
- `ffmpeg` needs to be installed.
- `PyTorch`: make sure to select the appropriate CUDA version based on your hardware, for example:

  ```bash
  pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu118
  ```

- Dependencies:

  ```bash
  pip install -r requirements.txt
  ```
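
As a quick sanity check after installation, a short script like the one below (a hypothetical helper, not shipped with the repo) can verify that PyTorch sees a CUDA GPU with enough VRAM and that `ffmpeg` is on the `PATH`:

```python
# sanity_check.py -- illustrative environment check, assuming the packages above are installed
import shutil

import torch

# Verify that PyTorch can see a CUDA device.
if not torch.cuda.is_available():
    raise SystemExit("CUDA is not available; check your PyTorch/CUDA install.")

props = torch.cuda.get_device_properties(0)
vram_gb = props.total_memory / 1024**3
print(f"GPU: {props.name}, VRAM: {vram_gb:.1f} GB")
if vram_gb < 20:
    print("Warning: less than the recommended 20GB of VRAM.")

# Verify that ffmpeg is installed and on PATH.
if shutil.which("ffmpeg") is None:
    raise SystemExit("ffmpeg not found; please install it first.")
print("ffmpeg found.")
```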
- All models are stored in `checkpoints` by default, and the file structure is as follows:

```
DICE-Talk
├── checkpoints
│   ├── DICE-Talk
│   │   ├── audio_linear.pth
│   │   ├── emo_model.pth
│   │   ├── pose_guider.pth
│   │   └── unet.pth
│   ├── stable-video-diffusion-img2vid-xt
│   │   └── ...
│   ├── whisper-tiny
│   │   └── ...
│   ├── RIFE
│   │   └── flownet.pkl
│   └── yoloface_v5m.pt
└── ...
```
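
To confirm that the downloads landed in the expected layout, a small sketch like the following (a hypothetical helper, not part of the repo) can check for the files listed above:

```python
# check_checkpoints.py -- illustrative check of the expected checkpoint layout
from pathlib import Path

REQUIRED_FILES = [
    "DICE-Talk/audio_linear.pth",
    "DICE-Talk/emo_model.pth",
    "DICE-Talk/pose_guider.pth",
    "DICE-Talk/unet.pth",
    "RIFE/flownet.pkl",
    "yoloface_v5m.pt",
]

root = Path("checkpoints")
missing = [p for p in REQUIRED_FILES if not (root / p).exists()]
# svd-xt and whisper-tiny are whole directories, so just check that they exist.
for d in ("stable-video-diffusion-img2vid-xt", "whisper-tiny"):
    if not (root / d).is_dir():
        missing.append(d + "/")

if missing:
    raise SystemExit("Missing checkpoints:\n  " + "\n  ".join(missing))
print("All expected checkpoints are in place.")
```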
Download with `huggingface-cli`:

```bash
python3 -m pip install "huggingface_hub[cli]"
huggingface-cli download EEEELY/DICE-Talk --local-dir checkpoints
huggingface-cli download stabilityai/stable-video-diffusion-img2vid-xt --local-dir checkpoints/stable-video-diffusion-img2vid-xt
huggingface-cli download openai/whisper-tiny --local-dir checkpoints/whisper-tiny
```

or manually download the pretrained model, svd-xt, and whisper-tiny to `checkpoints/`.
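
Equivalently, the same three repos can be fetched from Python with `huggingface_hub`'s `snapshot_download`; the sketch below mirrors the CLI commands above:

```python
# download_models.py -- Python equivalent of the huggingface-cli commands above
from huggingface_hub import snapshot_download

snapshot_download("EEEELY/DICE-Talk", local_dir="checkpoints")
snapshot_download("stabilityai/stable-video-diffusion-img2vid-xt",
                  local_dir="checkpoints/stable-video-diffusion-img2vid-xt")
snapshot_download("openai/whisper-tiny", local_dir="checkpoints/whisper-tiny")
```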
```bash
python3 demo.py --image_path '/path/to/input_image' --audio_path '/path/to/input_audio' \
    --emotion_path '/path/to/input_emotion' --output_path '/path/to/output_video'
```
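
To render the same portrait and audio against several emotion references in one go, a thin wrapper such as the hypothetical sketch below (not part of the repo; all paths are placeholders) simply invokes `demo.py` once per emotion:

```python
# batch_demo.py -- hypothetical wrapper calling demo.py once per emotion reference
import subprocess
from pathlib import Path

IMAGE = "/path/to/input_image"
AUDIO = "/path/to/input_audio"
EMOTIONS = ["/path/to/neutral", "/path/to/happy", "/path/to/angry"]  # placeholders

for emo in EMOTIONS:
    out = f"/path/to/output_{Path(emo).stem}.mp4"
    subprocess.run(
        ["python3", "demo.py",
         "--image_path", IMAGE,
         "--audio_path", AUDIO,
         "--emotion_path", emo,
         "--output_path", out],
        check=True,  # raise immediately if demo.py fails
    )
```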
```bash
python3 gradio_app.py
```
On the left you need to:
- Upload an image or take a photo
- Upload or record an audio clip
- Select the type of emotion to generate
- Set the strength for identity preservation and emotion generation
- Choose whether to crop the input image
On the right are the generated videos.
If you find our work helpful for your research, please consider citing it.
```bibtex
@article{tan2025dicetalk,
  title={Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation},
  author={Tan, Weipeng and Lin, Chuming and Xu, Chengming and Xu, FeiFan and Hu, Xiaobin and Ji, Xiaozhong and Zhu, Junwei and Wang, Chengjie and Fu, Yanwei},
  journal={arXiv preprint arXiv:2504.18087},
  year={2025}
}

@article{ji2024sonic,
  title={Sonic: Shifting Focus to Global Audio Perception in Portrait Animation},
  author={Ji, Xiaozhong and Hu, Xiaobin and Xu, Zhihong and Zhu, Junwei and Lin, Chuming and He, Qingdong and Zhang, Jiangning and Luo, Donghao and Chen, Yi and Lin, Qin and others},
  journal={arXiv preprint arXiv:2411.16331},
  year={2024}
}

@article{ji2024realtalk,
  title={Realtalk: Real-time and realistic audio-driven face generation with 3d facial prior-guided identity alignment network},
  author={Ji, Xiaozhong and Lin, Chuming and Ding, Zhonggan and Tai, Ying and Zhu, Junwei and Hu, Xiaobin and Luo, Donghao and Ge, Yanhao and Wang, Chengjie},
  journal={arXiv preprint arXiv:2406.18284},
  year={2024}
}
```