Yuzhang Shang*, Mu Cai*, Bingxin Xu, Yong Jae Lee^, Yan Yan^
*Equal Contribution, ^Equal Advising
[Paper] [Project Page]
Note that the core of our proposed token reduction module is implemented here, in the CLIP image encoder.
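For orientation, the sketch below illustrates the prune-then-merge idea the paper describes: spatial tokens are adaptively selected as IQR outliers of the [CLS]-token attention distribution, and each pruned token is merged into its most similar kept token by attention-weighted averaging. This is a minimal PyTorch illustration under those assumptions, not the repo's actual code; `prumerge_sketch` and its argument names are hypothetical.

```python
import torch
import torch.nn.functional as F

def prumerge_sketch(tokens, cls_attn, keys):
    """Prune-then-merge over spatial tokens (illustrative, not the repo code).

    tokens:   (N, D) spatial token features from the CLIP image encoder
    cls_attn: (N,)   attention from the [CLS] token to each spatial token
    keys:     (N, D) key vectors used to measure token similarity
    """
    # 1) Adaptive selection: keep tokens whose [CLS] attention is an IQR
    #    outlier, so the kept count varies with image complexity.
    q1, q3 = torch.quantile(cls_attn, 0.25), torch.quantile(cls_attn, 0.75)
    keep = (cls_attn > q3 + 1.5 * (q3 - q1)).nonzero(as_tuple=True)[0]
    if keep.numel() == 0:  # degenerate case: fall back to the top token
        keep = cls_attn.topk(1).indices
    mask = torch.ones_like(cls_attn, dtype=torch.bool)
    mask[keep] = False
    drop = mask.nonzero(as_tuple=True)[0]

    # 2) Merge: assign each pruned token to its most similar kept token
    #    (cosine similarity in key space), then fold each cluster into a
    #    single token by attention-weighted averaging.
    sim = F.normalize(keys[drop], dim=-1) @ F.normalize(keys[keep], dim=-1).T
    assign = sim.argmax(dim=-1)  # (|drop|,) index into `keep`

    merged = []
    for i, idx in enumerate(keep):
        cluster = torch.cat([idx.view(1), drop[assign == i]])
        w = cls_attn[cluster] / cls_attn[cluster].sum()
        merged.append((w[:, None] * tokens[cluster]).sum(dim=0))
    return torch.stack(merged)  # (|keep|, D) reduced token sequence

# Toy usage: 576 CLIP-ViT patch tokens reduced to an adaptive subset.
reduced = prumerge_sketch(
    torch.randn(576, 1024), torch.rand(576).softmax(-1), torch.randn(576, 1024)
)
```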
Download the checkpoints (LoRA version) from Yuzhang's Hugging Face homepage and place them in checkpoints/llava-v1.5-7b-lora-prunemerge.
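If you prefer a scripted download, something like the following with `huggingface_hub` should work. The repo ID below is a placeholder (this README does not spell it out); substitute the actual one from the Hugging Face homepage linked above.

```python
from huggingface_hub import snapshot_download

# Placeholder repo ID -- replace with the actual checkpoint repo from the
# Hugging Face homepage linked above.
snapshot_download(
    repo_id="<hf-user>/<checkpoint-repo>",
    local_dir="checkpoints/llava-v1.5-7b-lora-prunemerge",
)
```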
To switch token reduction variants, change the corresponding function call here in the CLIP image encoder.
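As a hypothetical illustration of what that change looks like (the repo's actual identifiers may differ), the call site can dispatch between reduction variants; `reduce_visual_tokens` and the `mode` flag are made-up names, reusing `prumerge_sketch` from the sketch above:

```python
def reduce_visual_tokens(tokens, cls_attn, keys, mode="prumerge"):
    """Call-site dispatch between token reduction variants (illustrative)."""
    if mode == "prumerge":
        # Default: adaptive prune-then-merge, as sketched above.
        return prumerge_sketch(tokens, cls_attn, keys)
    if mode == "none":
        # Pass-through: keep all spatial tokens.
        return tokens
    raise ValueError(f"unknown token reduction mode: {mode!r}")
```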
For example, the evaluation command for TextVQA is:
CUDA_VISIBLE_DEVICES=7 XDG_CACHE_HOME='/data/shangyuzhang/' bash scripts/v1_5/eval/textvqa.sh
For other inference scripts, refer to LLaVA Evaluation.
If you find our code useful for your research, please cite our paper.
@inproceedings{shang2025prumerge,
  title={LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models},
  author={Yuzhang Shang and Mu Cai and Bingxin Xu and Yong Jae Lee and Yan Yan},
  booktitle={ICCV},
  year={2025}
}