Jintao Tong1, Wenwei Jin2, Pengda Qin2, Anqi Li3, Yixiong Zou1✉, Yuhong Li2✉, Yuhua Li1, Ruixuan Li1
1School of Computer Science and Technology, Huazhong University of Science and Technology
2Xiaohongshu Inc., 3Institute of Information Science, Beijing Jiaotong University
- [2025.05.29] 🤗 The checkpoints of llava-v1.5-7b-flowcut128 and llava-v1.5-7b-flowcut192, retaining 128 and 192 visual tokens respectively, have been released!
- [2025.05.28] 🚀 Code is available, and FlowCut can be easily installed with `pip install flowcut`!
- [2025.05.26] 📝 We release our latest work FlowCut, a plug-and-play, training-free token reduction method that seamlessly integrates into various VLMs for efficient training and inference.
TLDR: To address the inefficiency caused by excessive visual tokens in LVLMs, we take a unified, bottom-up, information-flow perspective that reveals how redundancy emerges dynamically, and we introduce FlowCut, which aligns pruning decisions with the model's inherent behavior and outperforms existing approaches.
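For intuition only, here is a toy sketch of attention-guided visual-token pruning. It is not the actual FlowCut algorithm (which derives its scores from information flow, as described in the paper); the function name and the mean-attention score are illustrative stand-ins:

```python
import torch

def prune_visual_tokens(visual_tokens, attn_to_visual, target_num):
    # Toy criterion: score each visual token by the mean attention it
    # receives from text tokens, then keep the top-scoring tokens in
    # their original order. FlowCut's real criterion is flow-based.
    scores = attn_to_visual.mean(dim=1)                # (batch, num_visual)
    keep = scores.topk(target_num, dim=-1).indices     # (batch, target_num)
    keep, _ = keep.sort(dim=-1)                        # preserve token order
    batch_idx = torch.arange(visual_tokens.size(0)).unsqueeze(-1)
    return visual_tokens[batch_idx, keep]              # (batch, target_num, dim)

# Example: prune 576 visual tokens (LLaVA-1.5's default) down to 64.
visual = torch.randn(1, 576, 4096)
attn = torch.rand(1, 32, 576)  # attention from 32 text tokens to 576 visual tokens
print(prune_visual_tokens(visual, attn, target_num=64).shape)
# torch.Size([1, 64, 4096])
```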
Our code is easy to use.
- Clone LLaVA's repository.
```bash
git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
```
- Install LLaVA's environment.
```bash
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip
pip install -e .
pip install flash-attn --no-build-isolation
```
- For general usage, you can install the package from PyPI by running the following command:
```bash
pip install flowcut
```
For development, you can install the package by cloning the repository and running the following commands:
```bash
git clone https://github.com/TungChintao/FlowCut
cd flowcut
pip install -e .
```
The file organization is as follows:
```
├── LLaVA-main
│   ├── flowcut
│   ├── llava
│   ├── playground
│   └── script
```
After installation, FlowCut can be applied with a single extra line of code:

```python
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model
from flowcut import flowcut

model_path = "liuhaotian/llava-v1.5-7b"

tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=get_model_name_from_path(model_path)
)

## FlowCut retains 64 visual tokens
model = flowcut(model, target_num=64)
```
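The pruned model is a drop-in replacement for the original, so standard LLaVA inference works unchanged. Below is a minimal sketch of a LLaVA-1.5 generation loop; the image path and question are placeholders, and the utility names assume the current llava package layout, which may differ slightly across LLaVA versions:

```python
import torch
from PIL import Image
from llava.constants import IMAGE_TOKEN_INDEX, DEFAULT_IMAGE_TOKEN
from llava.conversation import conv_templates
from llava.mm_utils import process_images, tokenizer_image_token

# Placeholder inputs.
image = Image.open("view.jpg").convert("RGB")
question = "What is shown in this image?"

# Preprocess the image and build a llava-v1 prompt containing the image token.
image_tensor = process_images([image], image_processor, model.config).to(
    model.device, dtype=torch.float16)
conv = conv_templates["llava_v1"].copy()
conv.append_message(conv.roles[0], DEFAULT_IMAGE_TOKEN + "\n" + question)
conv.append_message(conv.roles[1], None)
input_ids = tokenizer_image_token(
    conv.get_prompt(), tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt"
).unsqueeze(0).to(model.device)

# Generation runs as usual; FlowCut prunes the visual tokens internally.
with torch.inference_mode():
    output_ids = model.generate(input_ids, images=image_tensor,
                                do_sample=False, max_new_tokens=128)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0].strip())
```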
The evaluation code follows the structure of LLaVA or lmms-eval. After loading the model, simply add two lines as shown below:
```python
## Load LLaVA Model (code from llava.eval.model_vqa_loader)
tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, args.model_base, model_name)

## add FlowCut
from flowcut import flowcut
model = flowcut(model, target_num=64)
```
Script templates (please follow the detailed instructions in LLaVA-Evaluation):
```bash
bash scripts/v1_5/eval/[Benchmark].sh
```
Examples:
```bash
CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/mme.sh
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash scripts/v1_5/eval/vqav2.sh
```
The training code follows the structure of LLaVA. After loading the model, simply add two lines as shown below:
```python
## Load LLaVA Model (code from llava.train)
# ... code of loading model ...

## add FlowCut
from flowcut import flowcut
model = flowcut(model, target_num=64)

## training
trainer = LLaVATrainer(model=model,
                       tokenizer=tokenizer,
                       args=training_args,
                       **data_module)
```
- This project is released under the Apache 2.0 license.
- If you find this project useful in your research, please consider citing:
```bibtex
@article{tong2025flowcut,
  title={FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models},
  author={Tong, Jintao and Jin, Wenwei and Qin, Pengda and Li, Anqi and Zou, Yixiong and Li, Yuhong and Li, Yuhua and Li, Ruixuan},
  journal={arXiv preprint arXiv:2505.19536},
  year={2025}
}
```