🤗 Hugging Face Models | 📑 Paper | 🤗 Spaces Demo | 🤗 Datasets | 💬 X (Twitter) | 🖥️ Computer Use
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Kevin Qinghong Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Shiwei Wu, Zechen Bai, Weixian Lei, Lijuan Wang, Mike Zheng Shou
Show Lab @ National University of Singapore, Microsoft
- [2024.12.9] Support int8 Quantization.
- [2024.12.5] Major Update: ShowUI is integrated into OOTB for local runs!
- [2024.12.1] We support iterative refinement to improve grounding accuracy. Try it at the HF Spaces demo.
- [2024.11.27] We release the arXiv paper, HF Spaces demo and ShowUI-desktop-8K.
- [2024.11.16] showlab/ShowUI-2B is available at Hugging Face.
Demo video: computer_use_with_showui-en-s.mp4
See Quick Start for model usage.
See Gradio for installation.
If you find our work helpful, please consider citing our paper.
@misc{lin2024showui,
title={ShowUI: One Vision-Language-Action Model for GUI Visual Agent},
author={Kevin Qinghong Lin and Linjie Li and Difei Gao and Zhengyuan Yang and Shiwei Wu and Zechen Bai and Weixian Lei and Lijuan Wang and Mike Zheng Shou},
year={2024},
eprint={2411.17465},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2411.17465},
}