
Ammonknows/ShowUI

 
 


ShowUI


🤗 Hugging Models | 📑 Paper | 🤗 Spaces Demo
🤗 Datasets | 💬 X (Twitter) | 🖥️ Computer Use | 📖 GUI Paper List

ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Kevin Qinghong Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Shiwei Wu, Zechen Bai, Weixian Lei, Lijuan Wang, Mike Zheng Shou
Show Lab @ National University of Singapore, Microsoft

🔥 Update

🖥️ Computer Use

See Computer Use OOTB for using ShowUI to control your PC.

Demo video: computer_use_with_showui-en-s.mp4

⭐ Quick Start

See Quick Start for model usage.

🤗 Local Gradio

See Gradio for installation.

BibTeX

If you find our work helpful, please consider citing our paper.

@misc{lin2024showui,
      title={ShowUI: One Vision-Language-Action Model for GUI Visual Agent}, 
      author={Kevin Qinghong Lin and Linjie Li and Difei Gao and Zhengyuan Yang and Shiwei Wu and Zechen Bai and Weixian Lei and Lijuan Wang and Mike Zheng Shou},
      year={2024},
      eprint={2411.17465},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17465}, 
}
