Release of the code, datasets, and model for our work, TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials
agent vision-language-model vision-language-action computer-use gui-agent vision-language-action-model computer-use-agent tongui
Updated Jun 16, 2025 - HTML