Stars
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects
Control your Android devices with AI using Model Context Protocol
0.5.3 experimental emulator written in Python.
Trigger events and automate shows in response to events on Pioneer CDJs
Exploring ways to participate in a Pioneer Pro DJ Link network
Code for our ACL 2023 Paper "Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models".
Source code for the paper "Empowering LLM to use Smartphone for Intelligent Task Automation"
A lightweight test input generator for Android. Similar to Monkey, but with more intelligence and cool features!
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message se…
5ire is a cross-platform desktop AI assistant and MCP client. It is compatible with major service providers and supports a local knowledge base and tools via Model Context Protocol servers.
Android app for accessing LibreChat Instance
The easiest way to discover and install MCPs
Deploy serverless AI workflows at scale. Firebase for AI agents
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Code for running a volumetric display made out of WS2812 LED curtains
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
The Complete Windows Web Developer Setup Guide
Desktop app powered by Claude’s computer use capability to control your computer
Browser automation system that uses AI-driven planning to navigate web pages and perform goals.
Building a comprehensive and handy list of papers for GUI agents