Stars
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
AgentCPM-GUI: An on-device GUI agent for operating Android apps, enhancing reasoning ability with reinforcement fine-tuning for efficient task execution.
Examples and guides for using the Gemini API
A simple screen parsing tool towards a pure vision-based GUI agent
💖🧸 A container of souls of AI waifu / virtual characters to bring them into our worlds, wishing to achieve Neuro-sama's altitude, completely LLM and AI driven, capable of realtime voice chat, Minec…
Pure JavaScript OCR for more than 100 Languages 📖🎉🖥
In-depth tutorials on LLMs, RAGs and real-world AI agent applications.
Staging repo for development of native port of TypeScript
A lightweight tool for smoothing your mouse scrolling and setting the scroll direction independently on macOS, making your scroll wheel feel as smooth as a trackpad
Master programming by recreating your favorite technologies from scratch.
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
🎉 Bridge for iOS devices via usbmuxd. An iOS debugging tool based on usbmuxd.
A wrapper for pymobiledevice3 that makes it easier to use.
Facebook WebDriverAgent Python Client Library (not official)
tidevice can be used to communicate with iPhone devices
An invoice generator app built using Next.js, TypeScript, and Shadcn
User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)
iOS Minicap provides a socket interface for streaming realtime screen capture data out of iOS devices.
Instructions for mirroring an iOS device in a web browser
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge managemen…
📱 Display and control your Android device graphically with scrcpy.
OCR, layout analysis, reading order, table recognition in 90+ languages
📦 Repomix is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or other AI tools lik…