ScreenMonitorMCP - Revolutionary AI Vision Server

Give AI real-time sight and screen interaction capabilities

ScreenMonitorMCP is a revolutionary MCP (Model Context Protocol) server that provides Claude and other AI assistants with real-time screen monitoring, visual analysis, and intelligent interaction capabilities. This project enables AI to see, understand, and interact with your screen in ways never before possible.

Why ScreenMonitorMCP?

Transform your AI assistant from text-only to a visual powerhouse that can:

Monitor your screen in real-time and detect important changes
Click UI elements using natural language commands
Extract text from any part of your screen
Analyze screenshots and videos with AI
Provide intelligent insights about screen activity

Core Features

Smart Monitoring System

start_smart_monitoring() - Enable intelligent monitoring with configurable triggers
get_monitoring_insights() - AI-powered analysis of screen activity
get_recent_events() - History of detected screen changes
stop_smart_monitoring() - Stop monitoring with preserved insights

Natural Language UI Interaction

smart_click() - Click elements using descriptions like "Save button"
extract_text_from_screen() - OCR text extraction from screen regions
get_active_application() - Get current application context

Visual Analysis Tools

capture_and_analyze() - Screenshot capture with AI analysis
record_and_analyze() - Video recording with AI analysis
query_vision_about_current_view() - Ask AI questions about current screen

🆕 Real-time Screen Streaming

start_screen_stream() - Start real-time base64 screen streaming with performance optimizations
get_stream_frame() - Get the latest frame from an active stream
get_stream_status() - Monitor stream health, performance, and statistics
stop_screen_stream() - Stop streaming and cleanup resources
list_active_streams() - List all active streams with their status

System Performance

get_system_metrics() - Comprehensive system health dashboard
get_cache_stats() - Cache performance statistics
optimize_image() - Advanced image optimization
simulate_input() - Keyboard and mouse simulation

Quick Setup

1. Installation

Installation

Option 1: Install from PyPI (Recommended)

# Install the package
pip install screenmonitormcp

# Run the server
screenmonitormcp
# or use the short alias
smcp

Option 2: Install from Source

git clone https://github.com/inkbytefo/ScreenMonitorMCP.git
cd ScreenMonitorMCP
pip install -e .

Configuration

Create a .env file in your working directory:

# Copy the example configuration
cp .env.example .env
# Edit .env file with your OpenAI API key

Example .env configuration:

OPENAI_API_KEY=your_openai_api_key_here
OPENAI_BASE_URL=https://api.openai.com/v1
DEFAULT_OPENAI_MODEL=gpt-4-vision-preview
DEFAULT_MAX_TOKENS=1000

Claude Desktop Integration

Add to your Claude Desktop claude_desktop_config.json:

{
  "mcpServers": {
    "screenMonitorMCP": {
      "command": "screenmonitormcp",
      "args": []
    }
  }
}

Alternative with custom path:

{
  "mcpServers": {
    "screenMonitorMCP": {
      "command": "python",
      "args": [
        "-m", "screenmonitormcp.main"
      ]
    }
  }
}

Usage Examples

# Start intelligent monitoring
await start_smart_monitoring(triggers=['significant_change', 'error_detected'])

# Natural language UI interaction
await smart_click('Save button')
await smart_click('Email input field')

# Ask AI about current screen
await query_vision_about_current_view('What errors are visible on this page?')

# Extract text from screen
await extract_text_from_screen()

# 🆕 Real-time screen streaming
stream_result = await start_screen_stream(
    fps=5,
    quality=70,
    format="jpeg",
    scale=0.5,
    change_detection=True,
    adaptive_quality=True
)
stream_id = stream_result['stream_id']

# Get latest frame from stream
frame = await get_stream_frame(stream_id)
# frame['frame']['data'] contains base64 encoded image

# Monitor stream performance
status = await get_stream_status(stream_id)
print(f"FPS: {status['stream_info']['stats']['current_fps']}")

# Stop streaming
await stop_screen_stream(stream_id)

Available Tools (26 Total)

Smart Monitoring (6 tools): Real-time screen monitoring with AI analysis UI Interaction (2 tools): Natural language screen control Visual Analysis (3 tools): AI-powered image and video analysis 🆕 Real-time Streaming (5 tools): Base64 screen streaming with performance optimizations System Performance (7 tools): Performance monitoring and optimization Input Simulation (2 tools): Keyboard and mouse automation Utility (1 tool): Tool documentation and listing

Technical Features

21 Revolutionary Tools - Comprehensive AI vision capabilities
Real-time Monitoring - Adaptive FPS with smart triggers
Multi-AI Support - OpenAI, OpenRouter, and custom endpoints
Advanced OCR - Tesseract and EasyOCR integration
Cross-platform - Windows, macOS, Linux support
Smart Caching - Performance optimization
Security Focused - API key management

Vision & Mission

Vision: Enable AI assistants to see and interact with the visual world, breaking down the barrier between text-based AI and real-world interfaces.

Mission: Provide the foundational technology for AI-human visual interaction, making AI assistants truly helpful in visual tasks and screen-based workflows.

Contributing

We welcome contributions to this revolutionary project:

Bug reports and feature requests
Code contributions and improvements
Documentation enhancements

See CONTRIBUTING.md for details.

License

This project is licensed under the MIT License. See LICENSE for details.

Ready to give your AI real sight?

ScreenMonitorMCP transforms AI assistants from text-only to visually intelligent companions.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
.github/workflows		.github/workflows
screenmonitormcp		screenmonitormcp
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

ScreenMonitorMCP - Revolutionary AI Vision Server

Why ScreenMonitorMCP?

Core Features

Smart Monitoring System

Natural Language UI Interaction

Visual Analysis Tools

🆕 Real-time Screen Streaming

System Performance

Quick Setup

1. Installation

Installation

Option 1: Install from PyPI (Recommended)

Option 2: Install from Source

Configuration

Claude Desktop Integration

Usage Examples

Available Tools (26 Total)

Technical Features

Vision & Mission

Contributing

License

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

License

inkbytefo/ScreenMonitorMCP

Folders and files

Latest commit

History

Repository files navigation

ScreenMonitorMCP - Revolutionary AI Vision Server

Why ScreenMonitorMCP?

Core Features

Smart Monitoring System

Natural Language UI Interaction

Visual Analysis Tools

🆕 Real-time Screen Streaming

System Performance

Quick Setup

1. Installation

Installation

Option 1: Install from PyPI (Recommended)

Option 2: Install from Source

Configuration

Claude Desktop Integration

Usage Examples

Available Tools (26 Total)

Technical Features

Vision & Mission

Contributing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages