Connect Cursor AI to your local Ollama models
OllamaLink is a simple proxy that connects Cursor AI to your local Ollama models. It comes in both GUI and CLI versions, offering flexibility in how you want to manage your Ollama connections.
- Simple: Minimal setup, just run and go
- Private: Your code stays on your machine
- Flexible: Use any Ollama model with Cursor
- Tunnel: Works with Cursor's cloud service via localhost.run tunnels
- GUI & CLI: Choose between graphical or command-line interface
- Model Mapping: Map custom model names to your local Ollama models
- Real-time Monitoring: Track requests and responses in the GUI
- Secure: Optional API key support for added security
- No Timeout Limits: Long-running operations are supported without time constraints
- Install Ollama: Ensure you have Ollama installed and running:
  ollama serve
- Install Requirements: Install the necessary dependencies using pip:
  pip install -r requirements.txt
- Run OllamaLink: Choose your preferred interface:
  - GUI version: python run_gui.py
  - CLI version: python run_cli.py
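Before configuring Cursor, you can sanity-check that the proxy is reachable. A minimal sketch, assuming OllamaLink exposes the OpenAI-compatible model list at /v1/models on the default port 8080:

import json
import urllib.request

# Assumes OllamaLink is running locally on its default port (8080)
# and serves the OpenAI-compatible /v1/models route.
url = "http://localhost:8080/v1/models"

with urllib.request.urlopen(url, timeout=10) as resp:
    data = json.load(resp)

# An OpenAI-style listing has a "data" array of model objects.
for model in data.get("data", []):
    print(model.get("id"))

If this prints your mapped model names, the proxy is up and ready for Cursor.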
First, download and install Cursor from the official website: https://cursor.sh
When you start OllamaLink, you'll get two types of URLs:
- Local URL (default): http://localhost:8080/v1
- Tunnel URL: a public URL like https://randomsubdomain.localhost.run/v1
Choose the appropriate URL based on your needs:
- Use the Local URL if Cursor runs on the same machine (or if you have set up port forwarding on your router)
- Use the Tunnel URL if you want to access your models from anywhere (recommended)
- Open Cursor and access settings:
  - On macOS: press ⌘ + , (Command + Comma)
  - On Windows/Linux: press Ctrl + ,
  - Or click the gear icon in the bottom-left corner
- Navigate to the "Models" tab in settings
- Configure the following settings:
- Find "Override OpenAI Base URL" below point OpenAI API Key
- Paste your OllamaLink URL (either local or tunnel URL) and press save
- Make sure to include the
/v1
at the end of the URL - Past API Key if specified in config.json or let it empty
- Press Verify behind key input and it should automatically detect and start
- Select a Model: Ensure the mapped model names are available in Cursor.
  - If they are not, add them as custom models. For example, with the mapping "qwen2.5": "qwen2.5" in config.json, you need to add qwen2.5 as a model name in Cursor.
  - The actual Ollama model used depends on your config.json mappings.
- Test the Connection:
- Click the "Test Connection" button
- You should see a success message
- If you get an error, check the troubleshooting section below
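If Cursor's test fails, you can rule out a Cursor-side problem by calling the proxy directly. A minimal sketch, assuming the OpenAI-compatible /v1/chat/completions route and a model name that exists in your config.json mappings (gpt-4o here, per the example config further below):

import json
import urllib.request

# Assumes OllamaLink is running on http://localhost:8080 and that
# "gpt-4o" is one of the mapped model names in your config.json.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    # If your config.json specifies an API key, also add an
    # "Authorization: Bearer <key>" header here.
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req, timeout=120) as resp:
    reply = json.load(resp)

print(reply["choices"][0]["message"]["content"])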
When setting up model mappings in config.json, please note:
- You cannot use existing commercial model names like "gpt-4o" or "claude-3.5" as the model names in Cursor
- URL Error:
  - Make sure the URL ends with /v1
  - Check if OllamaLink is running
  - Try the local URL if the tunnel isn't working
- Model Not Found:
  - Ensure you've selected one of the supported model names
  - Check your model mappings in config.json
  - Verify your Ollama models with ollama list
- Connection Failed:
  - Verify Ollama is running with ollama serve
  - Check OllamaLink logs for errors
  - Try restarting both OllamaLink and Cursor
- Tunnel Issues:
  - Ensure SSH is installed on your system for localhost.run tunnels
  - Check the console logs for any tunnel connection errors
  - If you see "permission denied" errors, make sure your SSH setup is correct
  - The system checks for existing tunnels to avoid conflicts
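To confirm the SSH prerequisite quickly, here is a small standard-library sketch that checks for an ssh client and outbound access to localhost.run:

import shutil
import socket

# localhost.run tunnels are built on SSH, so the ssh client must be on PATH.
ssh_path = shutil.which("ssh")
print("ssh client:", ssh_path or "not found")

# Also check that outbound SSH (port 22) to localhost.run is not blocked.
try:
    with socket.create_connection(("localhost.run", 22), timeout=5):
        print("localhost.run:22 reachable")
except OSError as exc:
    print("localhost.run:22 not reachable:", exc)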
You can customize OllamaLink using a config.json file in the project root:
{
"openai": {
"api_key": "sk-proj-1234567890",
"endpoint": "https://api.openai.com"
},
"ollama": {
"endpoint": "http://localhost:11434",
"model_mappings": {
"gpt-4o": "qwen2",
"gpt-3.5-turbo": "llama3",
"claude-3-opus": "wizardcoder",
"default": "qwen2.5-coder"
},
"thinking_mode": true,
"skip_integrity_check": true,
"max_streaming_tokens": 32000
},
"server": {
"port": 8080,
"hostname": "127.0.0.1"
},
"tunnels": {
"use_tunnel": true,
"preferred": "localhost.run"
}
}
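To double-check the file before starting the proxy, here is a small sketch that loads config.json and prints the effective settings; the key names follow the example above:

import json
from pathlib import Path

# Reads config.json from the project root and prints the model mappings,
# mirroring the keys shown in the example above.
config = json.loads(Path("config.json").read_text())

ollama = config.get("ollama", {})
print("Ollama endpoint:", ollama.get("endpoint", "http://localhost:11434"))
print("Default model:", ollama.get("model_mappings", {}).get("default"))

for cursor_name, local_pattern in ollama.get("model_mappings", {}).items():
    if cursor_name != "default":
        print(f"  {cursor_name} -> {local_pattern}")

server = config.get("server", {})
print("Server:", server.get("hostname", "127.0.0.1"), server.get("port", 8080))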
The graphical interface provides:
- Dashboard: View server status and model mappings
- Console: Real-time server logs and events
- Requests/Responses: Monitor API traffic
- Settings: Configure server and model mappings
python run_cli.py [options]
- --port PORT: Port to run on (default from config.json or 8080)
- --direct: Direct mode without tunnel
- --ollama URL: Ollama API URL (default from config.json or http://localhost:11434)
- --host HOST: Host to bind to (default from config.json or 127.0.0.1)
- --tunnel: Use localhost.run tunnel (default: on)
- --no-tunnel: Disable the tunnel
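For example, to run the CLI version on port 9000 against the default Ollama endpoint without opening a tunnel, combining the flags above:
python run_cli.py --port 9000 --no-tunnel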
OllamaLink provides flexible model mapping that allows you to route requests for commercial models (like GPT-4 or Claude) to your locally running Ollama models.
- Direct Mapping: Each entry maps a client model name to a local model pattern. For example, when a client requests gpt-4o (mapped to "qwen2" in the example config), OllamaLink will:
  - Look for a model that exactly matches "qwen2"
  - If not found, look for a model that contains "qwen2" in its name
  - If still not found, fall back to the default model
- Default Model: The default entry specifies which model to use when no appropriate mapping is found.
- Fuzzy Matching: The router performs fuzzy matching, so "qwen2" will match models like "qwen2.5-coder:latest", "qwen2-7b-instruct", etc.
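The resolution order can be pictured with a short sketch; this is illustrative pseudologic, not the actual router code, and it reuses the mappings from the example config.json above:

# Illustrative sketch of the mapping resolution described above;
# not the actual OllamaLink router implementation.
def resolve_model(requested, mappings, installed):
    pattern = mappings.get(requested, mappings.get("default", ""))
    # 1. Exact match against installed Ollama models.
    if pattern in installed:
        return pattern
    # 2. Fuzzy match: any installed model whose name contains the pattern.
    for name in installed:
        if pattern and pattern in name:
            return name
    # 3. Fall back to the default mapping.
    return mappings.get("default")

mappings = {"gpt-4o": "qwen2", "default": "qwen2.5-coder"}
installed = ["qwen2.5-coder:latest", "llama3:8b"]
print(resolve_model("gpt-4o", mappings, installed))  # -> qwen2.5-coder:latest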
Build using py2app:
# GUI Version
python setup.py py2app
# CLI Version
python setup.py py2app --cli
Build using PyInstaller:
# GUI Version
pyinstaller --name OllamaLink-GUI --onefile --windowed --icon=icon.ico --add-data "config.json;." run_gui.py
# CLI Version
pyinstaller --name OllamaLink-CLI --onefile --console --icon=icon.ico --add-data "config.json;." run_cli.py
Note: the ";" separator in --add-data is Windows syntax; on macOS and Linux use ":" instead (for example "config.json:.").
If OllamaLink starts but shows an error connecting to Ollama:
- Check if Ollama is Running:
  ollama serve
- Verify API Access: Open your browser to http://localhost:11434/api/tags. You should see a JSON response listing your models.
- Ensure at Least One Model is Installed:
  ollama list
  If no models are shown, install one:
  ollama pull qwen2.5-coder
- Connection Issues:
  - Check your firewall settings if using a remote Ollama server
  - Verify the Ollama endpoint in config.json if you changed it from the default
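The same check can be scripted. A small sketch that queries Ollama's /api/tags endpoint and lists the installed models:

import json
import urllib.request

# Queries the local Ollama API for installed models.
# Adjust the endpoint if you changed it in config.json.
url = "http://localhost:11434/api/tags"

with urllib.request.urlopen(url, timeout=10) as resp:
    tags = json.load(resp)

models = tags.get("models", [])
if not models:
    print("No models installed; run: ollama pull qwen2.5-coder")
for m in models:
    print(m.get("name"))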
If you encounter this error in Cursor:
We encountered an issue when using your API key: No completion was generated after max retries
API Error: Unknown error
(Request ID: xxxx-xxxx-xxxx-xxxx)
Try these steps:
- Ensure Ollama Models: Make sure at least one model is loaded in Ollama:
  ollama list
  If no models are listed, run:
  ollama pull qwen2.5-coder
- Restart OllamaLink:
  python run_gui.py (or python run_cli.py)
- Update Cursor Model Selection: In Cursor, make sure you're using the mapped model names
- Switched to localhost.run for more reliable connections
- Enhanced tunnel URL detection for various connection scenarios
- Added checks for existing tunnel processes to prevent conflicts
- Improved error handling and logging for tunnel connections
- Added configurable thinking mode to control how models generate responses
- When enabled (default), models perform thorough analysis before responding
- Can be disabled with the thinking_mode: false setting for faster, more direct responses
- Automatically adds the /no_think prefix to user messages when disabled
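As an illustration of that behavior (not the actual implementation), the prefix is applied roughly like this when thinking_mode is false:

# Illustrative sketch only: shows the effect of thinking_mode on user
# messages, not OllamaLink's actual code.
def apply_thinking_mode(messages, thinking_mode):
    if thinking_mode:
        return messages
    patched = []
    for msg in messages:
        if msg.get("role") == "user" and isinstance(msg.get("content"), str):
            msg = {**msg, "content": "/no_think " + msg["content"]}
        patched.append(msg)
    return patched

msgs = [{"role": "user", "content": "Explain this function."}]
print(apply_thinking_mode(msgs, thinking_mode=False))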
- Removed all artificial timeout limitations
- Support for long-running operations without time constraints
- Fixed handling of structured message content for compatibility with Ollama's API
- Improved model listing and availability detection
MIT