A Model Context Protocol (MCP) server that enables voice interactions with Claude and other LLMs. Requires only an OpenAI API key and microphone/speakers.
Runs on: Linux • macOS • Windows (WSL) | Python: 3.10+ | Tested: Ubuntu 24.04 LTS, Fedora 42
- 🎙️ Voice conversations with Claude - ask questions and hear responses
- 🔄 Multiple transports - local microphone or LiveKit room-based communication
- 🗣️ OpenAI-compatible - works with any STT/TTS service (local or cloud)
- ⚡ Real-time - low-latency voice interactions with automatic transport selection
- 🔧 MCP Integration - works seamlessly with Claude Desktop and other MCP clients
All you need to get started:
- 🔑 OpenAI API Key (or compatible service) - for speech-to-text and text-to-speech
- 🎤 Computer with microphone and speakers (see the quick check below) OR ☁️ LiveKit server (LiveKit Cloud or self-hosted)
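If you're not sure whether your system exposes a working microphone and speakers, here is a quick Linux-only check (assumes ALSA's alsa-utils is installed; on macOS and Windows, use the system sound settings instead):

```bash
# Linux-only sketch: list the audio devices visible to ALSA.
arecord -l   # capture devices (microphones)
aplay -l     # playback devices (speakers)
```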
Setup for Claude Code:
```bash
export OPENAI_API_KEY=your-openai-key
claude mcp add voice-mcp uvx voice-mcp
claude
```
Try: "Let's have a voice conversation"
Once configured, try these prompts with Claude:

- "Let's have a voice conversation"
- "Ask me about my day using voice"
- "Tell me a joke" (Claude will speak and wait for your response)
- "Say goodbye" (Claude will speak without waiting)
The new `converse` function makes voice interactions more natural - it automatically waits for your response by default.
Add to your Claude Desktop configuration file:
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
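On macOS, for example, you can create the file if it doesn't exist yet and open it for editing (a convenience sketch; any editor works):

```bash
# macOS example: ensure the config file exists, then open it in TextEdit.
mkdir -p "$HOME/Library/Application Support/Claude"
touch "$HOME/Library/Application Support/Claude/claude_desktop_config.json"
open -e "$HOME/Library/Application Support/Claude/claude_desktop_config.json"
```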
Using uvx (recommended):

```json
{
  "mcpServers": {
    "voice-mcp": {
      "command": "uvx",
      "args": ["voice-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-openai-key"
      }
    }
  }
}
```
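To confirm that uvx can resolve and launch the server outside of Claude, you can start it once by hand (an optional check; an MCP server waits silently on stdio, so no output means it started):

```bash
# Optional: launch the server directly; press Ctrl-C to exit.
uvx voice-mcp
```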
Using pip install:

```json
{
  "mcpServers": {
    "voice-mcp": {
      "command": "voice-mcp",
      "env": {
        "OPENAI_API_KEY": "your-openai-key"
      }
    }
  }
}
```
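This variant expects the `voice-mcp` command to already be on your PATH; assuming the package is published on PyPI under the same name, install it first:

```bash
pip install voice-mcp
```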
| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `converse` | Have a voice conversation - speak and optionally listen | `message`, `wait_for_response` (default: true), `listen_duration` (default: 10s), `transport` (auto/local/livekit) |
| `listen_for_speech` | Listen for speech and convert to text | `duration` (default: 5s) |
| `check_room_status` | Check LiveKit room status and participants | None |
| `check_audio_devices` | List available audio input/output devices | None |
Note: The `converse` tool is the primary interface for voice interactions, combining speaking and listening in a natural flow.
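If you want to invoke these tools by hand rather than through Claude, the MCP Inspector works with any stdio MCP server; a sketch, assuming Node.js is available (the Inspector is a separate tool, not part of voice-mcp):

```bash
# Launch the Inspector UI wrapping the voice-mcp server.
npx @modelcontextprotocol/inspector uvx voice-mcp
```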
📖 See docs/configuration.md for complete setup instructions for all MCP hosts
📁 Ready-to-use config files in config-examples/
The only required configuration is your OpenAI API key:

```bash
export OPENAI_API_KEY="your-key"
```
All other settings are optional:

```bash
# Custom STT/TTS services (OpenAI-compatible)
export STT_BASE_URL="http://localhost:2022/v1"  # Local Whisper
export TTS_BASE_URL="http://localhost:8880/v1"  # Local TTS
export TTS_VOICE="nova"                         # Voice selection

# LiveKit (for room-based communication)
# See docs/livekit/ for setup guide
export LIVEKIT_URL="wss://your-app.livekit.cloud"
export LIVEKIT_API_KEY="your-api-key"
export LIVEKIT_API_SECRET="your-api-secret"

# Debug mode
export VOICE_MCP_DEBUG="true"
```
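To verify LiveKit credentials before wiring them into voice-mcp, the LiveKit CLI (`lk`, installed separately; see the LiveKit docs) can list rooms. This is a sketch of one way to sanity-check your setup, not a voice-mcp feature:

```bash
# Should print your rooms (possibly none) rather than an auth error.
lk room list --url "$LIVEKIT_URL" --api-key "$LIVEKIT_API_KEY" --api-secret "$LIVEKIT_API_SECRET"
```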
For privacy-focused or offline usage, voice-mcp supports local speech services:
- Whisper.cpp - Local speech-to-text with OpenAI-compatible API
- Kokoro - Local text-to-speech with multiple voice options
These services provide the same API interface as OpenAI, allowing seamless switching between cloud and local processing.
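One way to confirm a local service really is OpenAI-compatible is to hit the standard speech endpoint directly; a sketch assuming a TTS server on port 8880 (the URL, model, and voice names depend on your server):

```bash
# Request a short audio clip from the local TTS server via the
# OpenAI-style /v1/audio/speech route.
curl -s http://localhost:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello from voice-mcp", "voice": "nova"}' \
  --output hello.mp3
```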
```
┌─────────────────────┐     ┌──────────────────┐     ┌─────────────────────┐
│     Claude/LLM      │     │  LiveKit Server  │     │   Voice Frontend    │
│    (MCP Client)     │◄───►│    (Optional)    │◄───►│     (Optional)      │
└─────────────────────┘     └──────────────────┘     └─────────────────────┘
           │                          │
           │                          │
           ▼                          ▼
┌─────────────────────┐     ┌──────────────────┐
│  Voice MCP Server   │     │  Audio Services  │
│  • converse         │     │  • OpenAI APIs   │
│  • listen_for_speech│◄───►│  • Local Whisper │
│  • check_room_status│     │  • Local TTS     │
│  • check_audio_devices    └──────────────────┘
└─────────────────────┘
```
- No microphone access: Check system permissions for terminal/application
- UV not found: Install with `curl -LsSf https://astral.sh/uv/install.sh | sh`
- OpenAI API error: Verify your `OPENAI_API_KEY` is set correctly (see the quick check below)
- No audio output: Check system audio settings and available devices
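A fast way to test the key is the standard OpenAI models endpoint (swap the base URL if you point voice-mcp at a compatible service):

```bash
# A valid key returns a JSON model list; an invalid one returns an error object.
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" | head -c 300
```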
Enable detailed logging and audio file saving:

```bash
export VOICE_MCP_DEBUG=true
```

Debug audio files are saved to `~/voice-mcp_recordings/`.
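To review what was captured (filenames and formats may vary by version):

```bash
# List debug recordings, newest first.
ls -t ~/voice-mcp_recordings/ | head
```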
MIT