A realtime voice assistant leveraging WebRTC for low-latency communication with OpenAI's Realtime API. The system features:
- 🔁 Bidirectional audio streaming using WebRTC media channels
- 💬 Real-time transcript display with chat-style bubbles
- 🔐 Ephemeral key authentication for secure sessions (connection flow sketched below)
- 🎚️ Interactive UI controls for audio/video management
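As a rough illustration of the flow behind these features, the sketch below assumes an ephemeral key has already been minted (see the server endpoint later in this README) and performs the SDP offer/answer exchange with the Realtime API from Python using aiortc. The endpoint, model name, and audio device are assumptions based on OpenAI's published WebRTC guide, not this repo's exact code.

```python
# Sketch: establish a WebRTC session with the Realtime API using an ephemeral key.
# Model name, endpoint, and audio device are illustrative assumptions.
import requests
from aiortc import RTCPeerConnection, RTCSessionDescription
from aiortc.contrib.media import MediaPlayer

REALTIME_URL = "https://api.openai.com/v1/realtime"  # per OpenAI's WebRTC guide
MODEL = "gpt-4o-realtime-preview"                     # assumed model name


async def connect(ephemeral_key: str) -> RTCPeerConnection:
    pc = RTCPeerConnection()

    # Send microphone audio to the model (device name/format are platform-specific).
    mic = MediaPlayer("default", format="pulse")
    pc.addTrack(mic.audio)

    # Data channel carries JSON events (transcripts, session updates, etc.).
    pc.createDataChannel("oai-events")

    # Standard SDP offer/answer exchange over HTTPS, authorized with the ephemeral key.
    offer = await pc.createOffer()
    await pc.setLocalDescription(offer)
    resp = requests.post(
        f"{REALTIME_URL}?model={MODEL}",
        data=pc.localDescription.sdp,
        headers={
            "Authorization": f"Bearer {ephemeral_key}",
            "Content-Type": "application/sdp",
        },
        timeout=10,
    )
    resp.raise_for_status()
    await pc.setRemoteDescription(RTCSessionDescription(sdp=resp.text, type="answer"))
    return pc
```

Playback of the model's remote audio track and the `asyncio` entry point are omitted for brevity; the browser client performs the same exchange in JavaScript.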
## AI-Assisted Development
This entire repository was generated through iterative prompting workflows using Cursor with OpenAI's o3-mini model in an initial 4-hour session (including this README.md!). The implementation demonstrates practical application of AI pair-programming for complex real-time systems development. See AI-Assisted Development Flow, where the green-outlined prompts indicate the refactoring from WebSockets to WebRTC for the current browser implementation.
A multi-modal client implementation for OpenAI's Realtime API with voice/text interactions via WebRTC and WebSockets.
## Core Capabilities
- 🎙️ Real-time voice conversations with GPT-4o models
- 📡 Dual protocol support (WebRTC & WebSockets)
- ⚡ Low-latency audio processing (16-bit PCM)
- 🔄 Bi-directional event handling
- 🔒 Ephemeral key rotation
- 🎯 Voice Activity Detection (VAD) with configurable thresholds (example `session.update` payload below)
- 🔄 Session lifecycle management (create/update/terminate)
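The VAD thresholds and session lifecycle items above are driven by `session.update` events. Below is a minimal sketch of such a payload; the field names follow OpenAI's Realtime API event reference, and the values are purely illustrative.

```python
import json

# Illustrative session.update payload: tunes server-side VAD and declares modalities.
# Field names follow OpenAI's Realtime API event reference; values are examples only.
session_update = {
    "type": "session.update",
    "session": {
        "modalities": ["audio", "text"],
        "voice": "alloy",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm16",
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,            # speech-probability cutoff
            "prefix_padding_ms": 300,    # audio kept before detected speech
            "silence_duration_ms": 500,  # silence that ends a turn
        },
    },
}

# Sent as JSON over the WebRTC data channel or the WebSocket connection.
payload = json.dumps(session_update)
```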
## Modality Support
- Real-time audio transcriptions
- Text generation with delta updates
- Concurrent multi-modal interactions
- Custom conversation context management
- Speech recognition integration
- Function calling support (see the event dispatch sketch below)
- Audio input/output device management
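As a rough sketch of how the delta updates and function-calling events above might be consumed, the dispatcher below handles a few common event types. The event names follow OpenAI's Realtime API reference, but the handler itself is illustrative and not the repo's actual `client.py` logic.

```python
import json


def handle_event(raw: str) -> None:
    """Dispatch one Realtime API event received over the data channel or WebSocket."""
    event = json.loads(raw)
    etype = event.get("type", "")

    if etype == "response.text.delta":
        # Incremental text output; append to the current chat bubble.
        print(event["delta"], end="", flush=True)
    elif etype == "response.audio_transcript.delta":
        # Incremental transcript of the model's spoken audio.
        print(event["delta"], end="", flush=True)
    elif etype == "response.function_call_arguments.done":
        # Function calling: arguments arrive as a JSON string once the call completes.
        args = json.loads(event["arguments"])
        print(f"\n[function call] {event.get('call_id')} {args}")
    elif etype == "error":
        print(f"\n[error] {event.get('error')}")
```

In practice this would be registered as the data channel's message callback or run inside the WebSocket receive loop.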
## Prerequisites

- Python 3.11+
- Poetry
- OpenAI API key
- PortAudio development files (Ubuntu/Debian: `sudo apt install portaudio19-dev python3-dev`)
## Quick Start

```bash
# Clone repository
git clone https://github.com/dswinscoe/realtimeAI.git
cd realtimeAI/realtime_client

# Install project dependencies
poetry install

# Configure environment
cp .env.example .env
nano .env  # Add your OpenAI API key

# Start FastAPI server (development mode)
poetry run uvicorn app.server:app --reload --port 9090

# Run Python client
poetry run python app/client.py

# Access web client at: http://localhost:9090
```
### VS Code: Launch Realtime Server

For VS Code / Cursor users, a launch configuration is included. Open the project in VS Code, go to the Debug view, and select the 'Launch Realtime Server' configuration to start the server.
| Component | Description |
|---|---|
| `/app/server.py` | FastAPI endpoint for ephemeral keys |
| `/app/client.py` | Python WebRTC implementation |
| `/static/client.js` | Browser WebRTC client with speech recognition |
| `pyproject.toml` | Dependency configuration |
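For orientation, here is a rough sketch of what the ephemeral-key endpoint listed above (`/app/server.py`) could look like. The route name, model, voice, and the use of `httpx` are assumptions; the `/v1/realtime/sessions` call follows OpenAI's published docs.

```python
# Sketch of an ephemeral-key endpoint along the lines of /app/server.py.
# Route name, model, voice, and HTTP client choice are illustrative assumptions.
import os

import httpx
from fastapi import FastAPI, HTTPException

app = FastAPI()


@app.get("/session")
async def create_ephemeral_session() -> dict:
    """Mint a short-lived Realtime session; clients use client_secret.value."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.openai.com/v1/realtime/sessions",
            headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
            json={"model": "gpt-4o-realtime-preview", "voice": "verse"},
        )
    if resp.status_code != 200:
        raise HTTPException(resp.status_code, resp.text)
    # The response includes client_secret.value: the ephemeral key for the client.
    return resp.json()
```

The permanent API key stays on the server; only the short-lived `client_secret` is ever exposed to the browser or Python client.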
```bash
# Run tests
poetry run pytest

# Format code
poetry run black .

# Lint checks
poetry run flake8
```
Note: The `/realtime_client/docs` directory contains unmodified markdown files from OpenAI's public documentation, which were used as context for the project's GenAI-assisted implementation. These docs provided essential insights into the API's capabilities and were the foundation for the AI-assisted development workflow with Cursor and the o3-mini model described above.
MIT Licensed. See LICENSE for details.
## Important Usage Warning
OpenAI's Realtime API has a complex pricing structure that combines text and audio token costs. Developers should carefully monitor usage due to potentially high expenses:
### Pricing Overview (GPT-4o)
- 🎙️ Audio Input: $100/million tokens (~$0.06/min)
- 🔊 Audio Output: $200/million tokens (~$0.24/min)
- 📝 Text Input: $5/million tokens
- 📄 Text Output: $20/million tokens
### Real-World Cost Examples
- 5-minute voice conversation ≈ $5.38 (rough arithmetic below)
- 10-minute conversation ≈ $10
- Heavy testing can exceed $200/day
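A back-of-envelope check of how the list prices relate to these real-world figures; the multi-turn markup noted in the comments is the commonly cited explanation, not an official pricing rule.

```python
# Back-of-envelope cost math using the list prices above.
AUDIO_IN_PER_MIN = 0.06    # $/min of audio input  ($100 per 1M audio tokens)
AUDIO_OUT_PER_MIN = 0.24   # $/min of audio output ($200 per 1M audio tokens)

minutes = 5
naive_cost = minutes * (AUDIO_IN_PER_MIN + AUDIO_OUT_PER_MIN)
print(f"Naive {minutes}-minute estimate: ${naive_cost:.2f}")  # -> $1.50

# Observed costs (e.g. ~$5.38 for the same 5 minutes) run several times higher:
# on each model turn the accumulated audio history is commonly re-billed as
# input tokens, so cost grows faster than linearly with conversation length.
```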
### Development Recommendations
- Implement strict usage monitoring
- Set budget alerts in OpenAI dashboard
- Use test mode for initial development
- Consider cost/performance tradeoffs carefully
"Pricing would need to decrease 10x for viable commercial implementation" - Developer Community Feedback
### References
OpenAI Community Discussion · Cost Analysis Article · ZDNet Coverage