Build voice-enabled AI assistants using Azure OpenAI's Realtime API. Create a multi-agent system for customer service applications.
Module | Focus | Documentation |
---|---|---|
1. WebSocket Basics | Real-time Communication Fundamentals | Guide |
2. Function Calling | Azure OpenAI Realtime API Integration | Guide |
3. Multi-Agent System | Customer Service Implementation | Guide |
4. Voice RAG | Voice-Optimized Document Retrieval | Guide |
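Module 1's WebSocket fundamentals start with opening an authenticated connection to the realtime endpoint. As a minimal sketch of how that URL is assembled (the endpoint path, query parameters, and `api-version` value below are assumptions based on the public preview documentation, so verify them against your resource):

```python
# Sketch: building the WebSocket URL for an Azure OpenAI realtime deployment.
# The path and query parameters are assumptions from the public preview docs;
# check the current documentation for the exact values.
from urllib.parse import urlencode

def realtime_ws_url(resource: str, deployment: str,
                    api_version: str = "2024-10-01-preview") -> str:
    """Return the wss:// URL for a realtime deployment on an Azure resource."""
    query = urlencode({"api-version": api_version, "deployment": deployment})
    return f"wss://{resource}.openai.azure.com/openai/realtime?{query}"
```

A client would pass this URL, plus an `api-key` header, to any WebSocket library; the workshop's module 1 guide covers the actual handshake.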
- Execute `azd up` from the root folder. This command sets up your Python environment, provisions an Azure AI Foundry hub, project, and GPT-4o realtime-audio instance, and initializes the `.env` file.
- Update `.env` with the API key of the GPT-4o realtime-audio model; you can find the key in the Azure portal.
- Move to the respective module folders to run and work through the workshop modules.
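The modules read their configuration from the `.env` file initialized above. A minimal sketch of a loader (the variable name `AZURE_OPENAI_API_KEY` is an assumption; match it to the keys `azd` actually writes):

```python
# Sketch: minimal .env parser for the workshop modules.
# The key name AZURE_OPENAI_API_KEY used in the usage note is an assumption;
# match it to whatever azd writes into your .env file.
def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping comments and blank lines."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"')
    return values
```

Usage: `load_env()["AZURE_OPENAI_API_KEY"]` from a module folder; in practice you may prefer the `python-dotenv` package, which handles quoting and interpolation edge cases.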
The following SDKs and libraries can be used to integrate with the gpt-4o-realtime-api (preview) on Azure.
SDK/Library | Description |
---|---|
openai-python | The official Python library for the (Azure) OpenAI API |
openai-dotnet | The official .NET library for the (Azure) OpenAI API |
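Whichever SDK you use, the realtime protocol itself is event-driven JSON: the client sends events such as `session.update` to configure modalities and register tools for function calling. A hedged sketch of that payload (field names follow the publicly documented realtime event schema, and `get_weather` is a hypothetical tool for illustration; verify the schema against your API version):

```python
# Sketch: a session.update event enabling text+audio and registering a tool
# for function calling over the realtime connection. Field names follow the
# publicly documented realtime event schema; verify against your API version.
import json

def make_session_update(tools: list) -> str:
    """Serialize a session.update event with the given tool definitions."""
    event = {
        "type": "session.update",
        "session": {
            "modalities": ["text", "audio"],
            "tools": tools,
            "tool_choice": "auto",
        },
    }
    return json.dumps(event)

# Hypothetical tool definition, for illustration only.
weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
```

The serialized string is what gets sent over the WebSocket; the SDKs wrap this send/receive loop so you usually work with typed events instead of raw JSON.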
Accelerator | Description |
---|---|
VoiceRAG (aisearch-openai-rag-audio) | A simple example implementation of the VoiceRAG pattern to power interactive voice generative AI experiences using RAG with Azure AI Search and Azure OpenAI's gpt-4o-realtime-preview model. |
On The Road CoPilot | A minimal speech-to-structured output app built with Azure OpenAI Realtime API. |
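At the heart of the VoiceRAG pattern is a search tool the realtime model can call, whose results ground the spoken answer. A minimal sketch of that tool-call step (the in-memory corpus stands in for Azure AI Search; this is an assumption for illustration, not the accelerator's actual implementation):

```python
# Sketch of the VoiceRAG grounding step: a "search" tool invoked via
# function calling, whose snippets are fed back to the model as grounding.
# The in-memory CORPUS stands in for Azure AI Search (assumption).
CORPUS = {
    "returns": "Items may be returned within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3-5 business days.",
}

def search_tool(query: str) -> list[str]:
    """Return corpus snippets whose topic key appears in the query."""
    q = query.lower()
    return [text for key, text in CORPUS.items() if key in q]
```

In the accelerator, the model's function-call arguments are routed to a real Azure AI Search query, and the returned snippets are injected into the conversation before the model speaks its answer.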
Contributions are welcome via pull requests.