This project is a voice assistant that listens for a specific keyword in audio input, records the command spoken after the keyword, transcribes the audio to text, and generates a response from that text using OpenAI's models.
- Keyword Detection: Continuously monitors for a predefined keyword within the audio input.
- Command Recording: Starts recording a command once the keyword is recognized and stops after more than 3 seconds of silence.
- Audio Transcription: Transcribes the main audio and the command input to text.
- Response Generation: Uses the transcribed text to generate contextually relevant responses through OpenAI's models.
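The keyword-detection step can be sketched as follows. Vosk's `KaldiRecognizer.Result()` returns a JSON string with a `text` field. The helper below, `contains_keyword`, is a hypothetical name and not part of this project. It checks whether the keyword appears in that text as a whole-word phrase. How `SpotKeyword` actually matches is not shown in this README, so treat this as one possible approach rather than the project's implementation.

```python
import json


def contains_keyword(result_json: str, keyword: str) -> bool:
    """Return True if `keyword` occurs as a whole-word phrase in a Vosk-style result.

    `result_json` is assumed to look like '{"text": "some recognized words"}'.
    """
    try:
        text = json.loads(result_json).get("text", "")
        words = str(text).lower().split()
    except (json.JSONDecodeError, AttributeError):
        # Not valid JSON, or not a JSON object: treat as "no keyword".
        return False

    target = keyword.lower().split()
    if not target:
        return False

    # Slide a window of len(target) words over the recognized words.
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))
```

Matching on whole words avoids false positives such as "assistant" triggering the keyword "assist me".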
- SpotKeyword: Uses the Vosk library to detect keywords in audio streams.
- OpenAITranscriber: Converts audio recordings to text using OpenAI's transcription capabilities.
- OpenAIAssistant: Produces responses based on the transcribed text and commands.
- AudioRecorder: Handles audio recording operations, including start, stop, and save.
- CommandRecorder: A specialized recorder for command input that uses silence detection to stop recording automatically.
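The silence-based stop used for command recording can be illustrated with a minimal sketch. It assumes 16-bit little-endian mono PCM chunks and an RMS amplitude threshold of 500. Both are assumptions, as are the names `rms` and `SilenceTimer`; the project's `CommandRecorder` may detect silence differently. Only the 3-second limit comes from the description above.

```python
import math
import struct


def rms(chunk: bytes) -> float:
    """Root-mean-square amplitude of a chunk of 16-bit little-endian mono PCM."""
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", chunk[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


class SilenceTimer:
    """Tracks how long the input has been continuously quiet."""

    def __init__(self, threshold: float = 500.0, max_silence: float = 3.0):
        self.threshold = threshold      # assumed RMS level below which a chunk is "silent"
        self.max_silence = max_silence  # seconds of silence before recording should stop
        self.silent_for = 0.0

    def update(self, chunk: bytes, chunk_seconds: float) -> bool:
        """Feed one chunk; return True once silence has lasted over max_silence."""
        if rms(chunk) < self.threshold:
            self.silent_for += chunk_seconds
        else:
            # Any loud chunk resets the silence counter.
            self.silent_for = 0.0
        return self.silent_for > self.max_silence
```

A recording loop would call `update` for each chunk read from the microphone and stop as soon as it returns `True`. The threshold usually needs tuning for the microphone and room.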
Ensure Python 3.6 or newer is installed on your system. Then, install the required Python packages using pip:
pip install vosk pyaudio openai
Download the appropriate Vosk model for your language from the Vosk Models page. Extract the model to a known location on your filesystem.
Obtain an API key from OpenAI by creating an account and following their API access guidelines.
To use the assistant, configure the initialization parameters in your script:
from Assistant import Assistant

if __name__ == "__main__":
    assistant = Assistant(
        vosk_model_path="<path_to_vosk_model>",
        keyword="assist me",
        openai_key="<your_openai_api_key>"
    )
    assistant.start()
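To keep the API key out of the script, one option is to read it from an environment variable. The sketch below assumes the variable is named `OPENAI_API_KEY`, the name conventionally used by OpenAI's Python library. `load_openai_key` is a hypothetical helper, not part of this project.

```python
import os


def load_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing or empty."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your OpenAI API key"
        )
    return key
```

The result can then be passed as `openai_key=load_openai_key()` when constructing `Assistant`.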