A flexible Text-to-Speech agent built with PocketFlow
This project implements a Text-to-Speech (TTS) agent using the PocketFlow framework. It converts text to speech with a choice of voices, and can save the generated audio to a file or play it back directly.
- Convert text to speech with multiple voice options
- Save generated audio to file
- Play audio directly from the application
- Extensible node-based architecture
# Clone the repository
git clone https://github.com/aixiasang/ai-agent-tts.git
cd ai-agent-tts
# Install dependencies
pip install -r requirements.txt
# Run the application
python main.py
Follow the interactive prompts to:
- Enter the text you want to convert to speech
- Select a voice option
- Generate and save the audio
- Play the generated audio
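The non-interactive part of that session (voice selection, synthesis, saving) can be sketched as a small pipeline. This is only an illustration, not the project's code: the `VOICES` table, the function names, and the 16 kHz sample rate are assumptions, and `synthesize` is a stand-in that emits silence rather than speech so the sketch runs with the standard library alone. Playback is left out because it depends on the platform and on the audio backend the project uses.

```python
import wave
from pathlib import Path

SAMPLE_RATE = 16_000  # assumed sample rate for this sketch

# Hypothetical voice table; the real options live in the project's TTS engine.
VOICES = {"1": "voice_a", "2": "voice_b"}


def select_voice(choice: str, default: str = "voice_a") -> str:
    """Map a menu choice to a voice name, falling back to the default."""
    return VOICES.get(choice.strip(), default)


def synthesize(text: str, voice: str) -> bytes:
    """Stand-in engine: returns 16-bit mono silence, 50 ms per character.

    A real engine would return speech audio for `text` in `voice`.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    n_samples = len(text) * SAMPLE_RATE // 20
    return b"\x00\x00" * n_samples


def save_wav(path: Path, pcm: bytes) -> Path:
    """Write raw 16-bit mono PCM to a WAV file and return its path."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return path


def text_to_file(text: str, choice: str, path: Path) -> Path:
    """Run voice selection, synthesis, and saving in order."""
    voice = select_voice(choice)
    return save_wav(path, synthesize(text, voice))
```

Calling `text_to_file("hello", "2", Path("out.wav"))` writes a short WAV file; swapping `synthesize` for a real engine call would leave the rest of the pipeline unchanged.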
You can extend this framework by:
- Adding new voice options in utils/tts_engine.py
- Creating new nodes in nodes.py
- Modifying the flow in flow.py
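As a rough idea of what a new node might look like: PocketFlow nodes follow a prep/exec/post lifecycle around a shared dictionary. The sketch below mimics only that lifecycle with a small stand-in `Node` class so it runs without the library installed. `NormalizeTextNode`, the `"text"` key, and the `"empty"` action are hypothetical names chosen for illustration; in the project you would presumably subclass the framework's own `Node` in nodes.py and connect it in flow.py.

```python
class Node:
    """Stand-in for a framework node: prep -> exec -> post over a shared dict."""

    def prep(self, shared):
        return None

    def exec(self, prep_res):
        return None

    def post(self, shared, prep_res, exec_res):
        return "default"

    def run(self, shared):
        # Read inputs, do the work, then write results and pick the next action.
        prep_res = self.prep(shared)
        exec_res = self.exec(prep_res)
        return self.post(shared, prep_res, exec_res)


class NormalizeTextNode(Node):
    """Hypothetical node: collapse runs of whitespace before synthesis."""

    def prep(self, shared):
        # The "text" key is an assumption about the shared store's layout.
        return shared.get("text", "")

    def exec(self, text):
        return " ".join(text.split())

    def post(self, shared, prep_res, exec_res):
        shared["text"] = exec_res
        # Signal a different action when there is nothing left to speak.
        return "default" if exec_res else "empty"


shared = {"text": "  Hello,\n   world!  "}
action = NormalizeTextNode().run(shared)
```

Keeping the work in `exec` free of any access to the shared dictionary makes a node like this easy to test on its own, and the action returned from `post` gives the flow a place to branch, for example back to the text prompt when the input is empty.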
MIT