Word Pronunciation Assistant

A fast, local pronunciation and speech synthesis assistant powered by piper-phonemize, built for language learners and anyone who wants to improve their spoken fluency. This GUI application shows the IPA phonemes for any text you enter and plays it back in real time using neural voice models, with voices available for more than 30 languages.

Features

Text-to-IPA phonemization using espeak-ng (via piper-phonemize)
Real-time audio synthesis with neural voice models (ONNX format)
Adjustable length scale (speaking speed) and volume
Multi-language support with auto-discovered voices
Simple and responsive GUI built with wxWidgets
Runs fully offline on Windows and Linux

Quick Demo

Download a voice model (see Voices below)
Place both .onnx and .onnx.json files into the models/ directory
Run the app, choose a model, type a phrase, and click Pronounce

Example

Welcome to the world of speech synthesis!
→ wˈɛlkʌm tə ðə wˈɜːld ʌv spˈiːtʃ sˈɪnθəsˌɪs

Then listen to the synthesized audio with realistic pronunciation.

Voices

This application uses the same voices as the Piper TTS project, trained with VITS and exported for ONNX Runtime.

Supported Languages

Arabic (ar_JO)
Catalan (ca_ES)
Czech (cs_CZ)
Welsh (cy_GB)
Danish (da_DK)
German (de_DE)
Greek (el_GR)
English (en_GB, en_US)
Spanish (es_ES, es_MX)
Finnish (fi_FI)
French (fr_FR)
Hungarian (hu_HU)
Icelandic (is_IS)
Italian (it_IT)
Georgian (ka_GE)
Kazakh (kk_KZ)
Luxembourgish (lb_LU)
Nepali (ne_NP)
Dutch (nl_BE, nl_NL)
Norwegian (no_NO)
Polish (pl_PL)
Portuguese (pt_BR, pt_PT)
Romanian (ro_RO)
Russian (ru_RU)
Serbian (sr_RS)
Swedish (sv_SE)
Swahili (sw_CD)
Turkish (tr_TR)
Ukrainian (uk_UA)
Vietnamese (vi_VN)
Chinese (zh_CN)
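
The locale codes in parentheses typically appear at the start of each voice file name. The sketch below assumes the naming pattern used by published Piper voices, `<language>_<REGION>-<speaker>-<quality>` (for example `en_US-lessac-medium.onnx`), and shows one way to recover the locale from a file name. It is written in Python for illustration only; `VoiceName` and `parse_voice_name` are hypothetical helpers, not part of this application, which may determine the language differently.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: <language>_<REGION>-<speaker>-<quality>,
# e.g. "en_US-lessac-medium". Speaker names may contain underscores.
_VOICE_RE = re.compile(
    r"^(?P<language>[a-z]{2,3})_(?P<region>[A-Z]{2})"
    r"-(?P<speaker>.+)-(?P<quality>x_low|low|medium|high)$"
)


class VoiceName(NamedTuple):
    """Parts of a voice file name (hypothetical helper type)."""
    language: str
    region: str
    speaker: str
    quality: str

    @property
    def locale(self) -> str:
        # Rebuild the locale code as shown in the list above, e.g. "en_US".
        return f"{self.language}_{self.region}"


def parse_voice_name(file_name: str) -> Optional[VoiceName]:
    """Return the parsed parts, or None if the name does not fit the pattern."""
    stem = file_name
    # Strip the model or config extension before matching.
    for suffix in (".onnx.json", ".onnx"):
        if stem.endswith(suffix):
            stem = stem[: -len(suffix)]
            break
    match = _VOICE_RE.match(stem)
    if match is None:
        return None
    return VoiceName(**match.groupdict())
```

For instance, `parse_voice_name("en_US-lessac-medium.onnx").locale` gives `"en_US"`, while a file that does not follow the pattern yields `None` and can simply be skipped.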

📁 Each voice requires:

A .onnx file (neural model)
A corresponding .onnx.json configuration file
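
Because a voice is only usable when both files are present, a voice scan needs to pair them up. The following is a minimal Python sketch of that check, not the application's actual code: `complete_voices` and `discover_voices` are hypothetical names, and the default `models` directory is taken from the instructions in this README.

```python
from pathlib import Path
from typing import Iterable, List


def complete_voices(file_names: Iterable[str]) -> List[str]:
    """Return the .onnx files that also have a matching .onnx.json config."""
    names = set(file_names)
    return sorted(
        name
        for name in names
        if name.endswith(".onnx") and f"{name}.json" in names
    )


def discover_voices(models_dir: str = "models") -> List[str]:
    """Scan a directory and list voices whose model and config both exist."""
    directory = Path(models_dir)
    if not directory.is_dir():
        # No models folder yet: report no voices instead of raising.
        return []
    return complete_voices(p.name for p in directory.iterdir() if p.is_file())
```

A model without its configuration file (or the reverse) is left out of the result, which is one reasonable way to avoid offering a voice that cannot be loaded.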

Download voices from the Piper Voices repository, or see VOICES.md for the full list.

⚠️ Note: Some voices may have restrictive licenses. Always check the MODEL_CARD file before use.

Installation

Binary Releases

Download a prebuilt release from the Releases page, or build from source as described below.

Build From Source (Linux/Windows)

Clone this repo
Run cmake to configure the project, then build with make or open the generated project in your IDE of choice

Place your voice models in a models/ folder next to the binary.

Usage

After launching the app:

Select a voice model from the dropdown
Enter any word or phrase in the text box
Click Look up to see the IPA symbols
Click Pronounce to hear the voice synthesis
Adjust speed and volume using the sliders

All synthesis and audio playback happen locally and offline.
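
To make the two sliders concrete: in VITS-style models the length scale multiplies the predicted phoneme durations, so values above 1.0 slow speech down and values below 1.0 speed it up, while volume can be applied as a simple gain on the output samples. The Python sketch below illustrates both ideas under the assumption that the audio is 16-bit PCM; `apply_volume` and `estimated_duration` are hypothetical helpers, not the application's own functions.

```python
from typing import List

INT16_MIN, INT16_MAX = -32768, 32767


def apply_volume(samples: List[int], volume: float) -> List[int]:
    """Scale 16-bit PCM samples by a gain factor, clipping to the int16 range."""
    scaled = []
    for sample in samples:
        value = int(round(sample * volume))
        # Clip instead of wrapping around, which would cause loud distortion.
        scaled.append(max(INT16_MIN, min(INT16_MAX, value)))
    return scaled


def estimated_duration(base_seconds: float, length_scale: float) -> float:
    """Rough duration estimate: speech length grows linearly with length scale."""
    return base_seconds * length_scale
```

With these definitions, a volume of 0.5 halves every sample, a volume of 2.0 doubles them up to the clipping limit, and a two-second phrase at a length scale of 1.5 lasts roughly three seconds.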

Technical Stack

GUI: wxWidgets
TTS Engine: Piper (ONNX Runtime, VITS)
Phonemizer: espeak-ng via piper-phonemize
Audio Output: PortAudio
Language Support: Multilingual, via ONNX+JSON model pairs
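
The JSON half of each model pair carries the settings the stack above needs at runtime. As a hedged illustration, the Python sketch below reads three fields that published Piper voice configs commonly contain (`audio.sample_rate`, `espeak.voice`, and `inference.length_scale`); treat these key names as an assumption and check your own .onnx.json file. `summarize_voice_config` and `EXAMPLE_CONFIG` are hypothetical and the sample values are made up for the example.

```python
import json
from typing import Any, Dict

# Made-up config fragment using the assumed Piper key layout.
EXAMPLE_CONFIG = (
    '{"audio": {"sample_rate": 22050}, '
    '"espeak": {"voice": "en-us"}, '
    '"inference": {"length_scale": 1.0}}'
)


def summarize_voice_config(config_text: str) -> Dict[str, Any]:
    """Pull out the fields relevant to phonemization and playback."""
    config = json.loads(config_text)
    return {
        # Sample rate to open the audio output stream with.
        "sample_rate": config.get("audio", {}).get("sample_rate"),
        # Voice identifier handed to the phonemizer.
        "espeak_voice": config.get("espeak", {}).get("voice"),
        # Default speaking speed; fall back to 1.0 if the key is absent.
        "length_scale": config.get("inference", {}).get("length_scale", 1.0),
    }
```

Missing sections produce `None` (or the 1.0 default for length scale) rather than an exception, so a caller can decide how to handle an incomplete config.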

License

This app is open source and intended for educational, research, and personal language learning use. See LICENSE.md for details.

nuloperrito/PronunciationHelper
