English | Tiếng Việt [IJCAI 2025 Accepted Paper Preprint]
This project aims to analyze the Vietnamese language to develop a faster typing method by implementing word prediction based on partial input. For instance, inputting only `x0ch2` should yield `xin chào` as the predicted output.
Completeness: `v7` is essentially a better VNI. Everything VNI can do, `v7` can do too, so you can type any possible Vietnamese word with `v7`.
Use the script below to try the `v7` method!
- Vietnamese uses many diacritics, and entering these marks makes typing time-consuming.
- `v7` aims to simplify Vietnamese typing by using only the initial consonant and tone to predict the intended words. For example, instead of typing `tưởng tượng` as `tuong73 tuong75` (`VNI`) or `tuongwr tuongwj` (`Telex`), you can type `t3t5` with `v7`!
- Naturally, this reduction in keystrokes leads to some information loss. For instance, the input `t3t5` could also correspond to `tiểu tiện`, as `3` represents the hook tone `hỏi` and `5` represents the underdot tone `nặng`.
- This project analyzes and addresses these problems to ultimately introduce `v7`, enhancing the Vietnamese typing experience.
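To make the trade-off concrete, the short sketch below counts the keys in the example above and records the collision on `t3t5`. The dictionary `candidates_by_code` is a hand-written illustration for this README, not a data structure taken from the project.

```python
# Keystrokes needed for "tưởng tượng" in each input method
# (the key sequences are the ones quoted in the text above).
keystrokes = {
    "VNI": "tuong73 tuong75",
    "Telex": "tuongwr tuongwj",
    "v7": "t3t5",
}

for method, keys in keystrokes.items():
    print(f"{method:<5} {keys!r}: {len(keys)} keys")

# One v7 code can match several phrases. This is the information loss
# that the prediction step has to resolve. Illustrative table only.
candidates_by_code = {
    "t3t5": ["tưởng tượng", "tiểu tiện"],
}
print(candidates_by_code["t3t5"])
```

The VNI and Telex sequences take 15 keys each (including the space), while the `v7` code takes 4.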
`v7` inherits from both of its predecessors, VNI and Telex.
- Special consonants:
  - `g` for both `g` and `gh`.
  - `ng` for both `ng` and `ngh`.
  - `z` for `gi`. (`z6` → `giúp`, `giết`, `giáp`, ...)
  - `dd` for `đ`. (`dd4` → `đã`, `đãi`, `đỗ`, ...) (`Telex style`)
- Tones (`VNI style`):
  - `0` for no tone: `tuân`, `câm`, `tân` ...
  - `1` for normal acute: `cấm`, `tiếng`, `tấn`, `thính` ... (compare with `6` to see the differences)
  - `2` for grave: `tuần`, `cầm`, `tần` ...
  - `3` for hook: `tẩn`, `cẩm`, `hỉ` ...
  - `4` for tilde: `mãi`, `rã`, `phũ` ...
  - `5` for normal underdot: `nhậm`, `phụng`, `độn`, `mạnh` ... (compare with `7` to see the differences)
  - `6` for `entering/checked` acute: `cấp`, `tiếc`, `tất`, `thích` ... (everything with an acute that ends with `p`, `t`, `c`, `ch` must be tone `6`)
  - `7` for `entering/checked` underdot: `nhập`, `phục`, `đột`, `mạch` ... (everything with an underdot that ends with `p`, `t`, `c`, `ch` must be tone `7`)
- Special vowels:
  - Lots of `ă`, `â`, `ê`, `ô`, `ơ`, `ư` when typing Vietnamese? Not a problem anymore: just type `a`, `e`, `o`, `u` and `v7` will predict the most suitable ones for you! This feature also helps reduce the number of keys you have to type!
This 8-tone system follows the Vietnamese Eight-Tone Analysis.
Note: If you aren't familiar with the 8-tone system, you can still configure `v7` to use the traditional VNI 6-tone system. However, the 8-tone system is highly recommended because it gives much better AI results!
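The consonant and tone rules above can be sketched as a small encoder that turns a fully accented syllable into its `v7` code. This is only an illustration of the rules as described in this README, not the project's implementation. The names `v7_code`, `encode_phrase`, and the `eight_tone` flag are hypothetical. The sketch also assumes that every onset other than the four special cases is typed as written (for example `qu` stays `qu`), and it leaves the onset empty for vowel-initial syllables, which the text does not cover.

```python
import unicodedata

# Combining marks (visible after NFD decomposition) mapped to the six base tones.
TONE_MARKS = {
    "\u0301": 1,  # acute
    "\u0300": 2,  # grave
    "\u0309": 3,  # hook above
    "\u0303": 4,  # tilde
    "\u0323": 5,  # dot below (underdot)
}

# Multi-letter onsets come first so that "ngh" is matched before "ng" or "n".
# Assumption: onsets without a special rule are typed as written.
ONSETS = [
    "ngh", "ng", "nh", "gh", "gi", "ch", "kh", "ph", "th", "tr", "qu",
    "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "q", "r", "s",
    "t", "v", "x",
]
SPECIAL_ONSETS = {"gh": "g", "ngh": "ng", "gi": "z", "đ": "dd"}
CHECKED_FINALS = ("p", "t", "c", "ch")


def v7_code(syllable: str, eight_tone: bool = True) -> str:
    """Encode one accented syllable as onset + tone digit (illustrative)."""
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    # The first tone mark found decides the tone; no mark means tone 0.
    tone = next((TONE_MARKS[ch] for ch in decomposed if ch in TONE_MARKS), 0)
    # Drop every combining mark to get bare letters ("đ" does not decompose).
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # In the 8-tone system, acute/underdot before p, t, c, ch become 6/7.
    if eight_tone and bare.endswith(CHECKED_FINALS):
        tone = {1: 6, 5: 7}.get(tone, tone)
    onset = next((o for o in ONSETS if bare.startswith(o)), "")
    return SPECIAL_ONSETS.get(onset, onset) + str(tone)


def encode_phrase(phrase: str, eight_tone: bool = True) -> str:
    """Concatenate the codes of all whitespace-separated syllables."""
    return "".join(v7_code(s, eight_tone) for s in phrase.split())


print(encode_phrase("xin chào"))      # x0ch2
print(encode_phrase("tưởng tượng"))   # t3t5
print(v7_code("giúp"), v7_code("đã"), v7_code("nhập"))  # z6 dd4 nh7
```

Passing `eight_tone=False` keeps the six VNI tones, so `nhập` is encoded as `nh5` instead of `nh7`.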
Operating Systems:
- ✅ macOS - Please switch to English keyboard
- ✅ Windows - Please switch to English keyboard
- ⛔ Linux – Not supported yet
Current Limitations:
- 🚫 CapsLock: Not currently supported. Please make sure CapsLock is off when typing.
- ⚠️ Accessibility: Some platforms (e.g., macOS) may require enabling accessibility permissions such as input monitoring for the tool to function correctly.
- Stable version in progress...
`v7` predicts the words/phrases users want to type by checking and ranking possible words/phrases.
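As a rough picture of "checking and ranking", the sketch below looks up every phrase that matches a typed code and orders the matches by a score. Everything in it is an assumption made for illustration: the tiny `LEXICON`, its usage counts (invented numbers, not measured data), and the function name `rank_candidates`. The project's actual ranking method is not described in this section.

```python
from collections import defaultdict

# Hypothetical lexicon: (phrase, v7 code, made-up usage count).
LEXICON = [
    ("tưởng tượng", "t3t5", 120),
    ("tiểu tiện", "t3t5", 15),
    ("xin chào", "x0ch2", 300),
]

# Index phrases by code so that a lookup returns all colliding candidates.
index = defaultdict(list)
for phrase, code, count in LEXICON:
    index[code].append((count, phrase))


def rank_candidates(code: str) -> list[str]:
    """Return the phrases matching `code`, highest count first."""
    matches = index.get(code, [])
    return [phrase for count, phrase in sorted(matches, reverse=True)]


print(rank_candidates("t3t5"))
print(rank_candidates("x0ch2"))
```

A frequency score is only the simplest possible ranking signal; any other scoring function could be swapped in without changing the lookup step.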
AI Mode uses `v7gpt`, a GPT-like model with a custom tokenizer built specifically for `v7`, trained on a Vietnamese corpus and based on Andrej Karpathy's nanoGPT.
- Advantages:
  - Works in any circumstances.
  - Understands the context in which the user is writing to predict the most suitable next word.
  - Can effectively predict entire sentences at a time.
This project uses Python 3.12.
To run the app in AI Mode, follow these steps:
- Install the required packages for AI Mode (Torch is required):
pip install -r requirements_ai.txt
- Download the pretrained model checkpoint:
gdown 1dDP0jIJ79syE6vt6QnVl05_4fYpuwrqd -O checkpoints/v7gpt-1.3.pth # Or download the file at https://drive.google.com/file/d/12ZBG5IBOKmgmv7mh32uFdDUqr-K0SzPS/view?usp=drive_link to checkpoints/v7gpt-1.3.pth
- Start the application:
python main.py
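Before launching, it can help to confirm that the prerequisites from the steps above are in place. The helper below is a hypothetical convenience script, not part of the repository. It only checks the Python version stated above and the checkpoint path used in the download step, and the function name `preflight` is made up for this example.

```python
import sys
from pathlib import Path

# Checkpoint location used in the download step above.
CHECKPOINT = Path("checkpoints/v7gpt-1.3.pth")


def preflight(root: Path = Path(".")) -> list[str]:
    """Return a list of problems found; an empty list means ready to run."""
    problems = []
    if sys.version_info[:2] != (3, 12):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.12 expected, found {found}")
    if not (root / CHECKPOINT).is_file():
        problems.append(f"missing checkpoint: {root / CHECKPOINT}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Run it from the project root. If it prints nothing, the Python version and checkpoint file match what the steps above expect.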