While working on half-width support, I noticed that IsKatakana (as well as IsHiragana) is based on Go's Unicode tables, and the range being used is probably too wide for what a Japanese speaker would consider katakana.
For example, this was slightly unexpected:
- IsKatakana("ㇰ") = true // attention ㇰ != ク and ㇰ != ク
- IsKatakana("ウカ") = true
This one even more so:
- IsKatakana("㋾") = true
And I would say this is wrong:
- IsKatakana("㍓") = true
IsHiragana has fewer quirks, but still shows some funny, unexpected behavior. For example:
- IsHiragana("🈀") = true
- IsHiragana("𛁟") = true // U+1B05F
For IsKanji, I also suspect the range is wider than what would make sense to a human reader, but given how hard it is to draw any kind of boundary there, I skipped checking.