IsKatakana / IsHiragana being too permissive? · Issue #35 · gojp/kana · GitHub
IsKatakana / IsHiragana being too permissive? #35
Open
@richardgarnier

Description

While working on the half-width support, I noticed that IsKatakana (as well as IsHiragana) is based on Go's Unicode tables, and the range being used is probably too wide for what a Japanese speaker would consider to be katakana.

For example, this was slightly unexpected:

  • IsKatakana("ㇰ") = true // attention ㇰ != ク and ㇰ != ク
  • IsKatakana("ウカ") = true

This even more:

  • IsKatakana("㋾") = true

And I would say this is wrong:

  • IsKatakana("㍓") = true
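
The results above can be reproduced without the kana package, assuming the check boils down to membership in Go's `unicode.Katakana` script table, as the description suggests. The helper `inKatakanaScript` below is hypothetical and only illustrates that assumption; it is not the library's actual code.

```go
package main

import (
	"fmt"
	"unicode"
)

// inKatakanaScript reports whether every rune in s belongs to Go's
// unicode.Katakana script table. Hypothetical helper for illustration;
// the real IsKatakana in gojp/kana may be implemented differently.
func inKatakanaScript(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if !unicode.Is(unicode.Katakana, r) {
			return false
		}
	}
	return true
}

func main() {
	// The script table covers small phonetic extensions, circled letters
	// and square "word" ligatures, not just the everyday katakana letters.
	for _, s := range []string{"カ", "ㇰ", "㋾", "㍓"} {
		r := []rune(s)[0]
		fmt.Printf("%s U+%04X in Katakana script: %v\n", s, r, inKatakanaScript(s))
	}
}
```

The script table is defined by the Unicode Script property, which is broader than the basic Katakana block, which would explain why these characters pass.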

IsHiragana has fewer quirks, but still shows some funny, unexpected behavior. For example:

  • IsHiragana("🈀") = true
  • IsHiragana("𛁟") = true // U+1B05F
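
One possible way to narrow the checks is to test against hand-written range tables instead of the full script tables. The following is a minimal sketch, not a proposed patch: the names `strictKatakana`, `strictHiragana` and `isStrict` are hypothetical, and the choice of ranges (basic letters only, plus the long vowel mark for katakana) is an assumption about what a reader would expect.

```go
package main

import (
	"fmt"
	"unicode"
)

// strictKatakana covers the basic letters U+30A1..U+30FA (ァ to ヺ) and the
// long vowel mark U+30FC (ー). Which code points belong here is an assumption.
var strictKatakana = &unicode.RangeTable{
	R16: []unicode.Range16{
		{Lo: 0x30A1, Hi: 0x30FA, Stride: 1},
		{Lo: 0x30FC, Hi: 0x30FC, Stride: 1},
	},
}

// strictHiragana covers the basic letters U+3041..U+3096 (ぁ to ゖ).
var strictHiragana = &unicode.RangeTable{
	R16: []unicode.Range16{
		{Lo: 0x3041, Hi: 0x3096, Stride: 1},
	},
}

// isStrict reports whether s is non-empty and every rune is in table.
func isStrict(table *unicode.RangeTable, s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if !unicode.Is(table, r) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isStrict(strictKatakana, "コーヒー")) // everyday katakana word
	fmt.Println(isStrict(strictKatakana, "㍓"))    // square ligature
	fmt.Println(isStrict(strictHiragana, "ひらがな")) // everyday hiragana word
	fmt.Println(isStrict(strictHiragana, "🈀"))    // squared symbol
}
```

Whether half-width forms, iteration marks, or small phonetic extensions should also be accepted is a design decision left open here; each would be one more entry in the tables.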

For IsKanji, I also suspect the range is wider than what would make sense to a human reader, but given how difficult it is to draw any kind of boundary there, I skipped checking it.
