GitHub - faith-luo/follyanna: automatic pinyin and furigana ruby tags in js
follyanna

automatic pinyin and furigana ruby tags in js

Overview

follyanna uses intelligent furigana placement to strip redundant kana from furigana readings. It also supports pinyin rendering.

For instance:

  • 頑張る[がんばる] -> 頑張(がんば)る
  • 加油[jia1 you2] -> jiāyóu

This is helpful because a lot of dictionary data, such as JMdict, gives the reading for a whole entry rather than per character. Displayed as-is, these readings are often a bit clunky:

  • 頑張る(がんばる)
  • 阿吽の呼吸(あうんのこきゅう)

This is similar to react-furi, but I wanted a version that could work in an Anki deck or in other plain JS environments without including React.

Usage

  • Include furigana.bundle.js in your webpage.

Demo

頑張る[がんばる]
頑張(がんば)る

阿吽の呼吸[あうんのこきゅう]
阿吽(あうん)の呼吸(こきゅう)

冴え冴え[さえざえ]

権兵衛が種蒔きゃ烏がほじくる[ごんべえがたねまきゃからすがほじくる]
権兵衛ごんべえ 種蒔たねまきゃ からすがほじくる

蒔かぬ種は生えぬ[まかぬたねははえぬ]
かぬ たね えぬ

秋の野芥子[あきののげし]
あき 野芥子のげし

巻き脚絆[まききゃはん]
脚絆きゃはん

How it works

We use a greedy algorithm to match each kana sequence in the word to a set of candidate positions in the reading. If the candidate positions are at the start or tail of the string (as in 頑張る), we know for certain that we can match them. If, after matching start and tail sequences, there is only one candidate in the middle of the string, we can definitely match it as well.
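The matching described above can be sketched roughly as follows. This is a minimal illustration, not follyanna's actual code: the names `tokenize`, `candidates`, `alignMiddle`, `alignFurigana`, and `toRubyHtml` are hypothetical, and the sketch assumes kana can be detected by the Hiragana and Katakana Unicode blocks alone.

```javascript
// Hypothetical helper names; this is not follyanna's actual API.
const KANA = /[\u3040-\u309f\u30a0-\u30ff]/;

// Split a word into alternating runs of kana and non-kana characters.
function tokenize(word) {
  const runs = [];
  for (const ch of word) {
    const isKana = KANA.test(ch);
    const last = runs[runs.length - 1];
    if (last && last.isKana === isKana) last.text += ch;
    else runs.push({ text: ch, isKana });
  }
  return runs;
}

// Indices where `needle` occurs in `haystack` with at least one character
// left on each side for the neighbouring non-kana runs.
function candidates(haystack, needle) {
  const found = [];
  for (let i = 1; i + needle.length < haystack.length; i++) {
    if (haystack.startsWith(needle, i)) found.push(i);
  }
  return found;
}

// Align runs that start and end with a non-kana run against `reading`.
function alignMiddle(runs, reading) {
  for (let k = 1; k < runs.length - 1; k++) {
    if (!runs[k].isKana) continue;
    const hits = candidates(reading, runs[k].text);
    if (hits.length !== 1) continue; // zero or several candidates: skip
    const at = hits[0];
    return [
      ...alignMiddle(runs.slice(0, k), reading.slice(0, at)),
      { text: runs[k].text, reading: null },
      ...alignMiddle(runs.slice(k + 1), reading.slice(at + runs[k].text.length)),
    ];
  }
  // Nothing left to split, or ambiguous: whole reading over the whole span.
  return [{ text: runs.map((r) => r.text).join(""), reading }];
}

function alignFurigana(word, reading) {
  if (word === reading) return [{ text: word, reading: null }];
  let runs = tokenize(word);
  const head = [];
  const tail = [];
  // A leading kana run must open the reading.
  if (runs.length > 1 && runs[0].isKana && reading.startsWith(runs[0].text)) {
    head.push({ text: runs[0].text, reading: null });
    reading = reading.slice(runs[0].text.length);
    runs = runs.slice(1);
  }
  // A trailing kana run must close the reading.
  const last = runs[runs.length - 1];
  if (runs.length > 1 && last.isKana && reading.endsWith(last.text)) {
    tail.push({ text: last.text, reading: null });
    reading = reading.slice(0, reading.length - last.text.length);
    runs = runs.slice(0, -1);
  }
  // If an edge kana run could not be matched, give up on splitting.
  if (runs[0].isKana || runs[runs.length - 1].isKana) {
    return [...head, { text: runs.map((r) => r.text).join(""), reading }, ...tail];
  }
  return [...head, ...alignMiddle(runs, reading), ...tail];
}

// Render segments as HTML ruby markup.
function toRubyHtml(segments) {
  return segments
    .map((s) => (s.reading === null ? s.text : `<ruby>${s.text}<rt>${s.reading}</rt></ruby>`))
    .join("");
}
```

With these assumptions, `alignFurigana("頑張る", "がんばる")` strips the tail る and leaves がんば over 頑張, while a word such as 秋の野芥子 (where の has two candidate positions in あきののげし) falls back to a single segment carrying the whole reading.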

The only ambiguous cases are when there are at least two candidates in the middle of the string. Right now we are not able to solve these, and the tokenizer falls back to using the whole string in this case. This is solvable using Levenshtein distance or public datasets and may be added in a future revision.
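For reference, Levenshtein distance itself is a standard edit-distance computation. The sketch below is a generic implementation (the function name `levenshtein` is illustrative), and how it would be wired into disambiguation, for example by scoring each candidate split against known per-kanji readings, is an assumption rather than something this project specifies.

```javascript
// Generic Levenshtein (edit) distance between two strings, using two rows
// of the dynamic-programming table. Not part of follyanna.
function levenshtein(a, b) {
  const s = [...a]; // spread so we compare code points, not UTF-16 units
  const t = [...b];
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution or match
      );
    }
    prev = curr;
  }
  return prev[t.length];
}
```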

It's not particularly hard to do pinyin rendering; it's just bundled into the same library as a utility for multi-language environments.
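As a rough illustration of why this is straightforward, numbered pinyin can be converted to tone marks with the usual placement rules: the mark goes on a or e if present, else on the o of ou, else on the last vowel. The function names `markSyllable` and `numberedToMarked` are hypothetical and this is a sketch, not the library's implementation; joining syllables without spaces is an assumption based on the 加油 example in the overview.

```javascript
// Tone-marked forms for each vowel, indexed by tone 1 to 4.
const TONE_MARKS = {
  a: "āáǎà",
  e: "ēéěè",
  i: "īíǐì",
  o: "ōóǒò",
  u: "ūúǔù",
  ü: "ǖǘǚǜ",
};

// Convert one numbered syllable such as "jia1" to "jiā".
function markSyllable(syllable) {
  const match = /^([a-zü:]+)([0-5])?$/.exec(syllable.toLowerCase());
  if (!match) return syllable; // not numbered pinyin; leave untouched
  const letters = match[1].replace(/u:|v/g, "ü"); // common ASCII spellings of ü
  const tone = Number(match[2] || 5);
  if (tone < 1 || tone > 4) return letters; // neutral tone carries no mark

  // Placement: a or e first, then the o in "ou", otherwise the last vowel.
  let idx = letters.search(/[ae]/);
  if (idx === -1) idx = letters.indexOf("ou");
  if (idx === -1) {
    for (let i = letters.length - 1; i >= 0; i--) {
      if ("iouü".includes(letters[i])) {
        idx = i;
        break;
      }
    }
  }
  if (idx === -1) return letters; // no vowel found; nothing to mark
  return letters.slice(0, idx) + TONE_MARKS[letters[idx]][tone - 1] + letters.slice(idx + 1);
}

// Convert a space-separated reading such as "jia1 you2".
function numberedToMarked(pinyin) {
  return pinyin.trim().split(/\s+/).map(markSyllable).join("");
}
```

Under these rules `numberedToMarked("jia1 you2")` yields the jiāyóu shown in the overview.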

TODO

  • Set up CDN
  • Deal with ambiguities when there are two or more mid-string candidates
