Compare these two commands:
echo viessogirji | hfst-tokenize --giella-cg --weight-classes=1 tokeniser-disamb-gt-desc.pmhfst
"<viessogirji>"
"viessogirji" ?
:\n
echo viessogirji | hfst-tokenize --giella-cg tokeniser-disamb-gt-desc.pmhfst
"<viessogirji>"
"girji" N Sem/Txt Sg Nom <W:10.0>
"viessu" N Sem/Build Cmp/SgNom Cmp <W:10.0>
:\n
The problem is not restricted to the --giella-cg mode:
echo viessogirji | hfst-tokenize -x --weight-classes=1 tokeniser-disamb-gt-desc.pmhfst
viessogirji viessogirji ??
viessogirji viessogirji ??
viessogirji viessogirji ??
viessogirji viessogirji ??
viessogirji viessogirji ??
viessogirji viessogirji ??
viessogirji viessogirji ??
viessogirji viessogirji ??
echo viessogirji | hfst-tokenize -c --weight-classes=1 tokeniser-disamb-gt-desc.pmhfst
"<viessogirji>"
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
The output in CG mode is a bit strange, though: why would all the unknown analyses be printed if there is a known analysis (with a non-zero weight)?
echo viessogirji | hfst-tokenize -c tokeniser-disamb-gt-desc.pmhfst
"<viessogirji>"
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
"viessogirji" ??
viessu N Sem/Build Cmp/SgNom Cmp#girji N Sem/Txt Sg Nom
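To compare the runs above without reading the output by eye, a small filter can count known and unknown readings per cohort. This is only a sketch: `count_readings` is a hypothetical helper name, not part of HFST, and it assumes that unknown readings end in a `?` or `??` tag as in the output shown above, and that reading lines may or may not be tab-indented.

```shell
#!/bin/sh
# Hypothetical helper (not part of HFST): summarise CG-format output on stdin.
# For each cohort ("<wordform>" line) it counts readings whose last field is
# "?" or "??" as unknown, and every other reading as known.
count_readings() {
    awk '
        { sub(/^[ \t]+/, "") }                # drop indentation, if any
        /^"</ { form = $0; seen[form] = 1; next }   # cohort line
        /^"/ {                                # reading line
            if ($NF == "?" || $NF == "??") unknown[form]++
            else known[form]++
        }
        END {
            for (f in seen)
                printf "%s known=%d unknown=%d\n", f, known[f], unknown[f]
        }
    '
}

# Intended use (requires hfst-tokenize and the tokeniser FST from this report):
#   echo viessogirji | hfst-tokenize --giella-cg --weight-classes=1 \
#       tokeniser-disamb-gt-desc.pmhfst | count_readings
```

Running it once with and once without --weight-classes=1 should make the missing known analyses visible as a change in the `known=` count.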
HFST tools from Tino's nightly package of November 8, 2021, on macOS 11.6.2.
The tokeniser FST is too big to be included, but can be found here for a limited time.