I assume that the output of hfst-tokenize --xerox should be exactly the same as the output of hfst-lookup (minus weights?), but it is not:
$ echo "Саша бларгвит." | hfst-tokenize --xerox tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst
Саша Саша N Prop Sem/Ant Msc Anim Sg Nom
Саша Саша N Prop Sem/Ant Fem Inan Sg Nom
Саша
бларгвит
. . CLB

$ echo "Саша бларгвит." | hfst-tokenize tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst | hfst-lookup -q src/analyser-gt-desc.hfstol
Саша Саша+N+Prop+Sem/Ant+Fem+Inan+Sg+Nom 0.000000
Саша Саша+N+Prop+Sem/Ant+Msc+Anim+Sg+Nom 0.000000
бларгвит бларгвит+? inf
. .+CLB 0.000000
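In the hfst-tokenize --xerox output, the readings of a recognized token (Саша) are followed by an extra line that contains only the token, without any reading. An unrecognized token (бларгвит) gets no reading at all, whereas hfst-lookup prints бларгвит+? for it.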
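To make the comparison concrete, here is a rough Python sketch that normalizes both outputs into a mapping from token to its set of readings. It is only an illustration, not part of HFST: the function names are made up, fields are assumed to be whitespace-separated as they appear above, and replacing `+` with a space would break for lemmas or tags that contain a literal `+`.

```python
def readings_from_lookup(lines):
    """Map each token to its set of readings, from hfst-lookup style lines.

    Assumes 'surface analysis weight' fields; the weight is ignored.
    """
    result = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        surface, analysis = fields[0], fields[1]
        readings = result.setdefault(surface, set())
        if analysis.endswith("+?"):
            continue  # unknown-word marker: keep the token, add no reading
        readings.add(analysis.replace("+", " "))
    return result


def readings_from_xerox(lines):
    """Map each token to its set of readings, from --xerox style lines.

    Assumes 'surface lemma tag tag ...'; a line holding only the
    surface form contributes no reading.
    """
    result = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # blank line
        readings = result.setdefault(fields[0], set())
        if len(fields) > 1:
            readings.add(" ".join(fields[1:]))
    return result


# Sample data copied from the two outputs above.
xerox_out = [
    "Саша Саша N Prop Sem/Ant Msc Anim Sg Nom",
    "Саша Саша N Prop Sem/Ant Fem Inan Sg Nom",
    "Саша",
    "бларгвит",
    ". . CLB",
]
lookup_out = [
    "Саша Саша+N+Prop+Sem/Ant+Fem+Inan+Sg+Nom 0.000000",
    "Саша Саша+N+Prop+Sem/Ant+Msc+Anim+Sg+Nom 0.000000",
    "бларгвит бларгвит+? inf",
    ". .+CLB 0.000000",
]

same_readings = readings_from_xerox(xerox_out) == readings_from_lookup(lookup_out)
```

With weights dropped and the bare-token lines treated as "no reading", the two outputs carry the same readings for this sentence, so the mismatch is limited to the extra bare Саша line and the missing unknown-word marker for бларгвит.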