0.1.8 release #79
Merged
Signed-off-by: Evelina <ebakhturina@nvidia.com>
jimregan pushed a commit to jimregan/NeMo-text-processing that referenced this pull request on Jun 15, 2023
Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Jim O'Regan <joregan@kth.se>
BuyuanCui pushed a commit to BuyuanCui/NeMo-text-processing that referenced this pull request on Jul 6, 2023
Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com>
mgrafu pushed a commit that referenced this pull request on Jul 18, 2023
Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
BuyuanCui pushed a commit that referenced this pull request on Dec 12, 2023
Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui pushed a commit that referenced this pull request on Feb 16, 2024
Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
ekmb added a commit that referenced this pull request on Apr 30, 2024
* Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . 
Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in 
sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari 
Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) 
Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting 
Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases for SH tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added some sentences Signed-off-by: Alex Cui <alexcui1994@gmail.com> * test cases update Signed-off-by: Alex Cui <alexcui1994@gmail.com> * solving rebase issue, repushing changes Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving conflict Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixings according to ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixings according to the ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * notused removing Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * formt issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remiving unsed files; Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added sentences as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added senetnces as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed commentyed out tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating dates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * attemps to fix bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * inprocess of fixing the bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixing existing issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated graph_utils, tokenize and classify, and word graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added bacl the ppostprocessor far creation Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated NEMO_NOT_ALPHA as a new variable Signed-off-by: Alex Cui <alexcui1994@gmail.com> * far files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * combiedn into measure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing and combined to meaasure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * updates to solve the space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh test issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding anands updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * data updated for measure and whitelist Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing fraction and math part Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing preprocessor, updating measure, adding shitelist cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing processor, modification for sp test, shitelist and word Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating zh date Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * realized itn being cvommented out, adding back Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * trying to run zh tn separately because it takes long time to run Signed-off-by: Alex Cui <alexcui1994@gmail.com> * modification to ru zh tn separately Signed-off-by: Alex Cui <alexcui1994@gmail.com> * independent zh tnitn tests for more time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding lines to save far file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates for reducing testing time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 
for ounct graph Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing used graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format and removing used comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing this one, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Delete tools/text_processing_deployment/zh directory Removing far files. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * updates according to the github comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * punct grammar Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_cases_cardinal.txt Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Dockerfile Copied from main branch ( which included Anand's updates) Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update launch.sh Found differences in the file. Fixing it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Saw word ITN being commented out. Adding it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update money.py Found cardinal grammar not accepting suffix.
Fixed it. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkinsfile Removed duplicated zh test from line 230s Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update utils.py Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update graph_utils.py Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Removing unused import. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update post_processing.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py Removing unused import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cardinal.py Deleting unused graph Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing import pynini Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py removing pynini import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update verbalize.py removing pynutil import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_sparrowhawk_normalization.sh Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_ordinal.py Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py Removing because it's not one of the semiotic classes. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <690…
BuyuanCui
added a commit
that referenced
this pull request
Jul 12, 2024
* Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . 
Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in 
sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari 
Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) 
Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting 
Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases for SH tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added some sentences Signed-off-by: Alex Cui <alexcui1994@gmail.com> * test cases update Signed-off-by: Alex Cui <alexcui1994@gmail.com> * solving rebase issue, repushing changes Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving conflict Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixings according to ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixings according to the ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * notused removing Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * formt issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remiving unsed files; Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added sentences as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added senetnces as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed commentyed out tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating dates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * attemps to fix bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * inprocess of fixing the bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixing existing issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated graph_utils, tokenize and classify, and word graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added bacl the ppostprocessor far creation Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated NEMO_NOT_ALPHA as a new variable Signed-off-by: Alex Cui <alexcui1994@gmail.com> * far files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * combiedn into measure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing and combined to meaasure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * updates to solve the space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh test issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding anands updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * data updated for measure and whitelist Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing fraction and math part Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing preprocessor, updating measure, adding shitelist cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing processor, modification for sp test, shitelist and word Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating zh date Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * realized itn being cvommented out, adding back Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * trying to run zh tn separately because it takes long time to run Signed-off-by: Alex Cui <alexcui1994@gmail.com> * modification to ru zh tn separately Signed-off-by: Alex Cui <alexcui1994@gmail.com> * independent zh tnitn tests for more time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding lines to save far file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates for reducing testing time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 
for ounct graph Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing used graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format and removing used comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing this one, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused commentss Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Delete tools/text_processing_deployment/zh directory Removing far files. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * updates according to the github comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * punct grammar Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_cases_cardinal.txt Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Dockerfile Copied from main branch ( which included Anand's updates) Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update launch.sh Found differences in the file. Fixing it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Saw word ITN being commented out. Adding it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update money.py Found cardinal grammar not accepting suffix. 
Fixed it. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkinsfile Removed duplicated zh test from line 230s Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update utils.py Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update graph_utils.py Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Removing unused import. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update post_processing.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py Removing unused import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cardinal.py Deleting unused graph Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing import pynini Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py removing pynini import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update verbalize.py removing pynutil import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_sparrowhawk_normalization.sh Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_ordinal.py Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py Removing because it's not one of the semiotic classes. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Jim O’Regan <joregan@kth.se> Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com> Co-authored-by: Vitaly Lavrukhin <vitaly.lavrukhin@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Enas Albasiri <71229149+ealbasiri@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com> 
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Jim O’Regan <jaoregan@tcd.ie> Co-authored-by: Giacomo Leone Maria Cavallini <72698188+GiacomoLeoneMaria@users.noreply.github.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: Nikolay Karpov <karpnv@gmail.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Peter Plantinga <plantinga.peter@proton.me> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui
added a commit
that referenced
this pull request
Jul 25, 2024
* Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . 
Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in 
sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari 
Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) 
Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting 
Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases for SH tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added some sentences Signed-off-by: Alex Cui <alexcui1994@gmail.com> * test cases update Signed-off-by: Alex Cui <alexcui1994@gmail.com> * solving rebase issue, repushing changes Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving conflict Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixings according to ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixings according to the ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * notused removing Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * formt issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remiving unsed files; Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added sentences as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added senetnces as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed commentyed out tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating dates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * attemps to fix bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * inprocess of fixing the bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixing existing issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated graph_utils, tokenize and classify, and word graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added bacl the ppostprocessor far creation Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated NEMO_NOT_ALPHA as a new variable Signed-off-by: Alex Cui <alexcui1994@gmail.com> * far files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * combiedn into measure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing and combined to meaasure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * updates to solve the space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh test issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding anands updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * data updated for measure and whitelist Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing fraction and math part Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing preprocessor, updating measure, adding shitelist cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing processor, modification for sp test, shitelist and word Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating zh date Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * realized itn being cvommented out, adding back Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * trying to run zh tn separately because it takes long time to run Signed-off-by: Alex Cui <alexcui1994@gmail.com> * modification to ru zh tn separately Signed-off-by: Alex Cui <alexcui1994@gmail.com> * independent zh tnitn tests for more time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding lines to save far file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates for reducing testing time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 
for ounct graph Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing used graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format and removing used comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing this one, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused commentss Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Delete tools/text_processing_deployment/zh directory Removing far files. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * updates according to the github comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * punct grammar Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_cases_cardinal.txt Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Dockerfile Copied from main branch ( which included Anand's updates) Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update launch.sh Found differences in the file. Fixing it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Saw word ITN being commented out. Adding it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update money.py Found cardinal grammar not accepting suffix.
Fixed it. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkinsfile Removed duplicated zh test from line 230s Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update utils.py Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update graph_utils.py Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Removing unused import. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update post_processing.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py Removing unused import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cardinal.py Deleting unused graph Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing import pynini Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py removing pynini import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update verbalize.py removing pynutil import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_sparrowhawk_normalization.sh Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_ordinal.py Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py Removing because it's not one of the semiotic classes. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Jim O’Regan <joregan@kth.se> Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com> Co-authored-by: Vitaly Lavrukhin <vitaly.lavrukhin@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Enas Albasiri <71229149+ealbasiri@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com> 
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Jim O’Regan <jaoregan@tcd.ie> Co-authored-by: Giacomo Leone Maria Cavallini <72698188+GiacomoLeoneMaria@users.noreply.github.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: Nikolay Karpov <karpnv@gmail.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Peter Plantinga <plantinga.peter@proton.me> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
tbartley94
added a commit
that referenced
this pull request
Aug 16, 2024
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess 
(#147) * Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> 
Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts 
Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN 
Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes 741A Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: Gia…
BuyuanCui added a commit that referenced this pull request Aug 20, 2024
* Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . 
Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in 
sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari 
Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) 
Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting 
Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases for SH tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added some sentences Signed-off-by: Alex Cui <alexcui1994@gmail.com> * test cases update Signed-off-by: Alex Cui <alexcui1994@gmail.com> * solving rebase issue, repushing changes Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving conflict Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixings according to ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixings according to the ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * notused removing Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * formt issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remiving unsed files; Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added sentences as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added senetnces as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed commentyed out tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating dates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * attemps to fix bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * inprocess of fixing the bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixing existing issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated graph_utils, tokenize and classify, and word graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added bacl the ppostprocessor far creation Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated NEMO_NOT_ALPHA as a new variable Signed-off-by: Alex Cui <alexcui1994@gmail.com> * far files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * combiedn into measure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing and combined to meaasure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * updates to solve the space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh test issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding anands updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * data updated for measure and whitelist Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing fraction and math part Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing preprocessor, updating measure, adding shitelist cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing processor, modification for sp test, shitelist and word Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating zh date Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * realized itn being cvommented out, adding back Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * trying to run zh tn separately because it takes long time to run Signed-off-by: Alex Cui <alexcui1994@gmail.com> * modification to ru zh tn separately Signed-off-by: Alex Cui <alexcui1994@gmail.com> * independent zh tnitn tests for more time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding lines to save far file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates for reducing testing time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 
for ounct graph Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing used graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format and removing used comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing this one, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused commentss Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Delete tools/text_processing_deployment/zh directory Removing far files. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * updates according to the github comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * punct grammar Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_cases_cardinal.txt Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Dockerfile Copied from main branch ( which included Anand's updates) Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update launch.sh Found differences in the file. Fixing it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Saw word ITN being commented out. Adding it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update money.py Found cardinal grammar not accepting suffix. 
Fixed it. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkinsfile Removed duplicated zh test from line 230s Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update utils.py Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update graph_utils.py Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Removing unused import. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update post_processing.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py Removing unused import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cardinal.py Deleting unused graph Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing import pynini Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py removing pynini import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update verbalize.py removing pynutil import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_sparrowhawk_normalization.sh Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_ordinal.py Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py Removing because it's not one of the semiotic classes. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Jim O’Regan <joregan@kth.se> Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com> Co-authored-by: Vitaly Lavrukhin <vitaly.lavrukhin@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Enas Albasiri <71229149+ealbasiri@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com> 
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Jim O’Regan <jaoregan@tcd.ie> Co-authored-by: Giacomo Leone Maria Cavallini <72698188+GiacomoLeoneMaria@users.noreply.github.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: Nikolay Karpov <karpnv@gmail.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Peter Plantinga <plantinga.peter@proton.me> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui added a commit that referenced this pull request on Aug 20, 2024
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> 
* problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated …
BuyuanCui
pushed a commit
that referenced
this pull request
Sep 19, 2024
Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui
added a commit
that referenced
this pull request
Sep 19, 2024
* Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . 
Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in 
sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari 
Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) 
Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting 
Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases for SH tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added some sentences Signed-off-by: Alex Cui <alexcui1994@gmail.com> * test cases update Signed-off-by: Alex Cui <alexcui1994@gmail.com> * solving rebase issue, repushing changes Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving conflict Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixings according to ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixings according to the ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * notused removing Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * formt issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remiving unsed files; Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added sentences as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added senetnces as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed commentyed out tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating dates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * attemps to fix bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * inprocess of fixing the bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixing existing issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated graph_utils, tokenize and classify, and word graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added bacl the ppostprocessor far creation Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated NEMO_NOT_ALPHA as a new variable Signed-off-by: Alex Cui <alexcui1994@gmail.com> * far files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * combiedn into measure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing and combined to meaasure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * updates to solve the space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh test issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding anands updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * data updated for measure and whitelist Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing fraction and math part Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing preprocessor, updating measure, adding shitelist cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing processor, modification for sp test, shitelist and word Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating zh date Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * realized itn being cvommented out, adding back Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * trying to run zh tn separately because it takes long time to run Signed-off-by: Alex Cui <alexcui1994@gmail.com> * modification to ru zh tn separately Signed-off-by: Alex Cui <alexcui1994@gmail.com> * independent zh tnitn tests for more time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding lines to save far file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates for reducing testing time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 
for ounct graph Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing used graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format and removing used comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing this one, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused commentss Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Delete tools/text_processing_deployment/zh directory Removing far files. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * updates according to the github comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * punct grammar Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_cases_cardinal.txt Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Dockerfile Copied from main branch ( which included Anand's updates) Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update launch.sh Found differences in the file. Fixing it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Saw word ITN being commented out. Adding it back. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update money.py Found cardinal grammar not accepting suffix. Fixed it. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkinsfile Removed duplicated zh test from line 230s Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update utils.py Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update graph_utils.py Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Removing unused import. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update post_processing.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py Removing unused import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cardinal.py Deleting unused graph Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing import pynini Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py removing pynini import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update verbalize.py removing pynutil import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_sparrowhawk_normalization.sh Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_ordinal.py Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py Removing because it's not one of the semiotic classes. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Jim O’Regan <joregan@kth.se> Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com> Co-authored-by: Vitaly Lavrukhin <vitaly.lavrukhin@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Enas Albasiri <71229149+ealbasiri@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com> 
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Jim O’Regan <jaoregan@tcd.ie> Co-authored-by: Giacomo Leone Maria Cavallini <72698188+GiacomoLeoneMaria@users.noreply.github.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: Nikolay Karpov <karpnv@gmail.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Peter Plantinga <plantinga.peter@proton.me> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui
added a commit
that referenced
this pull request
Sep 19, 2024
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated …
BuyuanCui
pushed a commit
that referenced
this pull request
Sep 26, 2024
Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui
added a commit
that referenced
this pull request
Sep 26, 2024
* Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . 
Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable
NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea 
Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) 
Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting 
Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases for SH tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added some sentences Signed-off-by: Alex Cui <alexcui1994@gmail.com> * test cases update Signed-off-by: Alex Cui <alexcui1994@gmail.com> * solving rebase issue, repushing changes Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving conflict Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixings according to ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixings according to the ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * notused removing Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * formt issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remiving unsed files; Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added sentences as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added senetnces as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed commentyed out tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating dates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * attemps to fix bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * inprocess of fixing the bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixing existing issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated graph_utils, tokenize and classify, and word graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added bacl the ppostprocessor far creation Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated NEMO_NOT_ALPHA as a new variable Signed-off-by: Alex Cui <alexcui1994@gmail.com> * far files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * combiedn into measure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing and combined to meaasure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * updates to solve the space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh test issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding anands updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * data updated for measure and whitelist Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing fraction and math part Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing preprocessor, updating measure, adding shitelist cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing processor, modification for sp test, shitelist and word Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating zh date Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * realized itn being cvommented out, adding back Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * trying to run zh tn separately because it takes long time to run Signed-off-by: Alex Cui <alexcui1994@gmail.com> * modification to ru zh tn separately Signed-off-by: Alex Cui <alexcui1994@gmail.com> * independent zh tnitn tests for more time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding lines to save far file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates for reducing testing time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 
for ounct graph Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing used graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format and removing used comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing this one, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused commentss Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Delete tools/text_processing_deployment/zh directory Removing far files. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * updates according to the github comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * punct grammar Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_cases_cardinal.txt Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Dockerfile Copied from main branch ( which included Anand's updates) Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update launch.sh Found differences in the file. Fixing it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Saw word ITN being commented out. Adding it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update money.py Found cardinal grammar not accepting suffix. 
Fixed it. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkinsfile Removed duplicated zh test from line 230s Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update utils.py Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update graph_utils.py Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Removing unused import. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update post_processing.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py Removing unused import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cardinal.py Deleting unused graph Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing import pynini Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py removing pynini import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update verbalize.py removing pynutil import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_sparrowhawk_normalization.sh Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_ordinal.py Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py Removing because it's not one of the semiotic classes. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Jim O’Regan <joregan@kth.se> Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com> Co-authored-by: Vitaly Lavrukhin <vitaly.lavrukhin@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Enas Albasiri <71229149+ealbasiri@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com> 
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Jim O’Regan <jaoregan@tcd.ie> Co-authored-by: Giacomo Leone Maria Cavallini <72698188+GiacomoLeoneMaria@users.noreply.github.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: Nikolay Karpov <karpnv@gmail.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Peter Plantinga <plantinga.peter@proton.me> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui
added a commit
that referenced
this pull request
Sep 26, 2024
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated …
BuyuanCui
added a commit
that referenced
this pull request
Sep 26, 2024
* Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . 
Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in 
sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari 
Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) 
Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting 
Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases for SH tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added some sentences Signed-off-by: Alex Cui <alexcui1994@gmail.com> * test cases update Signed-off-by: Alex Cui <alexcui1994@gmail.com> * solving rebase issue, repushing changes Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving conflict Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixings according to ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixings according to the ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * notused removing Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * formt issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remiving unsed files; Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added sentences as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added senetnces as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed commentyed out tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating dates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * attemps to fix bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * inprocess of fixing the bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixing existing issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated graph_utils, tokenize and classify, and word graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added bacl the ppostprocessor far creation Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated NEMO_NOT_ALPHA as a new variable Signed-off-by: Alex Cui <alexcui1994@gmail.com> * far files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * combiedn into measure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing and combined to meaasure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * updates to solve the space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh test issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding anands updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * data updated for measure and whitelist Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing fraction and math part Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing preprocessor, updating measure, adding shitelist cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing processor, modification for sp test, shitelist and word Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating zh date Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * realized itn being cvommented out, adding back Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * trying to run zh tn separately because it takes long time to run Signed-off-by: Alex Cui <alexcui1994@gmail.com> * modification to ru zh tn separately Signed-off-by: Alex Cui <alexcui1994@gmail.com> * independent zh tnitn tests for more time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding lines to save far file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates for reducing testing time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 
for ounct graph Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing used graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format and removing used comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing this one, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused commentss Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Delete tools/text_processing_deployment/zh directory Removing far files. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * updates according to the github comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * punct grammar Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_cases_cardinal.txt Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Dockerfile Copied from main branch ( which included Anand's updates) Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update launch.sh Found differences in the file. Fixing it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Saw word ITN being commented out. Adding it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update money.py Found cardinal grammar not accepting suffix. 
Fixed it. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkinsfile Removed duplicated zh test from line 230s Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update utils.py Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update graph_utils.py Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Removing unused import. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update post_processing.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py Removing unused import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cardinal.py Deleting unused graph Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing import pynini Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py removing pynini import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update verbalize.py removing pynutil import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_sparrowhawk_normalization.sh Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_ordinal.py Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py Removing because it's not one of the semiotic classes. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Jim O’Regan <joregan@kth.se> Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com> Co-authored-by: Vitaly Lavrukhin <vitaly.lavrukhin@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Enas Albasiri <71229149+ealbasiri@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com> 
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Jim O’Regan <jaoregan@tcd.ie> Co-authored-by: Giacomo Leone Maria Cavallini <72698188+GiacomoLeoneMaria@users.noreply.github.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: Nikolay Karpov <karpnv@gmail.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Peter Plantinga <plantinga.peter@proton.me> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui added a commit that referenced this pull request on Sep 26, 2024
* Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . 
Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in 
sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari 
Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) 
Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting 
Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases for SH tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added some sentences Signed-off-by: Alex Cui <alexcui1994@gmail.com> * test cases update Signed-off-by: Alex Cui <alexcui1994@gmail.com> * solving rebase issue, repushing changes Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving conflict Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixings according to ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixings according to the ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * notused removing Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * formt issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remiving unsed files; Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added sentences as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added senetnces as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed commentyed out tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating dates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * attemps to fix bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * inprocess of fixing the bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixing existing issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated graph_utils, tokenize and classify, and word graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added bacl the ppostprocessor far creation Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated NEMO_NOT_ALPHA as a new variable Signed-off-by: Alex Cui <alexcui1994@gmail.com> * far files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * combiedn into measure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing and combined to meaasure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * updates to solve the space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh test issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding anands updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * data updated for measure and whitelist Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing fraction and math part Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing preprocessor, updating measure, adding shitelist cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing processor, modification for sp test, shitelist and word Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating zh date Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * realized itn being cvommented out, adding back Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * trying to run zh tn separately because it takes long time to run Signed-off-by: Alex Cui <alexcui1994@gmail.com> * modification to ru zh tn separately Signed-off-by: Alex Cui <alexcui1994@gmail.com> * independent zh tnitn tests for more time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding lines to save far file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates for reducing testing time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see 
https://pre-commit.ci * for ounct graph Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing used graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format and removing used comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing this one, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused commentss Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Delete tools/text_processing_deployment/zh directory Removing far files. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * updates according to the github comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * punct grammar Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_cases_cardinal.txt Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Dockerfile Copied from main branch ( which included Anand's updates) Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update launch.sh Found differences in the file. Fixing it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Saw word ITN being commented out. Adding it back. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update money.py Found cardinal grammar not accepting suffix. Fixed it. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkinsfile Removed duplicated zh test from line 230s Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update utils.py Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update graph_utils.py Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Removing unused import. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update post_processing.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py Removing unused import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cardinal.py Deleting unused graph Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing import pynini Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py removing pynini import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update verbalize.py removing pynutil import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_sparrowhawk_normalization.sh Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_ordinal.py Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py Removing because it's not one of the semiotic classes. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Jim O’Regan <joregan@kth.se> Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com> Co-authored-by: Vitaly Lavrukhin <vitaly.lavrukhin@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Enas Albasiri <71229149+ealbasiri@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com> 
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Jim O’Regan <jaoregan@tcd.ie> Co-authored-by: Giacomo Leone Maria Cavallini <72698188+GiacomoLeoneMaria@users.noreply.github.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: Nikolay Karpov <karpnv@gmail.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Peter Plantinga <plantinga.peter@proton.me> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui
added a commit
that referenced
this pull request
Sep 26, 2024
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception 10000 Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 
Unpin setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com 
hooks for more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated …
BuyuanCui added a commit that referenced this pull request Oct 16, 2024
* Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . 
Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in 
sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari 
Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) 
Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting 
Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases for SH tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added some sentences Signed-off-by: Alex Cui <alexcui1994@gmail.com> * test cases update Signed-off-by: Alex Cui <alexcui1994@gmail.com> * solving rebase issue, repushing changes Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving conflict Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixings according to ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixings according to the ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * notused removing Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * formt issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remiving unsed files; Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added sentences as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added senetnces as test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed commentyed out tests Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating dates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * attemps to fix bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * inprocess of fixing the bug Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixing existing issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated graph_utils, tokenize and classify, and word graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * added bacl the ppostprocessor far creation Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated NEMO_NOT_ALPHA as a new variable Signed-off-by: Alex Cui <alexcui1994@gmail.com> * far files Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * combiedn into measure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing and combined to meaasure Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates to fix space issue Signed-off-by: Alex Cui 
<alexcui1994@gmail.com> * updates to solve the space issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * resolving sh test issue Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding anands updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * data updated for measure and whitelist Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing fraction and math part Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing preprocessor, updating measure, adding shitelist cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing processor, modification for sp test, shitelist and word Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating zh date Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * realized itn being cvommented out, adding back Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * trying to run zh tn separately because it takes long time to run Signed-off-by: Alex Cui <alexcui1994@gmail.com> * modification to ru zh tn separately Signed-off-by: Alex Cui <alexcui1994@gmail.com> * independent zh tnitn tests for more time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adding lines to save far file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates for reducing testing time Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * 
for ounct graph Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing used graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com> * format and removing used comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing this one, not used Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused commentss Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Delete tools/text_processing_deployment/zh directory Removing far files. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * updates according to the github comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removing comments Signed-off-by: Alex Cui <alexcui1994@gmail.com> * punct grammar Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_cases_cardinal.txt Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Dockerfile Copied from main branch ( which included Anand's updates) Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update launch.sh Found differences in the file. Fixing it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Saw word ITN being commented out. Adding it back. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update money.py Found cardinal grammar not accepting suffix.
Fixed it. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update Jenkinsfile Removed duplicated zh test from line 230s Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update utils.py Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update graph_utils.py Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Fixing code style, removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update measure.py Removing unused import. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update post_processing.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py Removing unused import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing unused imports Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cardinal.py Deleting unused graph Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py Removing import pynini Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update word.py removing pynini import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update verbalize.py removing pynutil import Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update post_processing.py removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_sparrowhawk_normalization.sh Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_ordinal.py Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py Removing because it's not one of the semiotic classes. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Jim O’Regan <joregan@kth.se> Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com> Co-authored-by: Vitaly Lavrukhin <vitaly.lavrukhin@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Enas Albasiri <71229149+ealbasiri@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: lleaver <137942999+lleaver@users.noreply.github.com> 
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Jim O’Regan <jaoregan@tcd.ie> Co-authored-by: Giacomo Leone Maria Cavallini <72698188+GiacomoLeoneMaria@users.noreply.github.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: Nikolay Karpov <karpnv@gmail.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Peter Plantinga <plantinga.peter@proton.me> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
BuyuanCui added a commit that referenced this pull request Oct 16, 2024
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess 
(#147) * Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> 
Signed-off-by: Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts 
Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN 
Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
  * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
  * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
  * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
  * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
  ---------
  Signed-off-by: BuyuanCui <alexcui1994@gmail.com>
  Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* Es tn romans fix (#98)
  * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  ---------
  Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* Print warning instead exception (#97)
  * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  * text arg Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
  * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
  * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
  * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
  * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com>
  ---------
  Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
  Signed-off-by: Evelina <ebakhturina@nvidia.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com>
  Co-authored-by: Evelina <ebakhturina@nvidia.com>
  Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* warning regardless of verbose flag (#107)
  * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  ---------
  Signed-off-by: Nikolay Karpov <karpnv@gmail.com>
  Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* Unpin setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* fix bug #111 (ar currencies) (#117)
  * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  ---------
  Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* Logging clean up + IT TN fix (#118)
  * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com>
  * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com>
  * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com>
  * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com>
  * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com>
  * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com>
  ---------
  Signed-off-by: Evelina <ebakhturina@nvidia.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* Time_IT_TN (#105)
  * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
  * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
  * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
  ---------
  Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* IT TN improvement on tests (#120)
  * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  ---------
  Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* add single letter exception for roman numerals (#121)
  * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  ---------
  Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
  Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* to fix tn bug@ xuesong Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* updates adapting to graphs Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* updated …
What does this PR do?
Add a one line overview of what this PR aims to accomplish.

Before your PR is "Ready for review"
Pre checks:
* Have you signed your commits? Use git commit -s to sign.
* Do all unit tests finish successfully? Run pytest or (if your machine does not have a GPU) pytest --cpu from the root folder (given you marked your test cases accordingly with @pytest.mark.run_only_on('CPU')), and run the Sparrowhawk tests with bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
* If you are adding a new feature: have you added test cases for both pytest and Sparrowhawk here.
* Have you added __init__.py for every folder and subfolder, including the data folder which has .TSV files?
* Have you added the correct license header, Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved., to all newly added Python files?
* Where a file also carries Google's notice, the header should include the line Copyright 2015 and onwards Google, Inc. See an example here.
* Remove import guards (try import: ... except: ...) if not already done.

PR Type:

If you haven't finished some of the above items you can still open a "Draft" PR.