-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Support Multi-Blank Transducer in Espnet2 #4876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Still in progress. I'll report some results on Librispeech when this PR is fully validated. |
merge remote espnet into local for push
The current version is validated on Librispeech dataset. Below are the results.
Several hints for this method:
I am happy to collaborate so that this PR can pass the CI. @ftshijt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jctian98 ,
It would be better to move the rnnt_multi_blank
folder under asr/transducer
instead of asr_transducer
.
The second folder is reserved for the standalone version of ESPnet2-Transducer while the first is for the shared version with CTC/Att, which you're working with.
P.S: Unrelated but is the RTF you reported for vanilla Transducer with greedy search also? It seems really slow.
Congrats on making this work with ESPnet and thanks! |
I agree |
Can you apply black? |
All decoding is with greedy search. It is slow because I use
Moved
Didn't know it. Have applied now.
Thanks for your kind support! |
I'll review the PR in a few days, thanks again for this contribution @jctian98
OK great, thank you! You can look at Btw, do you have WER and RTF for the vanilla Transducer with default beam search, please? Edit: You duplicated the |
Here I update the experimental results like below:
I have checked I have also report the WER and RTF for vanilla transducer beam search. No LM used. I have removed CI suggests the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI suggests the black should be applied to two files. But it seems applying black to these files on my local machine doesn't change them. Sorry I don't know how to get through this. Please help. Thanks! :)
Below are the CI flagged issues in your PR. There are also 2 more in asr_inference.py
l.419
and l.686
:
else:
# Normal ASR
You can skip them in this PR I think. The files are correctly formated in master, you should update your branch.
Maybe some bottleneck but I'll take a look. Did you use Edit: Wait,
Thank you for the report! Well, that's a rather high RTF! |
Co-authored-by: Florian Boyer <41155456+b-flo@users.noreply.github.com>
Co-authored-by: Florian Boyer <41155456+b-flo@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. It can be merged after CI fixes (most of them come from loss computation files).
I agree.
This is a nice idea. |
@jctian98, I think the current CI issue comes from |
Hi, sorry for my late reply. (1) On my local machine, the command (2) In addition, when running
But the file mentioned above is not modified in this PR. (3) In the CI check log, there are some comments suggesting I'm a bit confused about the current status. Please let me know what should I do next. |
Can you check the versions of the test tools (used in your local and used in CI)? |
On my local machine, I'm with Sorry, I don't understand what the versions used in C are. |
used in CI |
Hi, with following this, I have also changed to |
@ftshijt, do you know what’s happening? |
for more information, see https://pre-commit.ci
Codecov Report
@@ Coverage Diff @@
## master #4876 +/- ##
==========================================
- Coverage 77.03% 75.86% -1.17%
==========================================
Files 606 615 +9
Lines 53755 54684 +929
==========================================
+ Hits 41409 41487 +78
- Misses 12346 13197 +851
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
@jctian98, we need to add
|
Will work on it soon! |
This is a follow-up.
Unit Test: see I suppose the test code may need an additional review. The model has been uploaded to HF. |
Thanks! @hainan-xv, in the future, it would be great if you also prepare the CPU version for debugging and CI purposes. |
This PR attempts to support multi-blank transducer in Espnet 2
Paper related: https://arxiv.org/abs/2211.03541
Acknowledgment: The original implementation is from NeMo (https://github.com/NVIDIA/NeMo) and by @hainan-xv. The original PR in NeMo is NVIDIA/NeMo#5527