8000 New SLU task by siddhu001 · Pull Request #4569 · espnet/espnet · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

New SLU task #4569

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 62 commits into from
Sep 6, 2022
Merged

New SLU task #4569

merged 62 commits into from
Sep 6, 2022

Conversation

siddhu001
Copy link
Collaborator

No description provided.

@siddhu001
Copy link
Collaborator Author

Thanks a lot @ftshijt for all your comments. I have incorporated all your suggestions and also added additional tests to meet Codecov requirements. I believe this PR is ready to be merged @ftshijt @sw005320 . Let me know if you have any other comments.

Copy link
Collaborator
@ftshijt ftshijt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to go~ Many thanks for all the implementations.

@sw005320
Copy link
Contributor

Can you give me a day?
I want to do a quick review.

Copy link
Contributor
@sw005320 sw005320 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some high-level comments.
If it is difficult to deal with it, we can leave it.

>>> speech2understand = Speech2Understand("slu_config.yml", "slu.pth")
>>> audio, rate = soundfile.read("speech.wav")
>>> speech2understand(audio)
[(text, token, token_int, hypothesis object), ...]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output looks complicated as an API.
Are all of them necessary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to make it exactly same as Speech2Text

[(text, token, token_int, hypothesis object), ...]
so that I can use the same inference API as ASR in Hugging face. Let me know if you still think it needs to be simplified.

from espnet.utils.cli_utils import get_commandline_args


class Speech2Understand:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this class simple if we inherit a class from Speech2Text in ASR?
This is just an idea.
If it complicates your class, we do not have to do it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good suggestion! I will take a stab at this and try to simplify espnet2/bin/slu_inference.py, espnet2/slu/espnet_model.py and espnet2/tasks/slu.py using the corresponding ASR files.

yield


class ESPnetSLUModel(AbsESPnetModel):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.
Can we inherit it from ASRESPnetModel?

)


class SLUTask(AbsTask):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditt.
Can we use ASRTask?

@siddhu001
Copy link
Collaborator Author

Hi @sw005320, I have used ASRTask in espnet2/tasks/slu.py and ASRESPnetModel in espnet2/slu/espnet_model.py, which have helped avoid some code replication. espnet2/bin/slu_inference.py was not much simplified using Speech2Text class, so I have kept it as it is. I believe PR is ready to be merged (as all CI checks have passed except 1 that was cancelled).

@sw005320
Copy link
Contributor
sw005320 commented Sep 2, 2022

Cool!

@ftshijt, can we go through the review again (I'll also do it)?
I think this is a very good example of making a new task.

Copy link
Collaborator
@ftshijt ftshijt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Just a minor comment to improve the modularization.

@@ -169,6 +170,20 @@ def __init__(
token_list=token_list,
unk_symbol=unk_symbol,
)
if transcript_token_list is not None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be task-specific. How about we just make it a new class?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I have made a new preprocessor class using the CommonPreprocessor class as the base class.

@sw005320
Copy link
Contributor
sw005320 commented Sep 5, 2022

The design looks very good to me.
Once we fix @ftshijt's comment, I'll merge this PR.

@@ -352,6 +331,88 @@ def __call__(
return data


class CommonPreprocessor_SLU(CommonPreprocessor):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can rename it to SLUPreprocessor?

@sw005320 sw005320 added New Features SLU Spoken language understanding auto-merge Enable auto-merge and removed New Features labels Sep 6, 2022
@sw005320
Copy link
Contributor
sw005320 commented Sep 6, 2022

I think we do not have to increase the coverage, but if you miss some essential functions for the test, please add them. If the current test is good enough, please let me know. Then, I'll merge this PR.

@siddhu001
Copy link
Collaborator Author

I think we do not have to increase the coverage, but if you miss some essential functions for the test, please add them. If the current test is good enough, please let me know. Then, I'll merge this PR.

Yeah I agree. I think the current test is good enough and PR is ready to be merged.

@sw005320 sw005320 merged commit f2f1b7c into espnet:master Sep 6, 2022
@sw005320
Copy link
Contributor
sw005320 commented Sep 6, 2022

I just merged the PR!
Thanks, @siddhu001!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge Enable auto-merge ESPnet2 New Features README SLU Spoken language understanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0