8000 Add dns_icassp22 Speech Enhancement Recipe by slSeanWU · Pull Request #4657 · espnet/espnet · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add dns_icassp22 Speech Enhancement Recipe #4657

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Oct 4, 2022
Merged

Conversation

slSeanWU
Copy link
Contributor

Hello,

Two things to note:
(1) I checked how DNS challenges evaluate submissions. They use subjective MOS on recorded test samples that don't come with a "clean reference", so I think it'll be hard for us to compare with those submitted implementations.

(2) The original script from DNS repo to synthesize training examples in single-threaded, hence very slow. I refactored it using multiprocessing to enable parallelization.

Thanks in advance for reviews & comments,
Shih-Lun

@Emrys365
Copy link
Collaborator
Emrys365 commented Sep 22, 2022

Could you format the Python scripts with black, flake8, and isort to resolve the CI test errors?

python -m pip install flake8-docstrings flake8 black isort

black espnet espnet2 test utils setup.py egs*/*/*/local egs2/TEMPLATE/*/pyscripts tools/*.py ci/*.py

flake8 --show-source --extend-ignore=D test utils doc espnet2 test/espnet2 egs/*/*/local/*.py

isort espnet espnet2 test utils setup.py egs*/*/*/local egs2/TEMPLATE/*/pyscripts tools/*.py ci/*.py

@slSeanWU
Copy link
Contributor Author

Could you format the Python scripts with black, flake8, and isort to resolve the CI test errors?

python -m pip install flake8-docstrings flake8 black isort

black espnet espnet2 test utils setup.py egs*/*/*/local egs2/TEMPLATE/*/pyscripts tools/*.py ci/*.py

flake8 --show-source --extend-ignore=D test utils doc espnet2 test/espnet2 egs/*/*/local/*.py

isort espnet espnet2 test utils setup.py egs*/*/*/local egs2/TEMPLATE/*/pyscripts tools/*.py ci/*.py

Just want to make sure. So I need to run these from the root directory, through all pieces of code in the package?

@Emrys365
Copy link
Collaborator

Just want to make sure. So I need to run these from the root directory, through all pieces of code in the package?

To be precise, you need to go to the root directory of espnet and run these commands.

@slSeanWU
Copy link
Contributor Author

@Emrys365 Hi, I've run isort and black to fix the formatting of my python scripts. flake8 throws lots of warning throughout the entire package, but I guess most of them are minor (lines too long, unused imports/variables, etc.). The CI checks are still not passing, though...

@codecov
Copy link
codecov bot commented Sep 28, 2022

Codecov Report

Merging #4657 (d63a9e1) into master (4ae6822) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4657      +/-   ##
==========================================
- Coverage   83.09%   83.09%   -0.01%     
==========================================
  Files         518      518              
  Lines       44777    44700      -77     
==========================================
- Hits        37207    37143      -64     
+ Misses       7570     7557      -13     
Flag Coverage Δ
test_integration_espnet1 66.37% <ø> (+<0.01%) ⬆️
test_integration_espnet2 49.36% <ø> (+0.03%) ⬆️
test_python 70.91% <ø> (-0.03%) ⬇️
test_utils 23.30% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
espnet/nets/pytorch_backend/ctc.py 54.09% <0.00%> (-10.25%) ⬇️
espnet2/iterators/sequence_iter_factory.py 93.58% <0.00%> (-3.85%) ⬇️
espnet2/asr/ctc.py 95.65% <0.00%> (-1.07%) ⬇️
espnet/nets/pytorch_backend/e2e_asr_transformer.py 95.72% <0.00%> (-0.36%) ⬇️
espnet/nets/chainer_backend/ctc.py 97.22% <0.00%> (+0.34%) ⬆️
espnet/nets/chainer_backend/e2e_asr_transformer.py 69.59% <0.00%> (+0.69%) ⬆️
espnet/nets/chainer_backend/e2e_asr.py 95.87% <0.00%> (+1.03%) ⬆️
espnet/distributed/pytorch_backend/launch.py 83.90% <0.00%> (+1.14%) ⬆️
.../nets/pytorch_backend/transformer/encoder_layer.py 98.07% <0.00%> (+3.84%) ⬆️
espnet/nets/chainer_backend/transformer/ctc.py 33.33% <0.00%> (+5.55%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

8000
Copy link
Collaborator
@simpleoier simpleoier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw several files are from the official DNS repo. Maybe you can add git clone in the local/data.sh and then use the cloned files instead of copying them.

@@ -17,6 +17,7 @@ DIRHA_WSJ_PROCESSED="${PWD}/data/local/dirha_wsj_processed" # Output file path
DNS=
DNS2=
DNS3=
DNS4="${PWD}/data_dns4_raw"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to use DNS4 or downloads here to follow the convention instead of data_dns4_raw

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok! will change to downloads

- pytorch version: `pytorch 1.12.1+cu102`
- Git hash: `13db69d3befc3c82a5ff5a11e28bf79d5030603f`
- Commit date: `Mon Aug 29 13:44:35 2022 +0000`

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about uploading the model and add it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I'm waiting for the model to train on 48kHz, full dataset (which might take several days more)

@slSeanWU
Copy link
Contributor Author

I saw several files are from the official DNS repo. Maybe you can add git clone in the local/data.sh and then use the cloned files instead of copying them.

I guess you're talking about audiolib.py and utils.py, right? We can definitely use wget to download them in the data downloading script. (I don't know if we can just fetch a few files with git clone.)

btw, in this case, the formatting won't conform to isort, black, and flake8 rules. Would this be alright?

@simpleoier
Copy link
Collaborator

@slSeanWU If the file is not in this repo, formatting would not be a problem.


# get noisy wav synthesizer config file
cd ${PATH_PREFIX}
wget https://raw.githubusercontent.com/microsoft/DNS-Challenge/master/noisyspeech_synthesizer.cfg -O noisyspeech_synthesizer.cfg
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the action. Sorry I didn't make it clear. Can you use git clone and specify the commit hash information? Then use the file from cloned repo. This is just to avoid the possible incompatibility due to the future updates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way, does it mean that we must clone the entire repo rather than just getting the files we need?

Or, is there a way we can just fetch those specific files? Thanks a lot.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could clone only a specific branch by

git clone -b [branch_name] ...

Comment on lines +216 to +223
cd DNS-Challenge
git checkout 5582dcf5ba43155621de72a035eb54a7d233af14
cp noisyspeech_synthesizer.cfg ../
cp audiolib.py $RUN_DIR/local/
cp utils.py $RUN_DIR/local/
echo "Required scripts & files fetched from DNS-Challenge repo"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use these to fetch the files now, do they look okay to you?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -26,4 +26,5 @@ test_sets="cv_synthetic tt_synthetic"
--use_noise_ref false \
--max_wav_duration 31 \
--inference_model "valid.loss.best.pth" \
--nj 8 \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you could delete this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this line since it's much easier to run into OOM error with the default nj=32 when we're working with 48 kHz.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean in the preprocessing stages

@sw005320 sw005320 merged commit ebdd5c7 into espnet:master Oct 4, 2022
@sw005320
Copy link
Contributor
sw005320 commented Oct 4, 2022

Thanks, @slSeanWU!
This is a very cool contribution!
After you finish training, please upload a model to the HF hub and add a link to egs2/dns_icassp22/enh1/README.md in the follow-up PR.

@Emrys365 Emrys365 added the SE Speech enhancement label Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ESPnet2 README SE Speech enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0