Add dns_icassp22 Speech Enhancement Recipe #4657
Could you format the Python scripts with
|
Just to make sure: do I need to run these from the root directory, over all the code in the package?
To be precise, you need to go to the root directory of espnet and run these commands.
@Emrys365 Hi, I've run |
Codecov Report
@@            Coverage Diff             @@
##           master    #4657      +/-   ##
==========================================
- Coverage   83.09%   83.09%   -0.01%
==========================================
  Files         518      518
  Lines       44777    44700      -77
==========================================
- Hits        37207    37143      -64
+ Misses       7570     7557      -13
I saw several files are from the official DNS repo. Maybe you can add git clone in the local/data.sh and then use the cloned files instead of copying them.
egs2/TEMPLATE/asr1/db.sh
@@ -17,6 +17,7 @@ DIRHA_WSJ_PROCESSED="${PWD}/data/local/dirha_wsj_processed" # Output file path
DNS=
DNS2=
DNS3=
DNS4="${PWD}/data_dns4_raw"
It's better to use `DNS4` or `downloads` here to follow the convention, instead of `data_dns4_raw`.
OK! I will change it to `downloads`.
- pytorch version: `pytorch 1.12.1+cu102`
- Git hash: `13db69d3befc3c82a5ff5a11e28bf79d5030603f`
- Commit date: `Mon Aug 29 13:44:35 2022 +0000`
How about uploading the model and adding it here?
Sure. I'm waiting for the model to finish training on the full 48 kHz dataset (which might take several more days).
I guess you're talking about btw, in this case, the formatting won't conform to |
@slSeanWU If the file is not in this repo, formatting would not be a problem.
# get noisy wav synthesizer config file
cd ${PATH_PREFIX}
wget https://raw.githubusercontent.com/microsoft/DNS-Challenge/master/noisyspeech_synthesizer.cfg -O noisyspeech_synthesizer.cfg
Thanks for the update. Sorry I didn't make it clear. Can you use git clone and check out a specific commit hash, then use the files from the cloned repo? This is just to avoid possible incompatibilities caused by future updates to the upstream repo.
This way, does it mean that we must clone the entire repo rather than just getting the files we need?
Or, is there a way we can just fetch those specific files? Thanks a lot.
You could clone only a specific branch with `git clone -b [branch_name] ...`
cd DNS-Challenge
git checkout 5582dcf5ba43155621de72a035eb54a7d233af14
cp noisyspeech_synthesizer.cfg ../
cp audiolib.py $RUN_DIR/local/
cp utils.py $RUN_DIR/local/
echo "Required scripts & files fetched from DNS-Challenge repo"
I now use these commands to fetch the files. Do they look okay to you?
LGTM!
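On the earlier question of fetching only the needed files: one possible alternative to a full clone, sketched here in Python rather than in the recipe's shell script, is to download each file from its raw URL with the pinned commit hash in place of `master`. The file names and hash are taken from the snippet above; the helper names `pinned_raw_url` and `fetch_pinned_files` are hypothetical and not part of the recipe.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Values taken from the review thread; the helper names below are hypothetical.
DNS_REPO = "microsoft/DNS-Challenge"
DNS_COMMIT = "5582dcf5ba43155621de72a035eb54a7d233af14"
DNS_FILES = ("noisyspeech_synthesizer.cfg", "audiolib.py", "utils.py")


def pinned_raw_url(path, repo=DNS_REPO, commit=DNS_COMMIT):
    """Build a raw-content URL that points at a fixed commit, not a branch."""
    return f"https://raw.githubusercontent.com/{repo}/{commit}/{path}"


def fetch_pinned_files(dest_dir, files=DNS_FILES):
    """Download each file at the pinned commit into dest_dir (needs network)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for name in files:
        urlretrieve(pinned_raw_url(name), str(dest / name))
    return [dest / name for name in files]
```

Calling `fetch_pinned_files("local")` would place the three files under `local/`. Pinning the hash gives the same protection against upstream changes as `git checkout`, at the cost of listing every required file explicitly.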
@@ -26,4 +26,5 @@ test_sets="cv_synthetic tt_synthetic"
--use_noise_ref false \
--max_wav_duration 31 \
--inference_model "valid.loss.best.pth" \
--nj 8 \
Perhaps you could delete this line.
I added this line since it's much easier to run into an OOM error with the default `nj=32` when we're working with 48 kHz.
I mean in the preprocessing stages
Thanks, @slSeanWU!
Hello,
Two things to note:
(1) I checked how DNS challenges evaluate submissions. They use subjective MOS on recorded test samples that don't come with a "clean reference", so I think it'll be hard for us to compare with those submitted implementations.
(2) The original script from the DNS repo for synthesizing training examples is single-threaded, hence very slow. I refactored it using `multiprocessing` to enable parallelization.
Thanks in advance for reviews & comments,
Shih-Lun
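
For readers unfamiliar with the pattern mentioned in note (2), a minimal sketch of parallelizing per-file synthesis with `multiprocessing.Pool` follows. It is not the refactored script from this PR: `synthesize_one` is a hypothetical placeholder that only draws an SNR value, and the SNR range, worker count, and seed are arbitrary illustrative values.

```python
import multiprocessing as mp
import random


def synthesize_one(index, seed=0):
    """Hypothetical stand-in for synthesizing one noisy/clean pair.

    A real worker would load clean speech and noise, mix them, and write
    the result to disk. Here we only draw an SNR (arbitrary range).
    """
    rng = random.Random(seed + index)  # per-file seed keeps results reproducible
    snr_db = rng.uniform(-5.0, 20.0)
    return index, round(snr_db, 2)


def synthesize_all(num_files, num_workers=4, seed=0):
    """Run synthesize_one over all file indices, in parallel if requested."""
    jobs = [(i, seed) for i in range(num_files)]
    if num_workers <= 1:
        # Serial fallback, handy for debugging.
        return [synthesize_one(*job) for job in jobs]
    with mp.Pool(processes=num_workers) as pool:
        # starmap preserves the order of jobs in its result list.
        return pool.starmap(synthesize_one, jobs)
```

Seeding each job from its index, rather than sharing one random generator, means the output does not depend on how work is scheduled across processes. On platforms that start workers with `spawn`, the call to `synthesize_all` should sit under an `if __name__ == "__main__":` guard.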