TSE_PI

Official Code for Target Sound Extraction under Reverberant Environments with Pitch Information (Interspeech 2024)

For the demos, please visit https://wyw97.github.io/TSE_PI/

Introduction

[Figure: TSE_PI]

Stage 1 Conditional Pitch Estimation (F0 Extraction)

  1. Dataset: selected from FSD50K

  2. RIR Simulation: Image-Source Method

    Reference link: https://www.audiolabs-erlangen.de/fau/professor/habets/software/smir-generator

  3. Pitch Label: Generated with Praat under anechoic conditions and labeled with a time shift

    Reference link: https://parselmouth.readthedocs.io/_/downloads/en/stable/pdf/

  4. TODO: Update dataset.py (coming soon)

  5. Command: python train_f0.py
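
Steps 2 and 3 of the list above can be sketched roughly as follows. This is a toy illustration, not the code in this repository: `ism_rir`, `shift_f0_labels`, and `praat_f0` are hypothetical helper names, the room uses a single reflection coefficient for all walls, and delays are rounded to whole samples. The time shift is interpreted here as delaying the anechoic pitch track by the direct-path delay of the RIR, which is an assumption about what the labeling does. The Praat call goes through Parselmouth and is only imported when used.

```python
import itertools
import math


def ism_rir(room, src, mic, fs=16000, beta=0.8, max_order=3, c=343.0):
    """Toy image-source RIR for a shoebox room (dimensions and positions in metres).

    Assumes src != mic and one reflection coefficient `beta` for all six walls.
    """
    taps = {}
    for m in itertools.product(range(-max_order, max_order + 1), repeat=3):
        for q in itertools.product((0, 1), repeat=3):
            # Position of this image source along each axis.
            img = [(1 - 2 * q[a]) * src[a] + 2 * m[a] * room[a] for a in range(3)]
            dist = math.dist(img, mic)
            # Number of wall reflections this image has undergone.
            n_refl = sum(abs(m[a] - q[a]) + abs(m[a]) for a in range(3))
            delay = round(dist / c * fs)  # nearest-sample delay
            gain = beta ** n_refl / (4 * math.pi * dist)
            taps[delay] = taps.get(delay, 0.0) + gain
    rir = [0.0] * (max(taps) + 1)
    for delay, gain in taps.items():
        rir[delay] = gain
    return rir


def shift_f0_labels(f0, delay_samples, fs, hop_s):
    """Delay a frame-level F0 track by `delay_samples`, rounded to whole frames.

    Frames shifted in at the start are marked unvoiced (0.0); length is preserved.
    """
    shift = round(delay_samples / (fs * hop_s))
    return ([0.0] * shift + list(f0))[: len(f0)]


def praat_f0(samples, fs, hop_s=0.01):
    """F0 track (Hz, 0 for unvoiced frames) from Praat via Parselmouth.

    `samples` is expected to be a 1-D NumPy float array of the anechoic signal.
    """
    import parselmouth  # third-party: pip install praat-parselmouth

    snd = parselmouth.Sound(samples, sampling_frequency=fs)
    pitch = snd.to_pitch(time_step=hop_s)
    return pitch.selected_array["frequency"]


if __name__ == "__main__":
    room, src, mic = (5.0, 4.0, 3.0), (1.0, 1.0, 1.5), (3.0, 2.0, 1.5)
    rir = ism_rir(room, src, mic)
    direct = next(i for i, v in enumerate(rir) if v != 0.0)
    print("direct-path delay (samples):", direct)
    print(shift_f0_labels([100.0, 110.0, 120.0, 0.0, 0.0], 320, 16000, 0.01))
```

In a real pipeline the reverberant input would be the anechoic clip convolved with the RIR, while the labels come from the anechoic clip. The shift keeps the two aligned in time.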

Stage 2 Target Sound Extraction with Pitch Information

Thanks to Veluri et al. for open-sourcing Waveformer and SemanticHearing.

Most of this stage simply adds the pitch information to the model and trains it in the same way as Waveformer.
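
As a rough illustration of what "adding the pitch information" could look like, the sketch below appends a small pitch feature to each frame's feature vector. It is an assumption made for illustration only: this README does not describe how the repository fuses pitch with the Waveformer features, and the names `f0_to_feature` and `add_pitch`, as well as the 50 to 1000 Hz range, are hypothetical.

```python
import math


def f0_to_feature(f0_hz, fmin=50.0, fmax=1000.0):
    """Map F0 in Hz to [voiced_flag, normalized log-frequency in 0..1].

    Unvoiced frames (f0 <= 0) map to [0.0, 0.0]; values are clipped to [fmin, fmax].
    """
    if f0_hz <= 0:
        return [0.0, 0.0]
    f = min(max(f0_hz, fmin), fmax)
    return [1.0, math.log(f / fmin) / math.log(fmax / fmin)]


def add_pitch(frames, f0_track):
    """Concatenate the pitch feature onto each frame's feature vector."""
    if len(frames) != len(f0_track):
        raise ValueError("frames and f0_track must have the same length")
    return [list(x) + f0_to_feature(f) for x, f in zip(frames, f0_track)]


if __name__ == "__main__":
    frames = [[0.1, 0.2], [0.3, 0.4]]
    print(add_pitch(frames, [0.0, 220.0]))
```

Concatenation is only one option. The pitch track could equally be injected through the conditioning path that Waveformer already uses for its label embedding.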

Reference Code

  1. https://github.com/vb000/Waveformer/

  2. https://github.com/vb000/SemanticHearing

  3. https://github.com/lihan941002/Param-GTFB-GCFB
