Official Code for Target Sound Extraction under Reverberant Environments with Pitch Information (Interspeech 2024)
For the demos, please visit https://wyw97.github.io/TSE_PI/
-
Dataset: selected from FSD50K
-
RIR Simulation: Image-Source Method
Reference link: https://www.audiolabs-erlangen.de/fau/professor/habets/software/smir-generator
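For orientation, a minimal image-source RIR sketch using pyroomacoustics (an assumption for illustration only; the repo references the SMIR generator above, and all geometry, absorption, and position values here are placeholders):

```python
# Minimal image-source RIR sketch (assumption: pyroomacoustics, not the
# SMIR generator referenced above; all numeric values are placeholders).
import numpy as np
import pyroomacoustics as pra

fs = 16000
room = pra.ShoeBox(
    [6.0, 5.0, 3.0],              # room dimensions in meters (placeholder)
    fs=fs,
    materials=pra.Material(0.3),  # uniform wall absorption (placeholder)
    max_order=17,                 # image-source reflection order
)

dry = np.random.randn(fs)         # stand-in for an anechoic FSD50K clip
room.add_source([2.0, 3.0, 1.5], signal=dry)
room.add_microphone([3.5, 2.0, 1.2])

room.compute_rir()                # image-source RIRs
room.simulate()                   # convolve the source with the RIR
reverberant = room.mic_array.signals[0]
```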
-
Pitch Label: Generated by Praat under anechoic conditions and aligned to the reverberant signal with a time shift
Reference link: https://parselmouth.readthedocs.io/_/downloads/en/stable/pdf/
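A minimal sketch of extracting an F0 track with Praat via parselmouth (the file name and analysis parameters are assumptions, and the time-shift alignment mentioned above is not shown):

```python
# Minimal Praat pitch-extraction sketch via parselmouth (file path and
# analysis parameters are assumptions, not the repo's exact settings).
import numpy as np
import parselmouth

snd = parselmouth.Sound("anechoic_clip.wav")       # anechoic source clip
pitch = snd.to_pitch(time_step=0.01,               # 10 ms hop (assumed)
                     pitch_floor=75.0,
                     pitch_ceiling=600.0)

times = pitch.xs()                                 # frame centers in seconds
f0 = pitch.selected_array["frequency"]             # 0.0 marks unvoiced frames
f0[f0 == 0.0] = np.nan                             # flag unvoiced frames as NaN

np.save("pitch_label.npy", np.stack([times, f0]))  # save time/F0 pairs
```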
-
TODO: Update dataset.py (coming soon!)
-
Command: `python train_f0.py`
Thanks to Veluri et al. for open-sourcing the Waveformer and SemanticHearing code.
This part mainly adds the pitch information and trains the model in the same way as Waveformer.