CMSP-ST

This is a PyTorch implementation of the INTERSPEECH 2025 main-conference paper "CMSP-ST: Cross-modal Mixup with Speech Purification for End-to-End Speech Translation".

Dependencies

  • Python >= 3.8
  • PyTorch
  • fairseq 0.12.2, installed in editable mode for local development:
    cd fairseq
    pip install --editable ./

Train your CMSP-ST model

1. Data Preparation

  • MuST-C: Download the MuST-C v1.0 dataset and place it in ./st/dataset/MuST-C/.

  • CoVoST-2: Download the CoVoST-2 dataset and place it in ./st/dataset/CoVoST/.

  • HuBERT Model: Download the HuBERT Base model and place it in ./models/pretrain/.

  • WMT: Download the WMT14 / WMT16 dataset and place it in ./mt/dataset/WMT/.
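The download targets above can be staged with a small shell sketch that creates the expected directory layout relative to the repository root. The paths are copied from the list above; the datasets and the HuBERT checkpoint still have to be downloaded into them manually.

```shell
# Create the directory layout that the preparation steps above expect.
mkdir -p st/dataset/MuST-C \
         st/dataset/CoVoST \
         models/pretrain \
         mt/dataset/WMT
# Confirm the layout is in place.
ls -d st/dataset/MuST-C st/dataset/CoVoST models/pretrain mt/dataset/WMT
```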

2. Preprocessing

1) ST vocabulary construction
  • cd ./data/st/s2t_raw/
  • bash prep_mustc_data.sh or bash prep_covost_data.sh
2) MT vocabulary construction
  • cd ./data/mt/s2t_raw/
  • bash prep_mtl_mustc_mt.sh or bash prep_mtl_covost_mt.sh (for multi-task learning)
  • bash prep_exp_mustc_mt.sh or bash prep_exp_covost_mt.sh (for expanded data)
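For the MuST-C multi-task setup, the two preprocessing stages can be chained with a defensive sketch like the one below. It assumes it is run from the repository root; the script names are copied from the steps above (not verified beyond that), and any script that is missing is reported rather than causing a failure.

```shell
# Run each vocabulary-construction script from its own directory, as the
# steps above require; report any script that is missing instead of failing.
status=""
for script in data/st/s2t_raw/prep_mustc_data.sh \
              data/mt/s2t_raw/prep_mtl_mustc_mt.sh; do
  if [ -f "$script" ]; then
    (cd "$(dirname "$script")" && bash "$(basename "$script")")
    status="$status ran:$script"
  else
    status="$status missing:$script"
    echo "missing: $script"
  fi
done
echo "preprocess status:$status"
```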

3. MT Pretraining

1) For multi-task learning
  • cd ./scripts/pretrain/
  • bash train_mtl_mt.sh, then bash average_cpt.sh
2) For expanded data
  • bash train_exp_mt.sh, then bash average_cpt.sh
  • bash train_exp_mtl_mt.sh, then bash average_cpt.sh

4. Training and Inference

  • cd ./scripts/train/
  • bash train_xxxxx_xx2xx.sh, then bash evaluation.sh
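A guarded launcher for this last step might look like the sketch below. `train_xxxxx_xx2xx.sh` is the README's placeholder for a concrete training script (presumably encoding dataset and translation direction), so `TRAIN_SCRIPT` must be pointed at a real file under ./scripts/train/ before anything runs.

```shell
# Launch training and then evaluation, but only if the chosen script exists.
# TRAIN_SCRIPT defaults to the README's placeholder name and must be
# overridden with a real script from scripts/train/.
TRAIN_SCRIPT="${TRAIN_SCRIPT:-train_xxxxx_xx2xx.sh}"
result="skipped"
if [ -f "scripts/train/$TRAIN_SCRIPT" ]; then
  (cd scripts/train && bash "$TRAIN_SCRIPT" && bash evaluation.sh) && result="ok"
else
  echo "scripts/train/$TRAIN_SCRIPT not found; set TRAIN_SCRIPT first"
fi
echo "result=$result"
```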
