PyTorch implementation of the models described in the paper *Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation* (ACL 2019).
- Python 3.6
- PyTorch 0.4
- Numpy
- NLTK
- torchtext
- torchvision
- revtok
- multiset
- ipdb
- CUDA (we recommend the latest version; version 8.0 was used in all our experiments)
- This code is based on dl4mt-nonauto. We mainly modified `model.py` (lines 1103-1199).
The original translation corpora can be downloaded from the respective sources (IWSLT'16 En-De, WMT'16 En-Ro, WMT'14 En-De). We recommend downloading the preprocessed corpora released in dl4mt-nonauto.
Set the correct path to the data in the `data_path()` function located in `data.py`:
Train a NAT model using the cross-entropy loss. This process usually takes about 10 days. You can download our pretrained models here.
$ sh train_iwslt.sh
$ sh train_wmt.sh
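For reference, this pre-training stage optimizes ordinary token-level cross-entropy, computed over all target positions in parallel rather than left to right. A minimal sketch, with illustrative names and shapes rather than the repository's API:

```python
import torch.nn.functional as F

def nat_xent_loss(logits, targets, pad_idx):
    """Token-level cross-entropy for a non-autoregressive decoder.

    logits:  (batch, tgt_len, vocab) -- all positions predicted in parallel
    targets: (batch, tgt_len)        -- gold target tokens
    """
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),  # flatten all positions
        targets.view(-1),
        ignore_index=pad_idx,              # skip padding positions
    )
```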
Take a checkpoint of a pre-trained non-autoregressive model and finetune the checkpoint using the RF-NAT algorithm. This process usually takes about 1 day.
If you want to use the original REINFORCE algorithm, change the flag `--nat_finetune` to `--rf_finetune`.
$ sh rf_iwslt.sh
$ sh rf_wmt.sh
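For intuition, the vanilla REINFORCE objective mentioned above can be sketched as follows: sample a translation (all positions in parallel), score it with a sentence-level reward such as smoothed BLEU, and weight the sample's log-likelihood by that reward. This is an illustrative single-sentence sketch, not the repository's implementation; RF-NAT itself refines this estimator with the sequential rewards described in the paper.

```python
import torch
import torch.nn.functional as F
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def reinforce_loss(logits, reference):
    """Vanilla REINFORCE for one sentence (illustrative sketch).

    logits:    (tgt_len, vocab) -- per-position logits from the NAT decoder
    reference: list of gold target token ids
    """
    log_p = F.log_softmax(logits, dim=-1)
    # Sample one token per position, all positions in parallel.
    sample = torch.multinomial(log_p.exp(), 1).squeeze(-1)    # (tgt_len,)
    log_p_sample = log_p.gather(1, sample.unsqueeze(1)).sum()
    # Sentence-level reward: smoothed BLEU of the sampled translation.
    reward = sentence_bleu([reference], sample.tolist(),
                           smoothing_function=SmoothingFunction().method1)
    # REINFORCE estimator: minimizing this yields -reward * grad log p(sample).
    return -reward * log_p_sample
```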
Take a finetuned checkpoint and train the length prediction model. This process usually takes about 1 day.
$ sh tune_iwslt.sh
$ sh tune_wmt.sh
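Non-autoregressive decoding needs the target length before the decoder runs, so this stage trains a length predictor on top of the finetuned model. One common formulation (an assumption here, not necessarily this codebase's exact design) classifies the length offset between target and source from pooled encoder states:

```python
import torch.nn as nn

class LengthPredictor(nn.Module):
    """Classify delta = len(target) - len(source) from mean-pooled
    encoder states (illustrative sketch; the offset range and the
    pooling scheme are assumptions)."""

    def __init__(self, hidden_dim, max_offset=20):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, 2 * max_offset + 1)

    def forward(self, enc_states, src_mask):
        # enc_states: (batch, src_len, hidden)
        # src_mask:   (batch, src_len) float mask, 1 for real tokens
        pooled = (enc_states * src_mask.unsqueeze(-1)).sum(1) \
                 / src_mask.sum(1, keepdim=True)
        # Logits over offsets in [-max_offset, max_offset].
        return self.proj(pooled)
```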
Decode the test set. This process usually takes about 20 seconds.
$ sh decode_iwslt.sh
$ sh decode_wmt.sh
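Decoding is this fast because every target position is filled by an independent argmax in a single forward pass; there is no left-to-right beam search. A sketch, with an assumed model interface:

```python
import torch

def nat_decode(model, src, tgt_len):
    """One-pass non-autoregressive decoding: an independent argmax at
    every position (the model call signature here is an assumption)."""
    with torch.no_grad():
        logits = model(src, tgt_len)   # (batch, tgt_len, vocab)
    return logits.argmax(dim=-1)       # (batch, tgt_len)
```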
If you find the resources in this repository useful, please consider citing:
@inproceedings{shao2019retrieving,
    title = "Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation",
    author = "Shao, Chenze and
      Feng, Yang and
      Zhang, Jinchao and
      Meng, Fandong and
      Chen, Xilin and
      Zhou, Jie",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2019",
    url = "https://www.aclweb.org/anthology/P19-1288",
    pages = "3013--3024",
}