Skip to content

Latest commit

 

History

History
48 lines (37 loc) · 1.62 KB

README.md

File metadata and controls

48 lines (37 loc) · 1.62 KB

Understanding Back-Translation at Scale (Edunov et al., 2018)

This page includes pre-trained models from the paper Understanding Back-Translation at Scale (Edunov et al., 2018).

Pre-trained models

Model Description Dataset Download
transformer.wmt18.en-de Transformer
(Edunov et al., 2018)
WMT'18 winner
WMT'18 English-German download (.tar.gz)
See NOTE in the archive

Example usage (torch.hub)

We require a few additional Python dependencies for preprocessing:

pip install subword_nmt sacremoses

Then to generate translations from the full model ensemble:

import torch

# List available models
torch.hub.list('pytorch/fairseq')  # [..., 'transformer.wmt18.en-de', ... ]

# Load the WMT'18 En-De ensemble
en2de_ensemble = torch.hub.load(
    'pytorch/fairseq', 'transformer.wmt18.en-de',
    checkpoint_file='wmt18.model1.pt:wmt18.model2.pt:wmt18.model3.pt:wmt18.model4.pt:wmt18.model5.pt',
    tokenizer='moses', bpe='subword_nmt')

# The ensemble contains 5 models
len(en2de_ensemble.models)
# 5

# Translate
en2de_ensemble.translate('Hello world!')
# 'Hallo Welt!'

Citation

@inproceedings{edunov2018backtranslation,
  title = {Understanding Back-Translation at Scale},
  author = {Edunov, Sergey and Ott, Myle and Auli, Michael and Grangier, David},
  booktitle = {Conference of the Association for Computational Linguistics (ACL)},
  year = 2018,
}