From 4c3d98dddcfbffc8a83329d54fbb08a4b1ae36eb Mon Sep 17 00:00:00 2001
From: Stas Bekman
Date: Thu, 3 Dec 2020 16:05:55 -0800
Subject: [PATCH] [s2s finetune_trainer] add instructions for distributed training (#8884)

---
 examples/seq2seq/README.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/examples/seq2seq/README.md b/examples/seq2seq/README.md
index c1d599983f..d025d46c97 100644
--- a/examples/seq2seq/README.md
+++ b/examples/seq2seq/README.md
@@ -213,6 +213,11 @@ To see all the possible command line options, run:
 python finetune_trainer.py --help
 ```
 
+For multi-gpu training use `torch.distributed.launch`, e.g. with 2 gpus:
+```bash
+python -m torch.distributed.launch --nproc_per_node=2 finetune_trainer.py ...
+```
+
 **At the moment, `Seq2SeqTrainer` does not support *with teacher* distillation.**
 
 All `Seq2SeqTrainer`-based fine-tuning scripts are included in the `builtin_trainer` directory.