From 7a68d401388bc68f10dfeb591709352736a6c0b6 Mon Sep 17 00:00:00 2001
From: Sam Shleifer
Date: Mon, 27 Jul 2020 20:07:21 -0400
Subject: [PATCH] [s2s] Don't mention packed data in README (#6079)

---
 examples/seq2seq/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/examples/seq2seq/README.md b/examples/seq2seq/README.md
index 5029f38361..a579d728b5 100644
--- a/examples/seq2seq/README.md
+++ b/examples/seq2seq/README.md
@@ -89,20 +89,20 @@ Then you can finetune mbart_cc25 on english-romanian with the following command.
 Best performing command:
 ```bash
 # optionally
-export ENRO_DIR='wmt_en_ro_packed_train_200' # Download instructions above
+export ENRO_DIR='wmt_en_ro' # Download instructions above
 # export WANDB_PROJECT="MT" # optional
 export MAX_LEN=200
 export BS=4
 export GAS=8 # gradient accumulation steps
 ./train_mbart_cc25_enro.sh --output_dir enro_finetune_baseline --label_smoothing 0.1 --fp16_opt_level=O1 --logger_name wandb --sortish_sampler
 ```
-This should take < 2h/epoch on a 16GB v100 and achieve val_avg_ BLEU score above 25. (you can see in wandb or metrics.json).
+This should take < 6h/epoch on a 16GB v100 and achieve val_avg_ BLEU score above 25. (you can see metrics in wandb or metrics.json).
 To get results in line with fairseq, you need to do some postprocessing.
 
 MultiGPU command
 (using 8 GPUS as an example)
 ```bash
-export ENRO_DIR='wmt_en_ro_packed_train_200' # Download instructions above
+export ENRO_DIR='wmt_en_ro' # Download instructions above
 # export WANDB_PROJECT="MT" # optional
 export MAX_LEN=200
 export BS=4