[s2s] Don't mention packed data in README (#6079)

Sam Shleifer
2020-07-27 20:07:21 -04:00
committed by GitHub
parent b7345d22d0
commit 7a68d40138


@@ -89,20 +89,20 @@ Then you can finetune mbart_cc25 on english-romanian with the following command.
 Best performing command:
 ```bash
 # optionally
-export ENRO_DIR='wmt_en_ro_packed_train_200' # Download instructions above
+export ENRO_DIR='wmt_en_ro' # Download instructions above
 # export WANDB_PROJECT="MT" # optional
 export MAX_LEN=200
 export BS=4
 export GAS=8 # gradient accumulation steps
 ./train_mbart_cc25_enro.sh --output_dir enro_finetune_baseline --label_smoothing 0.1 --fp16_opt_level=O1 --logger_name wandb --sortish_sampler
 ```
-This should take < 2h/epoch on a 16GB v100 and achieve val_avg_ BLEU score above 25. (you can see in wandb or metrics.json).
+This should take < 6h/epoch on a 16GB v100 and achieve val_avg_ BLEU score above 25. (you can see metrics in wandb or metrics.json).
 To get results in line with fairseq, you need to do some postprocessing.
 MultiGPU command
 (using 8 GPUS as an example)
 ```bash
-export ENRO_DIR='wmt_en_ro_packed_train_200' # Download instructions above
+export ENRO_DIR='wmt_en_ro' # Download instructions above
 # export WANDB_PROJECT="MT" # optional
 export MAX_LEN=200
 export BS=4