[pl_examples] default warmup steps=0 (#5316)

2020-06-26 15:03:41 -04:00
parent bf0d12c220
commit 5543b30aa6
6 changed files with 14 additions and 13 deletions
--- a/examples/seq2seq/README.md
+++ b/examples/seq2seq/README.md
@@ -64,6 +64,7 @@ The following command should work on a 16GB GPU:

 Tips:
 - 1 epoch at batch size 1 for bart-large takes 24 hours and requires 13GB GPU RAM with fp16 on an NVIDIA-V100. 
+- Since you need to run from `examples/seq2seq` and will likely need to modify code, it is easiest to fork transformers, clone your fork, and run `pip install -e .` before you get started.
 - Try `bart-base`, `--freeze_encoder`, or `--freeze_embeds` for faster training or a larger batch size (3hr/epoch with bs=8; see the "xsum_shared_task" command below).
 - `fp16_opt_level=O1` (the default) works best.
 - If you are finetuning on your own dataset, start from `distilbart-cnn-12-6` if you want long summaries and `distilbart-xsum-12-6` if you want short summaries.