[s2s]Use prepare_translation_batch for Marian finetuning (#6293)

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-06 14:58:38 -04:00
parent 2f2aa0c89c
commit 2804fff839
5 changed files with 22 additions and 12 deletions
--- a/examples/seq2seq/README.md
+++ b/examples/seq2seq/README.md
@@ -63,7 +63,7 @@ Summarization Tips:
 (It rarely makes sense to start from `bart-large` unless you are researching finetuning methods).

 **Update 2018-07-18**
-Datasets: Seq2SeqDataset will be used for all models besides MBart, for which MBartDataset will be used.**
+Datasets: `Seq2SeqDataset` should be used for all tokenizers without a `prepare_translation_batch` method. For tokenizers that do have one (like Marian, MBart), `TranslationDataset` should be used.
 A new dataset is needed to support multilingual tasks.
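
The selection rule in the updated README line can be sketched as a check on the tokenizer. This is a minimal illustration, not the code from this commit: the two dataset classes are empty stand-ins, and `pick_dataset_class` and both tokenizer classes are hypothetical names introduced only for the example.

```python
class Seq2SeqDataset:
    """Empty stand-in for the example's generic dataset class."""


class TranslationDataset:
    """Empty stand-in for the dataset used with translation tokenizers."""


def pick_dataset_class(tokenizer):
    """Hypothetical helper: choose a dataset class from the tokenizer's API.

    Tokenizers that expose `prepare_translation_batch` (Marian and MBart,
    per the README) get TranslationDataset; all others get Seq2SeqDataset.
    """
    if hasattr(tokenizer, "prepare_translation_batch"):
        return TranslationDataset
    return Seq2SeqDataset


class PlainTokenizer:
    """Hypothetical tokenizer without a translation-batch method."""


class TranslationTokenizer:
    """Hypothetical tokenizer exposing `prepare_translation_batch`."""

    def prepare_translation_batch(self, src_texts, tgt_texts=None):
        # The signature here is illustrative only; the sketch never calls it.
        raise NotImplementedError


if __name__ == "__main__":
    print(pick_dataset_class(PlainTokenizer()).__name__)
    print(pick_dataset_class(TranslationTokenizer()).__name__)
```

Checking for the method rather than the model name means any tokenizer that later gains `prepare_translation_batch` would be routed to `TranslationDataset` without a change to the selection code.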