Add "Leveraging Pretrained Checkpoints for Generation" Seq2Seq models. (#6594)

* add conversion script * improve conversion script * make style * add tryout files * fix * update * add causal bert * better names * add tokenizer file as well * finish causal_bert * fix small bugs * improve generate * change naming * renaming * renaming * renaming * remove leftover files * clean files * add fix tokenizer * finalize * correct slow test * update docs * small fixes * fix link * adapt check repo * apply sams and sylvains recommendations * fix import * implement Lysandres recommendations * fix logger warn
2020-09-10 16:40:51 +02:00
parent d1691d90e5
commit 7fd1febf38
20 changed files with 1508 additions and 9 deletions
--- a/src/transformers/generation_utils.py
+++ b/src/transformers/generation_utils.py
@@ -383,7 +383,11 @@ class GenerationMixin:
                # see if BOS token can be used for decoder_start_token_id
                if bos_token_id is not None:
                    decoder_start_token_id = bos_token_id
-                elif hasattr(self.config, "decoder") and hasattr(self.config.decoder, "bos_token_id"):
+                elif (
+                    hasattr(self.config, "decoder")
+                    and hasattr(self.config.decoder, "bos_token_id")
+                    and self.config.decoder.bos_token_id is not None
+                ):
                    decoder_start_token_id = self.config.decoder.bos_token_id
                else:
                    raise ValueError(