Remove outdated BERT tips (#6217)

* Remove out-dated BERT tips
* Update modeling_outputs.py
* Update bert.rst
* Update bert.rst
2020-08-04 01:17:56 +08:00
parent e4920c92d6
commit 3c289fb38c
2 changed files with 2 additions and 11 deletions
--- a/docs/source/model_doc/bert.rst
+++ b/docs/source/model_doc/bert.rst
@@ -27,13 +27,8 @@ Tips:

 - BERT is a model with absolute position embeddings so it's usually advised to pad the inputs on
  the right rather than the left.
- BERT was trained with a masked language modeling (MLM) objective. It is therefore efficient at predicting masked
-  tokens and at NLU in general, but is not optimal for text generation. Models trained with a causal language
-  modeling (CLM) objective are better in that regard.
- Alongside MLM, BERT was trained using a next sentence prediction (NSP) objective using the [CLS] token as a sequence
-  approximate. The user may use this token (the first token in a sequence built with special tokens) to get a sequence
-  prediction rather than a token prediction. However, averaging over the sequence may yield better results than using
-  the [CLS] token.
+- BERT was trained with the masked language modeling (MLM) and next sentence prediction (NSP) objectives. It is efficient at predicting masked
+  tokens and at NLU in general, but is not optimal for text generation.

 The original code can be found `here <https://github.com/google-research/bert>`_.
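
Note on the retained padding tip: because BERT uses absolute position embeddings, right padding keeps each real token at the position index it would have without padding, and the pad tokens fall at the end where the attention mask can exclude them. The sketch below illustrates this in plain Python. The helper name `pad_right`, the default `pad_id=0`, and the token IDs are assumptions chosen for illustration; they are not taken from this commit or from the library's tokenizer code.

```python
from typing import List, Tuple


def pad_right(
    batch: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad every sequence on the right to the length of the longest one.

    Hypothetical helper for illustration only. `pad_id=0` is an assumed
    padding ID; check the vocabulary of the tokenizer actually in use.
    Returns (input_ids, attention_mask), where the mask is 1 for real
    tokens and 0 for padding.
    """
    max_len = max((len(seq) for seq in batch), default=0)
    input_ids, attention_mask = [], []
    for seq in batch:
        n_pad = max_len - len(seq)
        # Real tokens stay at positions 0..len(seq)-1; padding goes after them.
        input_ids.append(seq + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Illustrative token IDs (placeholders, not produced by a real tokenizer here).
ids, mask = pad_right([[101, 7592, 102], [101, 7592, 2088, 999, 102]])
print(ids)
print(mask)
```

With left padding, the first real token of the shorter sequence would instead sit at position 2, a shift the model did not see for that token during pretraining.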