Remove outdated BERT tips (#6217)

* Remove out-dated BERT tips

* Update modeling_outputs.py

* Update bert.rst

* Update bert.rst
This commit is contained in:
Kevin Canwen Xu
2020-08-04 01:17:56 +08:00
committed by GitHub
parent e4920c92d6
commit 3c289fb38c
2 changed files with 2 additions and 11 deletions

View File

@@ -27,13 +27,8 @@ Tips:
- BERT is a model with absolute position embeddings so it's usually advised to pad the inputs on
the right rather than the left.
- BERT was trained with a masked language modeling (MLM) objective. It is therefore efficient at predicting masked
tokens and at NLU in general, but is not optimal for text generation. Models trained with a causal language
modeling (CLM) objective are better in that regard.
- Alongside MLM, BERT was trained using a next sentence prediction (NSP) objective using the [CLS] token as a sequence
approximate. The user may use this token (the first token in a sequence built with special tokens) to get a sequence
prediction rather than a token prediction. However, averaging over the sequence may yield better results than using
the [CLS] token.
- BERT was trained with the masked language modeling (MLM) and next sentence prediction (NSP) objectives. It is efficient at predicting masked
tokens and at NLU in general, but is not optimal for text generation.
The original code can be found `here <https://github.com/google-research/bert>`_.