Models doc (#7345)
* Clean up model documentation
* Formatting
* Preparation work
* Long lines
* Main work on rst files
* Cleanup all config files
* Syntax fix
* Clean all tokenizers
* Work on first models
* Models beginning
* FaluBERT
* All PyTorch models
* All models
* Long lines again
* Fixes
* More fixes
* Update docs/source/model_doc/bert.rst
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/electra.rst
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Last fixes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
@@ -1,5 +1,5 @@
 Perplexity of fixed-length models
-=================================
+=======================================================================================================================
 
 Perplexity (PPL) is one of the most common metrics for evaluating language
 models. Before diving in, we should note that the metric applies specifically
@@ -31,7 +31,7 @@ relationship to Bits Per Character (BPC) and data compression, check out this
 <https://thegradient.pub/understanding-evaluation-metrics-for-language-models/>`_.
 
 Calculating PPL with fixed-length models
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 If we weren't limited by a model's context size, we would evaluate the
 model's perplexity by autoregressively factorizing a sequence and
@@ -83,7 +83,7 @@ time. This allows computation to proceed much faster while still giving the
 model a large context to make predictions at each step.
 
 Example: Calculating perplexity with GPT-2 in 🤗 Transformers
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 Let's demonstrate this process with GPT-2.
 
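For readers skimming this diff without the full file, the quantity the touched document discusses can be sketched in a few lines. This is a minimal, library-free illustration, not the code from the documentation itself: it assumes perplexity is taken as the exponential of the average negative log-likelihood per token (natural log), and that a fixed-length model is evaluated over overlapping windows so that each token is scored exactly once. The names `perplexity`, `strided_windows`, `max_length`, and `stride` are hypothetical and chosen only for this sketch.

```python
import math


def perplexity(token_log_probs):
    """Return exp(average negative log-likelihood) for natural-log token scores."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


def strided_windows(seq_len, max_length, stride):
    """Yield (begin, end, n_scored) for overlapping evaluation windows.

    Each window spans at most max_length tokens (the assumed context size).
    Only the last n_scored tokens of a window are new; the earlier ones
    serve purely as context, so every token is scored exactly once.
    """
    prev_end = 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_length, seq_len)
        yield begin, end, end - prev_end
        prev_end = end
        if end == seq_len:
            break


# A model that assigns probability 1/4 to every token.
uniform_ppl = perplexity([math.log(0.25)] * 4)

# Window layout for a 10-token sequence, context size 4, stride 2.
windows = list(strided_windows(seq_len=10, max_length=4, stride=2))
```

A smaller stride gives each scored token more context at the cost of more forward passes. Setting `stride` equal to `max_length` reduces this to disjoint chunks.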