Add video links to the documentation (#12162)

This commit is contained in:
Sylvain Gugger
2021-06-15 06:37:37 -04:00
committed by GitHub
parent 040283170c
commit a55dc157e3
7 changed files with 167 additions and 26 deletions

View File

@@ -39,6 +39,12 @@ To automatically download the vocab used during pretraining or fine-tuning a giv
Base use
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. raw:: html
<iframe width="560" height="315" src="https://www.youtube.com/embed/Yffk5aydLzg" title="YouTube video player"
frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope;
picture-in-picture" allowfullscreen></iframe>
A :class:`~transformers.PreTrainedTokenizer` has many methods, but the only one you need to remember for preprocessing
is its ``__call__``: you just need to feed your sentence to your tokenizer object.
@@ -138,6 +144,12 @@ can safely ignore it. You can also pass ``verbose=False`` to stop the tokenizer
Preprocessing pairs of sentences
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. raw:: html
<iframe width="560" height="315" src="https://www.youtube.com/embed/0u3ioSwev3s" title="YouTube video player"
frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope;
picture-in-picture" allowfullscreen></iframe>
Sometimes you need to feed a pair of sentences to your model. For instance, if you want to classify if two sentences in
a pair are similar, or for question-answering models, which take a context and a question. For BERT models, the input
is then represented like this: :obj:`[CLS] Sequence A [SEP] Sequence B [SEP]`