Minor documentation revisions from copyediting (#9266)
* typo: Revise "checkout" to "check out"
* typo: Change "seemlessly" to "seamlessly"
* typo: Close parentheses in "Using the tokenizer"
* typo: Add closing parenthesis to supported models aside
* docs: Treat ``position_ids`` as plural
Alternatively, the word "argument" could be added to make the subject singular.
* docs: Remove comma, making subordinate clause
* docs: Remove comma separating verb and direct object
* docs: Fix typo ("next" -> "text")
* docs: Reverse phrase order to simplify sentence
* docs: "quicktour" -> "quick tour"
* docs: "to throw" -> "from throwing"
* docs: Remove disruptive newline in padding/truncation section
* docs: "show exemplary" -> "show examples of"
* docs: "much harder as" -> "much harder than"
* docs: Fix typo "seach" -> "search"
* docs: Fix subject-verb disagreement in WordPiece description
* docs: Fix style in preprocessing.rst
@@ -327,7 +327,7 @@ Masked Language Modeling
 Masked language modeling is the task of masking tokens in a sequence with a masking token, and prompting the model to
 fill that mask with an appropriate token. This allows the model to attend to both the right context (tokens on the
 right of the mask) and the left context (tokens on the left of the mask). Such a training creates a strong basis for
-downstream tasks, requiring bi-directional context such as SQuAD (question answering, see `Lewis, Lui, Goyal et al.
+downstream tasks requiring bi-directional context, such as SQuAD (question answering, see `Lewis, Lui, Goyal et al.
 <https://arxiv.org/abs/1910.13461>`__, part 4.2).

 Here is an example of using pipelines to replace a mask from a sequence:
@@ -657,7 +657,7 @@ Here are the expected results:
 {'word': 'Bridge', 'score': 0.990249514579773, 'entity': 'I-LOC'}
 ]

-Note, how the tokens of the sequence "Hugging Face" have been identified as an organisation, and "New York City",
+Note how the tokens of the sequence "Hugging Face" have been identified as an organisation, and "New York City",
 "DUMBO" and "Manhattan Bridge" have been identified as locations.

 Here is an example of doing named entity recognition, using a model and a tokenizer. The process is the following: