docs: Resolve many typos in the English docs (#20088)

* docs: Fix typo in ONNX parser help: 'tolerence' => 'tolerance' * docs: Resolve many typos in the English docs Typos found via 'codespell ./docs/source/en'
2022-11-07 15:19:04 +01:00
parent b8112eddec
commit 3222fc645b
21 changed files with 29 additions and 29 deletions
--- a/docs/source/en/model_doc/bloom.mdx
+++ b/docs/source/en/model_doc/bloom.mdx
@@ -32,7 +32,7 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h

 <PipelineTag pipeline="text-generation"/>

- [`BloomForCausalLM`] is suppported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
+- [`BloomForCausalLM`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).

 ⚡️ Inference
 - A blog on [Optimization story: Bloom inference](https://huggingface.co/blog/bloom-inference-optimization).
--- a/docs/source/en/model_doc/gpt2.mdx
+++ b/docs/source/en/model_doc/gpt2.mdx
@@ -61,7 +61,7 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
 - A notebook on how to [finetune GPT2 to generate lyrics in the style of your favorite artist](https://colab.research.google.com/github/AlekseyKorshuk/huggingartists/blob/master/huggingartists-demo.ipynb). 🌎
 - A notebook on how to [finetune GPT2 to generate tweets in the style of your favorite Twitter user](https://colab.research.google.com/github/borisdayma/huggingtweets/blob/master/huggingtweets-demo.ipynb). 🌎
 - [Causal language modeling](https://huggingface.co/course/en/chapter7/6?fw=pt#training-a-causal-language-model-from-scratch) chapter of the 🤗 Hugging Face Course.
- [`GPT2LMHeadModel`] is suppported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling), [text generation example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-generation), and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
+- [`GPT2LMHeadModel`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling), [text generation example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-generation), and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
 - [`TFGPT2LMHeadModel`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/language-modeling#run_clmpy) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb).
 - [`FlaxGPT2LMHeadModel`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/flax/language-modeling#causal-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/causal_language_modeling_flax.ipynb).

--- a/docs/source/en/model_doc/gptj.mdx
+++ b/docs/source/en/model_doc/gptj.mdx
@@ -47,7 +47,7 @@ Tips:
  that could be found [here](https://github.com/kingoflolz/mesh-transformer-jax/blob/master/howto_finetune.md)

 - Although the embedding matrix has a size of 50400, only 50257 entries are used by the GPT-2 tokenizer. These extra
-  tokens are added for the sake of efficiency on TPUs. To avoid the mis-match between embedding matrix size and vocab
+  tokens are added for the sake of efficiency on TPUs. To avoid the mismatch between embedding matrix size and vocab
  size, the tokenizer for [GPT-J](https://huggingface.co/EleutherAI/gpt-j-6B) contains 143 extra tokens
  `<|extratoken_1|>... <|extratoken_143|>`, so the `vocab_size` of tokenizer also becomes 50400.

--- a/docs/source/en/model_doc/longformer.mdx
+++ b/docs/source/en/model_doc/longformer.mdx
@@ -40,7 +40,7 @@ This model was contributed by [beltagy](https://huggingface.co/beltagy). The Aut

 Longformer self attention employs self attention on both a "local" context and a "global" context. Most tokens only
 attend "locally" to each other meaning that each token attends to its \\(\frac{1}{2} w\\) previous tokens and
-\\(\frac{1}{2} w\\) succeding tokens with \\(w\\) being the window length as defined in
+\\(\frac{1}{2} w\\) succeeding tokens with \\(w\\) being the window length as defined in
 `config.attention_window`. Note that `config.attention_window` can be of type `List` to define a
 different \\(w\\) for each layer. A selected few tokens attend "globally" to all other tokens, as it is
 conventionally done for all tokens in `BertSelfAttention`.
--- a/docs/source/en/model_doc/mobilevit.mdx
+++ b/docs/source/en/model_doc/mobilevit.mdx
@@ -26,7 +26,7 @@ Tips:
 - One can use [`MobileViTFeatureExtractor`] to prepare images for the model. Note that if you do your own preprocessing, the pretrained checkpoints expect images to be in BGR pixel order (not RGB).
 - The available image classification checkpoints are pre-trained on [ImageNet-1k](https://huggingface.co/datasets/imagenet-1k) (also referred to as ILSVRC 2012, a collection of 1.3 million images and 1,000 classes).
 - The segmentation model uses a [DeepLabV3](https://arxiv.org/abs/1706.05587) head. The available semantic segmentation checkpoints are pre-trained on [PASCAL VOC](http://host.robots.ox.ac.uk/pascal/VOC/). 
- As the name suggests MobileViT was desgined to be performant and efficient on mobile phones. The TensorFlow versions of the MobileViT models are fully compatible with [TensorFlow Lite](https://www.tensorflow.org/lite). 
+- As the name suggests MobileViT was designed to be performant and efficient on mobile phones. The TensorFlow versions of the MobileViT models are fully compatible with [TensorFlow Lite](https://www.tensorflow.org/lite). 

  You can use the following code to convert a MobileViT checkpoint (be it image classification or semantic segmentation) to generate a 
  TensorFlow Lite model:
--- a/docs/source/en/model_doc/mt5.mdx
+++ b/docs/source/en/model_doc/mt5.mdx
@@ -28,7 +28,7 @@ generative model chooses to (partially) translate its prediction into the wrong
 checkpoints used in this work are publicly available.*

 Note: mT5 was only pre-trained on [mC4](https://huggingface.co/datasets/mc4) excluding any supervised training.
-Therefore, this model has to be fine-tuned before it is useable on a downstream task, unlike the original T5 model.
+Therefore, this model has to be fine-tuned before it is usable on a downstream task, unlike the original T5 model.
 Since mT5 was pre-trained unsupervisedly, there's no real advantage to using a task prefix during single-task
 fine-tuning. If you are doing multi-task fine-tuning, you should use a prefix.

--- a/docs/source/en/model_doc/splinter.mdx
+++ b/docs/source/en/model_doc/splinter.mdx
@@ -41,7 +41,7 @@ Tips:
 - If you plan on using Splinter outside *run_qa.py*, please keep in mind the question token - it might be important for
  the success of your model, especially in a few-shot setting.
 - Please note there are two different checkpoints for each size of Splinter. Both are basically the same, except that
-  one also has the pretrained wights of the QASS layer (*tau/splinter-base-qass* and *tau/splinter-large-qass*) and one
+  one also has the pretrained weights of the QASS layer (*tau/splinter-base-qass* and *tau/splinter-large-qass*) and one
  doesn't (*tau/splinter-base* and *tau/splinter-large*). This is done to support randomly initializing this layer at
  fine-tuning, as it is shown to yield better results for some cases in the paper.

--- a/docs/source/en/model_doc/t5v1.1.mdx
+++ b/docs/source/en/model_doc/t5v1.1.mdx
@@ -39,7 +39,7 @@ T5 Version 1.1 includes the following improvements compared to the original T5 m
  `num_heads` and `d_ff`.

 Note: T5 Version 1.1 was only pre-trained on [C4](https://huggingface.co/datasets/c4) excluding any supervised
-training. Therefore, this model has to be fine-tuned before it is useable on a downstream task, unlike the original T5
+training. Therefore, this model has to be fine-tuned before it is usable on a downstream task, unlike the original T5
 model. Since t5v1.1 was pre-trained unsupervisedly, there's no real advantage to using a task prefix during single-task
 fine-tuning. If you are doing multi-task fine-tuning, you should use a prefix.

--- a/docs/source/en/model_doc/vision-text-dual-encoder.mdx
+++ b/docs/source/en/model_doc/vision-text-dual-encoder.mdx
@@ -21,7 +21,7 @@ downstream task. This model can be used to align the vision-text embeddings usin
 training and then can be used for zero-shot vision tasks such image-classification or retrieval.

 In [LiT: Zero-Shot Transfer with Locked-image Text Tuning](https://arxiv.org/abs/2111.07991) it is shown how
-leveraging pre-trained (locked/frozen) image and text model for contrastive learning yields significant improvment on
+leveraging pre-trained (locked/frozen) image and text model for contrastive learning yields significant improvement on
 new zero-shot vision tasks such as image classification or retrieval.

 ## VisionTextDualEncoderConfig