docs: Resolve many typos in the English docs (#20088)
* docs: Fix typo in ONNX parser help: 'tolerence' => 'tolerance' * docs: Resolve many typos in the English docs Typos found via 'codespell ./docs/source/en'
This commit is contained in:
@@ -32,7 +32,7 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
|
||||
|
||||
<PipelineTag pipeline="text-generation"/>
|
||||
|
||||
- [`BloomForCausalLM`] is suppported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
|
||||
- [`BloomForCausalLM`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
|
||||
|
||||
⚡️ Inference
|
||||
- A blog on [Optimization story: Bloom inference](https://huggingface.co/blog/bloom-inference-optimization).
|
||||
|
||||
@@ -61,7 +61,7 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
|
||||
- A notebook on how to [finetune GPT2 to generate lyrics in the style of your favorite artist](https://colab.research.google.com/github/AlekseyKorshuk/huggingartists/blob/master/huggingartists-demo.ipynb). 🌎
|
||||
- A notebook on how to [finetune GPT2 to generate tweets in the style of your favorite Twitter user](https://colab.research.google.com/github/borisdayma/huggingtweets/blob/master/huggingtweets-demo.ipynb). 🌎
|
||||
- [Causal language modeling](https://huggingface.co/course/en/chapter7/6?fw=pt#training-a-causal-language-model-from-scratch) chapter of the 🤗 Hugging Face Course.
|
||||
- [`GPT2LMHeadModel`] is suppported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling), [text generation example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-generation), and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
|
||||
- [`GPT2LMHeadModel`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling), [text generation example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-generation), and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
|
||||
- [`TFGPT2LMHeadModel`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/language-modeling#run_clmpy) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb).
|
||||
- [`FlaxGPT2LMHeadModel`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/flax/language-modeling#causal-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/causal_language_modeling_flax.ipynb).
|
||||
|
||||
|
||||
@@ -47,7 +47,7 @@ Tips:
|
||||
that could be found [here](https://github.com/kingoflolz/mesh-transformer-jax/blob/master/howto_finetune.md)
|
||||
|
||||
- Although the embedding matrix has a size of 50400, only 50257 entries are used by the GPT-2 tokenizer. These extra
|
||||
tokens are added for the sake of efficiency on TPUs. To avoid the mis-match between embedding matrix size and vocab
|
||||
tokens are added for the sake of efficiency on TPUs. To avoid the mismatch between embedding matrix size and vocab
|
||||
size, the tokenizer for [GPT-J](https://huggingface.co/EleutherAI/gpt-j-6B) contains 143 extra tokens
|
||||
`<|extratoken_1|>... <|extratoken_143|>`, so the `vocab_size` of tokenizer also becomes 50400.
|
||||
|
||||
|
||||
@@ -40,7 +40,7 @@ This model was contributed by [beltagy](https://huggingface.co/beltagy). The Aut
|
||||
|
||||
Longformer self attention employs self attention on both a "local" context and a "global" context. Most tokens only
|
||||
attend "locally" to each other meaning that each token attends to its \\(\frac{1}{2} w\\) previous tokens and
|
||||
\\(\frac{1}{2} w\\) succeding tokens with \\(w\\) being the window length as defined in
|
||||
\\(\frac{1}{2} w\\) succeeding tokens with \\(w\\) being the window length as defined in
|
||||
`config.attention_window`. Note that `config.attention_window` can be of type `List` to define a
|
||||
different \\(w\\) for each layer. A selected few tokens attend "globally" to all other tokens, as it is
|
||||
conventionally done for all tokens in `BertSelfAttention`.
|
||||
|
||||
@@ -26,7 +26,7 @@ Tips:
|
||||
- One can use [`MobileViTFeatureExtractor`] to prepare images for the model. Note that if you do your own preprocessing, the pretrained checkpoints expect images to be in BGR pixel order (not RGB).
|
||||
- The available image classification checkpoints are pre-trained on [ImageNet-1k](https://huggingface.co/datasets/imagenet-1k) (also referred to as ILSVRC 2012, a collection of 1.3 million images and 1,000 classes).
|
||||
- The segmentation model uses a [DeepLabV3](https://arxiv.org/abs/1706.05587) head. The available semantic segmentation checkpoints are pre-trained on [PASCAL VOC](http://host.robots.ox.ac.uk/pascal/VOC/).
|
||||
- As the name suggests MobileViT was desgined to be performant and efficient on mobile phones. The TensorFlow versions of the MobileViT models are fully compatible with [TensorFlow Lite](https://www.tensorflow.org/lite).
|
||||
- As the name suggests MobileViT was designed to be performant and efficient on mobile phones. The TensorFlow versions of the MobileViT models are fully compatible with [TensorFlow Lite](https://www.tensorflow.org/lite).
|
||||
|
||||
You can use the following code to convert a MobileViT checkpoint (be it image classification or semantic segmentation) to generate a
|
||||
TensorFlow Lite model:
|
||||
|
||||
@@ -28,7 +28,7 @@ generative model chooses to (partially) translate its prediction into the wrong
|
||||
checkpoints used in this work are publicly available.*
|
||||
|
||||
Note: mT5 was only pre-trained on [mC4](https://huggingface.co/datasets/mc4) excluding any supervised training.
|
||||
Therefore, this model has to be fine-tuned before it is useable on a downstream task, unlike the original T5 model.
|
||||
Therefore, this model has to be fine-tuned before it is usable on a downstream task, unlike the original T5 model.
|
||||
Since mT5 was pre-trained unsupervisedly, there's no real advantage to using a task prefix during single-task
|
||||
fine-tuning. If you are doing multi-task fine-tuning, you should use a prefix.
|
||||
|
||||
|
||||
@@ -41,7 +41,7 @@ Tips:
|
||||
- If you plan on using Splinter outside *run_qa.py*, please keep in mind the question token - it might be important for
|
||||
the success of your model, especially in a few-shot setting.
|
||||
- Please note there are two different checkpoints for each size of Splinter. Both are basically the same, except that
|
||||
one also has the pretrained wights of the QASS layer (*tau/splinter-base-qass* and *tau/splinter-large-qass*) and one
|
||||
one also has the pretrained weights of the QASS layer (*tau/splinter-base-qass* and *tau/splinter-large-qass*) and one
|
||||
doesn't (*tau/splinter-base* and *tau/splinter-large*). This is done to support randomly initializing this layer at
|
||||
fine-tuning, as it is shown to yield better results for some cases in the paper.
|
||||
|
||||
|
||||
@@ -39,7 +39,7 @@ T5 Version 1.1 includes the following improvements compared to the original T5 m
|
||||
`num_heads` and `d_ff`.
|
||||
|
||||
Note: T5 Version 1.1 was only pre-trained on [C4](https://huggingface.co/datasets/c4) excluding any supervised
|
||||
training. Therefore, this model has to be fine-tuned before it is useable on a downstream task, unlike the original T5
|
||||
training. Therefore, this model has to be fine-tuned before it is usable on a downstream task, unlike the original T5
|
||||
model. Since t5v1.1 was pre-trained unsupervisedly, there's no real advantage to using a task prefix during single-task
|
||||
fine-tuning. If you are doing multi-task fine-tuning, you should use a prefix.
|
||||
|
||||
|
||||
@@ -21,7 +21,7 @@ downstream task. This model can be used to align the vision-text embeddings usin
|
||||
training and then can be used for zero-shot vision tasks such image-classification or retrieval.
|
||||
|
||||
In [LiT: Zero-Shot Transfer with Locked-image Text Tuning](https://arxiv.org/abs/2111.07991) it is shown how
|
||||
leveraging pre-trained (locked/frozen) image and text model for contrastive learning yields significant improvment on
|
||||
leveraging pre-trained (locked/frozen) image and text model for contrastive learning yields significant improvement on
|
||||
new zero-shot vision tasks such as image classification or retrieval.
|
||||
|
||||
## VisionTextDualEncoderConfig
|
||||
|
||||
Reference in New Issue
Block a user