From 5d8eb93eeec0476b9f0fddc96f2960be0ce782b6 Mon Sep 17 00:00:00 2001
From: hugo-syn <61210734+hugo-syn@users.noreply.github.com>
Date: Thu, 18 Jan 2024 14:35:09 +0100
Subject: [PATCH] chore: Fix multiple typos (#28574)

---
 examples/research_projects/codeparrot/README.md                 | 2 +-
 examples/research_projects/jax-projects/README.md               | 2 +-
 examples/research_projects/jax-projects/big_bird/README.md      | 2 +-
 .../research_projects/jax-projects/model_parallel/README.md     | 2 +-
 examples/research_projects/mlm_wwm/README.md                    | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/examples/research_projects/codeparrot/README.md b/examples/research_projects/codeparrot/README.md
index 6c57c4350f..3259041ba5 100644
--- a/examples/research_projects/codeparrot/README.md
+++ b/examples/research_projects/codeparrot/README.md
@@ -50,7 +50,7 @@ The raw dataset contains many duplicates. We deduplicated and filtered the datas
 - fraction of alphanumeric characters < 0.25
 - containing the word "auto-generated" or similar in the first 5 lines
 - filtering with a probability of 0.7 of files with a mention of "test file" or "configuration file" or similar in the first 5 lines
-- filtering with a probability of 0.7 of files with high occurence of the keywords "test " or "config" 
+- filtering with a probability of 0.7 of files with high occurrence of the keywords "test " or "config" 
 - filtering with a probability of 0.7  of files without a mention of the keywords `def` , `for`, `while`  and `class`
 - filtering files that use the assignment operator `=` less than 5 times 
 - filtering files with ratio between number of characters and number of tokens after tokenization < 1.5 (the average ratio is 3.6)
diff --git a/examples/research_projects/jax-projects/README.md b/examples/research_projects/jax-projects/README.md
index 420a97f768..71f9a7a4e0 100644
--- a/examples/research_projects/jax-projects/README.md
+++ b/examples/research_projects/jax-projects/README.md
@@ -1153,7 +1153,7 @@ In the following, we will describe how to do so using a standard console, but yo
 2. Once you've installed the google cloud sdk, you should set your account by running the following command. Make sure that `<your-email-address>` corresponds to the gmail address you used to sign up for this event.
 
 ```bash
-$ gcloud config set account <your-email-adress>
+$ gcloud config set account <your-email-address>
 ```
 
 3. Let's also make sure the correct project is set in case your email is used for multiple gcloud projects:
diff --git a/examples/research_projects/jax-projects/big_bird/README.md b/examples/research_projects/jax-projects/big_bird/README.md
index e8ef274bbe..42586e4958 100644
--- a/examples/research_projects/jax-projects/big_bird/README.md
+++ b/examples/research_projects/jax-projects/big_bird/README.md
@@ -57,4 +57,4 @@ wget https://huggingface.co/datasets/vasudevgupta/natural-questions-validation/r
 python3 evaluate.py
 ```
 
-You can find our checkpoint on HuggingFace Hub ([see this](https://huggingface.co/vasudevgupta/flax-bigbird-natural-questions)). In case you are interested in PyTorch BigBird fine-tuning, you can refer to [this repositary](https://github.com/thevasudevgupta/bigbird).
+You can find our checkpoint on HuggingFace Hub ([see this](https://huggingface.co/vasudevgupta/flax-bigbird-natural-questions)). In case you are interested in PyTorch BigBird fine-tuning, you can refer to [this repository](https://github.com/thevasudevgupta/bigbird).
diff --git a/examples/research_projects/jax-projects/model_parallel/README.md b/examples/research_projects/jax-projects/model_parallel/README.md
index b63b93862d..97f3cdb047 100644
--- a/examples/research_projects/jax-projects/model_parallel/README.md
+++ b/examples/research_projects/jax-projects/model_parallel/README.md
@@ -27,7 +27,7 @@ To adapt the script for other models, we need to also change the `ParitionSpec`
 
 TODO: Add more explantion.
 
-Before training, let's prepare our model first. To be able to shard the model, the sharded dimention needs to be a multiple of devices it'll be sharded on. But GPTNeo's vocab size is 50257, so we need to resize the embeddings accordingly. 
+Before training, let's prepare our model first. To be able to shard the model, the sharded dimension needs to be a multiple of devices it'll be sharded on. But GPTNeo's vocab size is 50257, so we need to resize the embeddings accordingly. 
 
 ```python
 from transformers import FlaxGPTNeoForCausalLM, GPTNeoConfig 
diff --git a/examples/research_projects/mlm_wwm/README.md b/examples/research_projects/mlm_wwm/README.md
index 9426be7c27..0144b1ad30 100644
--- a/examples/research_projects/mlm_wwm/README.md
+++ b/examples/research_projects/mlm_wwm/README.md
@@ -95,4 +95,4 @@ python run_mlm_wwm.py \
 
 **Note1:** On TPU, you should the flag `--pad_to_max_length` to make sure all your batches have the same length.
 
-**Note2:** And if you have any questions or something goes wrong when runing this code, don't hesitate to pin @wlhgtc.
+**Note2:** And if you have any questions or something goes wrong when running this code, don't hesitate to pin @wlhgtc.