From 97527898da1097df83cb47256e64abf53c113adb Mon Sep 17 00:00:00 2001
From: Jacob
Date: Mon, 12 Jun 2023 22:43:58 +0800
Subject: [PATCH] typo: fix typos in CONTRIBUTING.md and deepspeed.mdx (#24184)

* typo: fix typos in CONTRIBUTING.md and deepspeed.mdx

* Update CONTRIBUTING.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---
 CONTRIBUTING.md                           | 4 ++--
 docs/source/en/main_classes/deepspeed.mdx | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 9635ae09d7..1c2b896815 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -275,7 +275,7 @@ You'll need **[Python 3.7]((https://github.com/huggingface/transformers/blob/mai
 request description to make sure they are linked (and people viewing the issue know you are working on it).
☐ To indicate a work in progress please prefix the title with `[WIP]`. These are
-useful to avoid duplicated work, and to differentiate it from PRs ready to be merged.
+useful to avoid duplicated work, and to differentiate it from PRs ready to be merged.
☐ Make sure existing tests pass.
☐ If adding a new feature, also add tests for it.
 - If you are adding a new model, make sure you use
@@ -284,7 +284,7 @@ useful to avoid duplicated work, and to differentiate it from PRs ready to be me
 `RUN_SLOW=1 python -m pytest tests/models/my_new_model/test_my_new_model.py`.
 - If you are adding a new tokenizer, write tests and make sure
 `RUN_SLOW=1 python -m pytest tests/models/{your_model_name}/test_tokenization_{your_model_name}.py` passes.
- CircleCI does not run the slow tests, but GitHub Actions does every night!
+ - CircleCI does not run the slow tests, but GitHub Actions does every night!
☐ All public methods must have informative docstrings (see [`modeling_bert.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/models/bert/modeling_bert.py)
diff --git a/docs/source/en/main_classes/deepspeed.mdx b/docs/source/en/main_classes/deepspeed.mdx
index 2c78078d60..3733b0a1b8 100644
--- a/docs/source/en/main_classes/deepspeed.mdx
+++ b/docs/source/en/main_classes/deepspeed.mdx
@@ -760,7 +760,7 @@ time. "reuse distance" is a metric we are using to figure out when will a parame
use the `stage3_max_reuse_distance` to decide whether to throw away the parameter or to keep it. If a parameter is going to be used again in near future (less than `stage3_max_reuse_distance`) then we keep it to reduce communication overhead. This is super helpful when you have activation checkpointing enabled, where we do a forward recompute and
-backward passes a a single layer granularity and want to keep the parameter in the forward recompute till the backward
+backward passes a single layer granularity and want to keep the parameter in the forward recompute till the backward
The following configuration values depend on the model's hidden size: