[Docs] Fix spelling and grammar mistakes (#28825)
* Fix typos and grammar mistakes in docs and examples
* Fix typos in docstrings and comments
* Fix spelling of `tokenizer` in model tests
* Remove erroneous spaces in decorators
* Remove extra spaces in Markdown link texts
@@ -449,7 +449,7 @@ are 8 TPU cores on 4 chips (each chips has 2 cores), while "8 GPU" are 8 GPU chi
 
 For comparison one can run the same pre-training with PyTorch/XLA on TPU. To set up PyTorch/XLA on Cloud TPU VMs, please
 refer to [this](https://cloud.google.com/tpu/docs/pytorch-xla-ug-tpu-vm) guide.
-Having created the tokenzier and configuration in `norwegian-roberta-base`, we create the following symbolic links:
+Having created the tokenizer and configuration in `norwegian-roberta-base`, we create the following symbolic links:
 
 ```bash
 ln -s ~/transformers/examples/pytorch/language-modeling/run_mlm.py ./
@@ -499,7 +499,7 @@ python3 xla_spawn.py --num_cores ${NUM_TPUS} run_mlm.py --output_dir="./runs" \
 
 For comparison you can run the same pre-training with PyTorch on GPU. Note that we have to make use of `gradient_accumulation`
 because the maximum batch size that fits on a single V100 GPU is 32 instead of 128.
-Having created the tokenzier and configuration in `norwegian-roberta-base`, we create the following symbolic links:
+Having created the tokenizer and configuration in `norwegian-roberta-base`, we create the following symbolic links:
 
 ```bash
 ln -s ~/transformers/examples/pytorch/language-modeling/run_mlm.py ./
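Reviewer note on the context lines of the hunk above: they say gradient accumulation is needed because only a batch of 32 fits on one V100 instead of 128. The arithmetic can be sketched as follows, assuming the goal is to reach the same per-device batch of 128 by accumulating over smaller batches. The helper name `accumulation_steps` is hypothetical and is not part of the repository.

```python
def accumulation_steps(target_batch_size: int, fitting_batch_size: int) -> int:
    """Number of micro-batches to accumulate before one optimizer step.

    Hypothetical helper for illustration; assumes the target batch size
    is an exact multiple of the batch size that fits in memory.
    """
    if fitting_batch_size <= 0:
        raise ValueError("fitting_batch_size must be positive")
    if target_batch_size % fitting_batch_size != 0:
        raise ValueError("target batch size must be a multiple of the fitting batch size")
    return target_batch_size // fitting_batch_size


# Values taken from the README text: 128 wanted, 32 fits on a single V100.
steps = accumulation_steps(128, 32)
print(steps)
```

With these numbers the result is 4 accumulation steps per optimizer update.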
@@ -674,7 +674,7 @@ def main():
             raise ValueError("--do_train requires a train dataset")
         train_dataset = raw_datasets["train"]
         if data_args.max_train_samples is not None:
-            # We will select sample from whole data if agument is specified
+            # We will select sample from whole data if argument is specified
             max_train_samples = min(len(train_dataset), data_args.max_train_samples)
             train_dataset = train_dataset.select(range(max_train_samples))
         # Create train feature from dataset
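Reviewer note on the hunk above: the corrected comment describes capping the training split when `max_train_samples` is set. A minimal sketch of that capping logic follows, assuming a plain Python list stands in for the dataset (the script itself calls `train_dataset.select(range(max_train_samples))`). The helper name `cap_samples` is hypothetical.

```python
from typing import List, Optional


def cap_samples(dataset: List[int], max_samples: Optional[int]) -> List[int]:
    """Return at most `max_samples` leading items (hypothetical helper)."""
    if max_samples is None:
        # No cap requested: keep the whole split.
        return dataset
    # min() guards against asking for more rows than the split contains.
    n = min(len(dataset), max_samples)
    return dataset[:n]


rows = list(range(10))          # stand-in for the "train" split
capped = cap_samples(rows, 3)   # cap smaller than the split
uncapped = cap_samples(rows, None)
over = cap_samples(rows, 50)    # cap larger than the split
```

The `min(...)` call is what keeps an oversized `max_train_samples` from indexing past the end of the split.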
@@ -62,7 +62,7 @@ from transformers.utils.versions import require_version
 # Will error if the minimal version of Transformers is not installed. Remove at your own risk.
 check_min_version("4.38.0.dev0")
 
-require_version("datasets>=2.14.0", "To fix: pip install -r examples/flax/speech-recogintion/requirements.txt")
+require_version("datasets>=2.14.0", "To fix: pip install -r examples/flax/speech-recognition/requirements.txt")
 
 logger = logging.getLogger(__name__)
 
@@ -330,7 +330,7 @@ def main():
 
     # Initialize datasets and pre-processing transforms
     # We use torchvision here for faster pre-processing
-    # Note that here we are using some default pre-processing, for maximum accuray
+    # Note that here we are using some default pre-processing, for maximum accuracy
     # one should tune this part and carefully select what transformations to use.
     normalize = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
     train_dataset = torchvision.datasets.ImageFolder(