Update all references to canonical models (#29001)
* Script & Manual edition * Update
This commit is contained in:
@@ -41,7 +41,7 @@ and you also will find examples of these below.
|
||||
Here is an example on a summarization task:
|
||||
```bash
|
||||
python examples/pytorch/summarization/run_summarization.py \
|
||||
--model_name_or_path t5-small \
|
||||
--model_name_or_path google-t5/t5-small \
|
||||
--do_train \
|
||||
--do_eval \
|
||||
--dataset_name cnn_dailymail \
|
||||
@@ -54,9 +54,9 @@ python examples/pytorch/summarization/run_summarization.py \
|
||||
--predict_with_generate
|
||||
```
|
||||
|
||||
Only T5 models `t5-small`, `t5-base`, `t5-large`, `t5-3b` and `t5-11b` must use an additional argument: `--source_prefix "summarize: "`.
|
||||
Only T5 models `google-t5/t5-small`, `google-t5/t5-base`, `google-t5/t5-large`, `google-t5/t5-3b` and `google-t5/t5-11b` must use an additional argument: `--source_prefix "summarize: "`.
|
||||
|
||||
We used CNN/DailyMail dataset in this example as `t5-small` was trained on it and one can get good scores even when pre-training with a very small sample.
|
||||
We used CNN/DailyMail dataset in this example as `google-t5/t5-small` was trained on it and one can get good scores even when pre-training with a very small sample.
|
||||
|
||||
Extreme Summarization (XSum) Dataset is another commonly used dataset for the task of summarization. To use it replace `--dataset_name cnn_dailymail --dataset_config "3.0.0"` with `--dataset_name xsum`.
|
||||
|
||||
@@ -65,7 +65,7 @@ And here is how you would use it on your own files, after adjusting the values f
|
||||
|
||||
```bash
|
||||
python examples/pytorch/summarization/run_summarization.py \
|
||||
--model_name_or_path t5-small \
|
||||
--model_name_or_path google-t5/t5-small \
|
||||
--do_train \
|
||||
--do_eval \
|
||||
--train_file path_to_csv_or_jsonlines_file \
|
||||
@@ -156,7 +156,7 @@ then
|
||||
|
||||
```bash
|
||||
python run_summarization_no_trainer.py \
|
||||
--model_name_or_path t5-small \
|
||||
--model_name_or_path google-t5/t5-small \
|
||||
--dataset_name cnn_dailymail \
|
||||
--dataset_config "3.0.0" \
|
||||
--source_prefix "summarize: " \
|
||||
@@ -179,7 +179,7 @@ that will check everything is ready for training. Finally, you can launch traini
|
||||
|
||||
```bash
|
||||
accelerate launch run_summarization_no_trainer.py \
|
||||
--model_name_or_path t5-small \
|
||||
--model_name_or_path google-t5/t5-small \
|
||||
--dataset_name cnn_dailymail \
|
||||
--dataset_config "3.0.0" \
|
||||
--source_prefix "summarize: " \
|
||||
|
||||
@@ -368,11 +368,11 @@ def main():
|
||||
logger.info(f"Training/evaluation parameters {training_args}")
|
||||
|
||||
if data_args.source_prefix is None and model_args.model_name_or_path in [
|
||||
"t5-small",
|
||||
"t5-base",
|
||||
"t5-large",
|
||||
"t5-3b",
|
||||
"t5-11b",
|
||||
"google-t5/t5-small",
|
||||
"google-t5/t5-base",
|
||||
"google-t5/t5-large",
|
||||
"google-t5/t5-3b",
|
||||
"google-t5/t5-11b",
|
||||
]:
|
||||
logger.warning(
|
||||
"You're running a t5 model but didn't provide a source prefix, which is the expected, e.g. with "
|
||||
|
||||
@@ -339,11 +339,11 @@ def main():
|
||||
|
||||
accelerator = Accelerator(gradient_accumulation_steps=args.gradient_accumulation_steps, **accelerator_log_kwargs)
|
||||
if args.source_prefix is None and args.model_name_or_path in [
|
||||
"t5-small",
|
||||
"t5-base",
|
||||
"t5-large",
|
||||
"t5-3b",
|
||||
"t5-11b",
|
||||
"google-t5/t5-small",
|
||||
"google-t5/t5-base",
|
||||
"google-t5/t5-large",
|
||||
"google-t5/t5-3b",
|
||||
"google-t5/t5-11b",
|
||||
]:
|
||||
logger.warning(
|
||||
"You're running a t5 model but didn't provide a source prefix, which is the expected, e.g. with "
|
||||
|
||||
Reference in New Issue
Block a user