Files

Shashank Gupta 3dcb748e31 Added data collator for permutation (XLNet) language modeling and related calls (#5522)

* Added data collator for XLNet language modeling and related calls

Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
to generate necessary inputs for language modeling training with
XLNetLMHeadModel. Also added related arguments, logic and calls in
examples/language-modeling/run_language_modeling.py.

Resolves: #4739, #2008 (partially)

* Changed name to `DataCollatorForPermutationLanguageModeling`

Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`.
Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
CTRL uses a CLM loss just like GPT and GPT-2, so it should work out of the box with this script (provided `past` is handled
similarly to `mems` for XLNet).
Changed calls and imports appropriately.

* Added detailed comments, changed variable names

Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain how it works. Also cleaned up variable names and made them more informative.

* Added tests for new data collator

Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.

* Fixed styling issues

2020-07-07 10:17:37 +02:00

Name	Last commit	Date
adversarial	[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308)	2020-06-26 19:48:14 +02:00
benchmarking	[Docs] Benchmark docs (#5360)	2020-06-29 16:08:57 +02:00
bert-loses-patience	save_pretrained: mkdir(exist_ok=True) (#5258)	2020-06-28 14:53:47 -04:00
bertology	Make DataCollator a callable (#5015)	2020-06-15 11:58:33 -04:00
contrib	save_pretrained: mkdir(exist_ok=True) (#5258)	2020-06-28 14:53:47 -04:00
distillation	save_pretrained: mkdir(exist_ok=True) (#5258)	2020-06-28 14:53:47 -04:00
language-modeling	Added data collator for permutation (XLNet) language modeling and related calls (#5522)	2020-07-07 10:17:37 +02:00
longform-qa	[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308)	2020-06-26 19:48:14 +02:00
movement-pruning	save_pretrained: mkdir(exist_ok=True) (#5258)	2020-06-28 14:53:47 -04:00
multiple-choice	Clean up diffs in Trainer/TFTrainer (#5417)	2020-07-01 11:00:20 -04:00
question-answering	Clean up diffs in Trainer/TFTrainer (#5417)	2020-07-01 11:00:20 -04:00
seq2seq	Move tests/utils.py -> transformers/testing_utils.py (#5350)	2020-07-01 10:31:17 -04:00
text-classification	Clean up diffs in Trainer/TFTrainer (#5417)	2020-07-01 11:00:20 -04:00
text-generation	The add_space_before_punct_symbol is only for TransfoXL (#5549)	2020-07-06 12:17:05 -04:00
token-classification	Clean up diffs in Trainer/TFTrainer (#5417)	2020-07-01 11:00:20 -04:00
lightning_base.py	[pl_examples] default warmup steps=0 (#5316)	2020-06-26 15:03:41 -04:00
README.md	Fix examples titles and optimization doc page (#5408)	2020-07-01 08:11:25 -04:00
requirements.txt	Upgrade examples to pl=0.8.1(#5146)	2020-06-22 20:40:10 -04:00
test_examples.py	[cleanup] examples test_run_squad uses tiny model (#5059)	2020-06-16 14:06:45 -04:00
xla_spawn.py	[TPU] Doc, fix xla_spawn.py, only preprocess dataset once (#4223)	2020-05-08 14:10:05 -04:00

README.md

Examples

Version 2.9 of 🤗 Transformers introduces a new Trainer class for PyTorch, and its equivalent TFTrainer for TF 2. Running the examples requires PyTorch 1.3.1+ or TensorFlow 2.1+.

Here is the list of all our examples:

- grouped by task (all official examples work for multiple models),
- with information on whether they are built on top of Trainer/TFTrainer (if not, they still work but might lack some features),
- whether they also include examples for pytorch-lightning, which is a great fully-featured, general-purpose training library for PyTorch,
- links to Colab notebooks to walk through the scripts and run them easily,
- links to Cloud deployments to be able to deploy large-scale trainings in the Cloud with little to no setup.

This is still a work-in-progress – in particular documentation is still sparse – so please contribute improvements/pull requests.

The Big Table of Tasks

Task	Example datasets	Trainer support	TFTrainer support	pytorch-lightning	Colab
`language-modeling`	Raw text	✅	-	-
`text-classification`	GLUE, XNLI	✅	✅	✅
`token-classification`	CoNLL NER	✅	✅	✅	-
`multiple-choice`	SWAG, RACE, ARC	✅	✅	-
`question-answering`	SQuAD	-	✅	-	-
`text-generation`	-	n/a	n/a	n/a
`distillation`	All	-	-	-	-
`summarization`	CNN/Daily Mail	-	-	✅	-
`translation`	WMT	-	-	✅	-
`bertology`	-	-	-	-	-
`adversarial`	HANS	✅	-	-	-

Important note

Important: To make sure you can successfully run the latest versions of the example scripts, you have to install the library from source and install some example-specific requirements. Execute the following steps in a new virtual environment:

git clone https://github.com/huggingface/transformers
cd transformers
pip install .
pip install -r ./examples/requirements.txt
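
After installing, it can be useful to confirm that the environment meets the minimum framework versions stated at the top of this README (PyTorch 1.3.1+ or TensorFlow 2.1+). The following is a minimal sketch and not part of the repository: the helper names `version_tuple` and `meets_minimum` are hypothetical, and the parsing is deliberately simple (it drops local suffixes such as `+cu101` and treats a pre-release like its final release).

```python
import importlib
import re


def version_tuple(version, width=3):
    """Turn a version string such as '1.5.0+cu101' into (1, 5, 0).

    Hypothetical helper: anything after '+' is dropped, and missing
    components are padded with zeros so that '2.1' compares equal to '2.1.0'.
    """
    numbers = re.findall(r"\d+", str(version).split("+")[0])[:width]
    parts = [int(n) for n in numbers]
    return tuple(parts + [0] * (width - len(parts)))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # Minimum versions as stated in this README.
    for name, minimum in (("torch", "1.3.1"), ("tensorflow", "2.1")):
        try:
            module = importlib.import_module(name)
        except ImportError:
            print(f"{name}: not installed")
            continue
        status = "OK" if meets_minimum(module.__version__, minimum) else "too old"
        print(f"{name} {module.__version__}: {status} (needs >= {minimum})")
```

Only one of the two frameworks needs to be present, so a "not installed" line for the other is expected.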

One-click Deploy to Cloud (wip)

Azure

Running on TPUs

When using TensorFlow, TPUs are supported out of the box as a tf.distribute.Strategy.

When using PyTorch, we support TPUs thanks to pytorch/xla. For more context and information on how to set up your TPU environment, refer to Google's documentation and to the very detailed pytorch/xla README.

In this repo, we provide a very simple launcher script named xla_spawn.py that lets you run our example scripts on multiple TPU cores without any boilerplate. Just pass a --num_cores flag to this script, then your regular training script with its arguments (this is similar to the torch.distributed.launch helper for torch.distributed).

For example, for run_glue:

python examples/xla_spawn.py --num_cores 8 \
	examples/text-classification/run_glue.py \
	--model_name_or_path bert-base-cased \
	--task_name mnli \
	--data_dir ./data/glue_data/MNLI \
	--output_dir ./models/tpu \
	--overwrite_output_dir \
	--do_train \
	--do_eval \
	--num_train_epochs 1 \
	--save_steps 20000
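
The way such a launcher separates its own `--num_cores` option from the training script and its arguments can be sketched with `argparse`. This is an illustration under assumptions, not the actual contents of xla_spawn.py: the function names are hypothetical, and forwarding the core count to the training script as a `--tpu_num_cores` flag is an assumption made for the example.

```python
import argparse


def parse_launcher_args(argv):
    """Split a launcher command line into launcher options and the training command."""
    parser = argparse.ArgumentParser(description="Hypothetical TPU launcher sketch")
    parser.add_argument("--num_cores", type=int, default=1, help="Number of TPU cores to use")
    parser.add_argument("training_script", type=str, help="Path to the training script")
    # Everything after the script path is collected untouched, including flags.
    parser.add_argument("training_script_args", nargs=argparse.REMAINDER)
    return parser.parse_args(argv)


def build_child_argv(args):
    """Build the argv that the training script would see in each spawned process."""
    # Assumption: the core count is passed on as `--tpu_num_cores`.
    return [args.training_script] + args.training_script_args + ["--tpu_num_cores", str(args.num_cores)]


if __name__ == "__main__":
    example = [
        "--num_cores", "8",
        "examples/text-classification/run_glue.py",
        "--model_name_or_path", "bert-base-cased",
        "--task_name", "mnli",
        "--do_train",
    ]
    print(build_child_argv(parse_launcher_args(example)))
```

Collecting the remainder of the command line verbatim is what lets the launcher stay independent of any particular training script's flags.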

Feedback, additional use cases, and benchmarks involving TPUs are welcome; please share them with the community.
