Examples
Version 2.9 of 🤗 Transformers introduces a new Trainer class for PyTorch, and its equivalent TFTrainer for TF 2.
Running the examples requires PyTorch 1.3.1+ or TensorFlow 2.2+.
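To check an environment against these minimums before launching a script, a small version comparison can help. The following is a minimal sketch: the helper names `parse_version` and `meets_minimum` are hypothetical (not part of the library), and the distribution names `torch` and `tensorflow` passed to `importlib.metadata` are an assumption about how the frameworks were installed.

```python
import re
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Turn '1.3.1' or '2.2.0+cu101' into a tuple of ints such as (1, 3, 1)."""
    parts = []
    # Drop any local build suffix after '+', then read the leading digits of each piece.
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Minimums taken from the sentence above; distribution names are assumed.
    for dist, minimum in (("torch", "1.3.1"), ("tensorflow", "2.2")):
        try:
            installed = metadata.version(dist)
        except metadata.PackageNotFoundError:
            print(f"{dist}: not installed")
            continue
        status = "ok" if meets_minimum(installed, minimum) else f"needs {minimum}+"
        print(f"{dist} {installed}: {status}")
```

Only one of the two frameworks needs to pass, since each example targets either PyTorch or TensorFlow.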
Here is the list of all our examples:
- grouped by task (all official examples work for multiple models)
- with information on whether they are built on top of Trainer/TFTrainer (if not, they still work, they might just lack some features),
- whether they also include examples for pytorch-lightning, which is a great fully-featured, general-purpose training library for PyTorch,
- links to Colab notebooks to walk through the scripts and run them easily,
- links to Cloud deployments to be able to deploy large-scale trainings in the Cloud with little to no setup.
This is still a work-in-progress – in particular documentation is still sparse – so please contribute improvements/pull requests.
The Big Table of Tasks
| Task | Example datasets | Trainer support | TFTrainer support | pytorch-lightning | Colab |
|---|---|---|---|---|---|
| language-modeling | Raw text | ✅ | - | - | |
| text-classification | GLUE, XNLI | ✅ | ✅ | ✅ | |
| token-classification | CoNLL NER | ✅ | ✅ | ✅ | - |
| multiple-choice | SWAG, RACE, ARC | ✅ | ✅ | - | |
| question-answering | SQuAD | ✅ | ✅ | - | - |
| text-generation | - | n/a | n/a | n/a | |
| distillation | All | - | - | - | - |
| summarization | CNN/Daily Mail | - | - | ✅ | - |
| translation | WMT | - | - | ✅ | - |
| bertology | - | - | - | - | - |
| adversarial | HANS | ✅ | - | - | - |
Important note
Important: To make sure you can successfully run the latest versions of the example scripts, you have to install the library from source and install some example-specific requirements. Execute the following steps in a new virtual environment:
git clone https://github.com/huggingface/transformers
cd transformers
pip install .
pip install -r ./examples/requirements.txt
One-click Deploy to Cloud (wip)
Azure
Running on TPUs
When using TensorFlow, TPUs are supported out of the box as a tf.distribute.Strategy.
When using PyTorch, we support TPUs thanks to pytorch/xla. For more context and information on how to set up your TPU environment, refer to Google's documentation and to the very detailed pytorch/xla README.
In this repo, we provide a very simple launcher script named xla_spawn.py that lets you run our example scripts on multiple TPU cores without any boilerplate.
Just pass a --num_cores flag to this script, then your regular training script with its arguments (this is similar to the torch.distributed.launch helper for torch.distributed).
For example, for run_glue:
python examples/xla_spawn.py --num_cores 8 \
examples/text-classification/run_glue.py \
--model_name_or_path bert-base-cased \
--task_name mnli \
--data_dir ./data/glue_data/MNLI \
--output_dir ./models/tpu \
--overwrite_output_dir \
--do_train \
--do_eval \
--num_train_epochs 1 \
--save_steps 20000
Feedback, additional use cases, and benchmarks involving TPUs are welcome; please share them with the community.
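The command-line convention described above (the launcher's own flag first, then the training script, then that script's arguments passed through untouched) can be sketched with argparse. This is an illustration under stated assumptions, not the actual contents of xla_spawn.py: the function names `parse_launcher_args` and `build_script_argv` are hypothetical, and the step that spawns one process per TPU core through pytorch/xla is deliberately left out.

```python
import argparse


def parse_launcher_args(argv):
    """Split a launcher command line into launcher options and script arguments."""
    parser = argparse.ArgumentParser(description="Sketch of a TPU launcher command line")
    # Option consumed by the launcher itself.
    parser.add_argument("--num_cores", type=int, default=1, help="Number of TPU cores to use")
    # First positional argument: the training script to run.
    parser.add_argument("training_script", type=str, help="Path to the training script")
    # Everything after the script path is kept verbatim for the training script.
    parser.add_argument("training_script_args", nargs=argparse.REMAINDER)
    return parser.parse_args(argv)


def build_script_argv(args):
    """Rebuild the argv list the training script would see if run directly."""
    return [args.training_script] + args.training_script_args


if __name__ == "__main__":
    parsed = parse_launcher_args([
        "--num_cores", "8",
        "examples/text-classification/run_glue.py",
        "--model_name_or_path", "bert-base-cased",
        "--task_name", "mnli",
        "--do_train",
    ])
    print(parsed.num_cores)
    print(build_script_argv(parsed))
```

Because the remaining arguments are captured as-is, the training script needs no TPU-specific argument handling of its own in this sketch.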
Logging & Experiment tracking
You can easily log and monitor your runs. The following tools are currently supported:
Weights & Biases
To use Weights & Biases, install the wandb package with:
pip install wandb
Then log in from the command line:
wandb login
If you are in Jupyter or Colab, you should log in with:
import wandb
wandb.login()
Whenever you use Trainer or TFTrainer classes, your losses, evaluation metrics, model topology and gradients (for Trainer only) will automatically be logged.
When using 🤗 Transformers with PyTorch Lightning, runs can be tracked through WandbLogger. Refer to the related documentation & examples.
Comet.ml
To use comet_ml, install the Python package with:
pip install comet_ml
or if in a Conda environment:
conda install -c comet_ml -c anaconda -c conda-forge comet_ml
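Before launching a long run, it can be useful to confirm which of the tracking packages above are importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `available_trackers` is hypothetical and the check says nothing about whether you are logged in to either service.

```python
import importlib.util

# Package names as installed by the commands above.
SUPPORTED_TRACKERS = ("wandb", "comet_ml")


def available_trackers(candidates=SUPPORTED_TRACKERS):
    """Map each package name to True if it can be imported, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in candidates}


if __name__ == "__main__":
    for name, found in available_trackers().items():
        print(f"{name}: {'installed' if found else 'not installed'}")
```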