* make sure that logging_first_step evaluates
* fix bug with incorrect loss on logging_first_step
* fix style
* logging_first_step only logs, not evals
* Important files
* Styling them all
* Revert "Styling them all"
This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
* Syling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy
* Initial callback proposal
* Finish various callbacks
* Post-rebase conflicts
* Fix tests
* Don't use something that's not set
* Documentation
* Remove unwanted print.
* Document all models can work
* Add tests + small fixes
* Update docs/source/internal/trainer_utils.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* Fix TF tests
* Real fix this time
* This one should work
* Fix typo
* Really fix typo
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Add dataloader_num_workers to TrainingArguments
This argument is meant to be used to set the
number of workers for the PyTorch DataLoader.
* Pass num_workers argument on DataLoader init
* Add optuna hyperparameter search to Trainer
* @julien-c suggestions
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Make compute_objective an arg function
* Formatting
* Rework to make it easier to add ray
* Formatting
* Initial support for Ray
* Formatting
* Polish and finalize
* Add trial id to checkpoint with Ray
* Smaller default
* Use GPU in ray if available
* Formatting
* Fix test
* Update install instruction
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Address review comments
* Formatting post-merge
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Add support for past states
* Style and forgotten self
* You mean, documenting is not enough? I have to actually add it too?
* Add memory support during evaluation
* Fix tests in eval and add TF support
* No need to change this line anymore
* add tpu and torchscipt for benchmark
* fix name in tests
* "fix email"
* make style
* better log message for tpu
* add more print and info for tpu
* allow possibility to print tpu metrics
* correct cpu usage
* fix test for non-install
* remove bugus file
* include psutil in testing
* run a couple of times before tracing in torchscript
* do not allow tpu memory tracing for now
* make style
* add torchscript to env
* better name for torch tpu
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
* Better None gradients handling
* Apply Style
* Apply Style
* Create a loss class per task to compute its respective loss
* Add loss classes to the ALBERT TF models
* Add loss classes to the BERT TF models
* Add question answering and multiple choice to TF Camembert
* Remove prints
* Add multiple choice model to TF DistilBERT + loss computation
* Add question answering model to TF Electra + loss computation
* Add token classification, question answering and multiple choice models to TF Flaubert
* Add multiple choice model to TF Roberta + loss computation
* Add multiple choice model to TF XLM + loss computation
* Add multiple choice and question answering models to TF XLM-Roberta
* Add multiple choice model to TF XLNet + loss computation
* Remove unused parameters
* Add task loss classes
* Reorder TF imports + add new model classes
* Add new model classes
* Bugfix in TF T5 model
* Bugfix for TF T5 tests
* Bugfix in TF T5 model
* Fix TF T5 model tests
* Fix T5 tests + some renaming
* Fix inheritance issue in the AutoX tests
* Add tests for TF Flaubert and TF XLM Roberta
* Add tests for TF Flaubert and TF XLM Roberta
* Remove unused piece of code in the TF trainer
* bugfix and remove unused code
* Bugfix for TF 2.2
* Apply Style
* Divide TFSequenceClassificationAndMultipleChoiceLoss into their two respective name
* Apply style
* Mirror the PT Trainer in the TF one: fp16, optimizers and tb_writer as class parameter and better dataset handling
* Fix TF optimizations tests and apply style
* Remove useless parameter
* Bugfix and apply style
* Fix TF Trainer prediction
* Now the TF models return the loss such as their PyTorch couterparts
* Apply Style
* Ignore some tests output
* Take into account the SQuAD cls_index, p_mask and is_impossible parameters for the QuestionAnswering task models.
* Fix names for SQuAD data
* Apply Style
* Fix conflicts with 2.11 release
* Fix conflicts with 2.11
* Fix wrongname
* Add better documentation on the new create_optimizer function
* Fix isort
* logging_dir: use same default as PyTorch
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* doc
* [tests] Add sample files for a regression task
* [HUGE] Trainer
* Feedback from @sshleifer
* Feedback from @thomwolf + logging tweak
* [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes
* [glue] Use default max_seq_length of 128 like before
* [glue] move DataTrainingArguments around
* [ner] Change interface of InputExample, and align run_{tf,pl}
* Re-align the pl scripts a little bit
* ner
* [ner] Add integration test
* Fix language_modeling with API tweak
* [ci] Tweak loss target
* Don't break console output
* amp.initialize: model must be on right device before
* [multiple-choice] update for Trainer
* Re-align to 827d6d6ef0
* [examples] Generate argparsers from type hints on dataclasses
* [HfArgumentParser] way simpler API
* Restore run_language_modeling.py for easier diff
* [HfArgumentParser] final tweaks from code review