* Fix bad examples
* Add black formatting to style_doc
* Use first nonempty line
* Put it at the right place
* Don't add spaces to empty lines
* Better templates
* Deal with triple quotes in docstrings
* Result of style_doc
* Enable mdx treatment and fix code examples in MDXs
* Result of doc styler on doc source files
* Last fixes
* Break copy from
* New doc styler
* Fix issue with args at the start
* Code sample fixes
* Style code examples in MDX
* Fix more patterns
* Typo
* Typo
* More patterns
* Do without black for now
* Get more info in error
* Docstring style
* Re-enable check
* Quality
* Fix add_end_docstring decorator
* Fix docstring
* Convert docstrings of all configurations and tokenizers
* Processors and fixes
* Last modeling files and fixes to models
* Pipeline modules
* Utils files
* Data submodule
* All the other files
* Style
* Missing examples
* Style again
* Fix copies
* Say bye bye to rst docstrings forever
This affects Adafactor with relative_step=False and scale_parameter=True.
Updating group["lr"] makes the result of ._get_lr() depends on the previous call,
i.e., on the scale of other parameters. This isn't supposed to happen.
* Add label smoothing in Trainer
* Add options for scheduler and Adafactor in Trainer
* Put Seq2SeqTrainer in the main lib
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments and adapt scripts
* Documentation
* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Important files
* Styling them all
* Revert "Styling them all"
This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
* Syling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy
* AdaFactor optimizer ported from fairseq. Tested for T5 finetuning and MLM -- reduced memory consumption compared to ADAM.
* update PR fixes, add basic test
* bug -- incorrect params in test
* bugfix -- import Adafactor into test
* bugfix -- removed accidental T5 include
* resetting T5 to master
* bugfix -- include Adafactor in __init__
* longer loop for adafactor test
* remove double error class declare
* lint
* black
* isort
* Update src/transformers/optimization.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* single docstring
* Cleanup docstring
Co-authored-by: Nikolai Y <nikolai.yakovenko@point72.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* [wip] add get_polynomial_decay_schedule_with_warmup
* style
* add assert
* change lr_end to a much smaller default number
* check for exact equality
* [model_cards] electra-base-turkish-cased-ner (#6350)
* for electra-base-turkish-cased-ner
* Add metadata
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Temporarily de-activate TPU CI
* Update modeling_tf_utils.py (#6372)
fix typo: ckeckpoint->checkpoint
* the test now works again (#6371)
* correct pl link in readme (#6364)
* refactor almost identical tests (#6339)
* refactor almost identical tests
* important to add a clear assert error message
* make the assert error even more descriptive than the original bt
* Small docfile fixes (#6328)
* Patch models (#6326)
* TFAlbertFor{TokenClassification, MultipleChoice}
* Patch models
* BERT and TF BERT info
s
* Update check_repo
* Ci GitHub caching (#6382)
* Cache Github Actions CI
* Remove useless file
* Colab button (#6389)
* Add colab button
* Add colab link for tutorials
* Fix links for open in colab (#6391)
* Update src/transformers/optimization.py
consistently use lr_end=1e-7 default
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* [wip] add get_polynomial_decay_schedule_with_warmup
* style
* add assert
* change lr_end to a much smaller default number
* check for exact equality
* Update src/transformers/optimization.py
consistently use lr_end=1e-7 default
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* remove dup (leftover from merge)
* convert the test into the new refactored format
* stick to using the current_step as is, without ++
Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Alexander Measure <ameasure@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
This prevents transformers from being importable simply because the CWD
is the root of the git repository, while not being importable from other
directories. That led to inconsistent behavior, especially in examples.
Once you fetch this commit, in your dev environment, you must run:
$ pip uninstall transformers
$ pip install -e .