Commit Graph

17 Commits

Author SHA1 Message Date
Sylvain Gugger
d2f9cb838e Fix in Adafactor docstrings (#6845) 2020-08-31 10:52:47 -04:00
Lysandre Debut
41aa2b4ef1 Adafactor docs (#6765) 2020-08-27 05:16:50 -04:00
Nikolai Yakovenko
971d1802d0 Add AdaFactor optimizer from fairseq (#6722)
* AdaFactor optimizer ported from fairseq. Tested for T5 finetuning and MLM -- reduced memory consumption compared to ADAM.

* update PR fixes, add basic test

* bug -- incorrect params in test

* bugfix -- import Adafactor into test

* bugfix -- removed accidental T5 include

* resetting T5 to master

* bugfix -- include Adafactor in __init__

* longer loop for adafactor test

* remove double error class declare

* lint

* black

* isort

* Update src/transformers/optimization.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* single docstring

* Cleanup docstring

Co-authored-by: Nikolai Y <nikolai.yakovenko@point72.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-27 04:58:13 -04:00
Lysandre Debut
77abd1e79f Centralize logging (#6434)
* Logging

* Style

* hf_logging > utils.logging

* Address @thomwolf's comments

* Update test

* Update src/transformers/benchmark/benchmark_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Revert bad change

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-26 11:10:36 -04:00
Stas Bekman
39c3b1d9de [sched] polynomial_decay_schedule use default power=1.0 (#6473) 2020-08-17 08:33:12 -04:00
Stas Bekman
ece0903e11 lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361)
* [wip] add get_polynomial_decay_schedule_with_warmup

* style

* add assert

* change lr_end to a much smaller default number

* check for exact equality

* [model_cards] electra-base-turkish-cased-ner (#6350)

* for electra-base-turkish-cased-ner

* Add metadata

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Temporarily de-activate TPU CI

* Update modeling_tf_utils.py (#6372)

fix typo: ckeckpoint->checkpoint

* the test now works again (#6371)

* correct pl link in readme (#6364)

* refactor almost identical tests (#6339)

* refactor almost identical tests

* important to add a clear assert error message

* make the assert error even more descriptive than the original bt

* Small docfile fixes (#6328)

* Patch models (#6326)

* TFAlbertFor{TokenClassification, MultipleChoice}

* Patch models

* BERT and TF BERT info


s

* Update check_repo

* Ci GitHub caching (#6382)

* Cache Github Actions CI

* Remove useless file

* Colab button (#6389)

* Add colab button

* Add colab link for tutorials

* Fix links for open in colab (#6391)

* Update src/transformers/optimization.py

consistently use lr_end=1e-7 default

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* [wip] add get_polynomial_decay_schedule_with_warmup

* style

* add assert

* change lr_end to a much smaller default number

* check for exact equality

* Update src/transformers/optimization.py

consistently use lr_end=1e-7 default

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove dup (leftover from merge)

* convert the test into the new refactored format

* stick to using the current_step as is, without ++

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Alexander Measure <ameasure@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-08-11 17:56:41 -04:00
Ninnart Fuengfusin
24c5a6e351 Update optimization.py (#6261) 2020-08-05 07:34:57 -04:00
Sylvain Gugger
4ade7491f4 Fix examples titles and optimization doc page (#5408) 2020-07-01 08:11:25 -04:00
Cola
eacea530c1 🚨 Remove warning of deprecation (#4477)
Remove warning of deprecated overload of addcdiv_

Fix #4451
2020-05-20 16:48:29 -04:00
Julien Chaumond
3e0f062106 Fix addcmul_ 2020-05-15 17:44:17 -04:00
Julien Chaumond
fc2a4c88ce Fix: one more try 2020-05-15 17:38:48 -04:00
Julien Chaumond
55bda52555 Same fix for addcmul_ 2020-05-15 17:23:48 -04:00
Julien Chaumond
ad02c961c6 Fix UserWarning: This overload of add_ is deprecated in pytorch==1.5.0 2020-05-15 17:09:11 -04:00
Julien Chaumond
83a41d39b3 💄 super 2020-01-15 18:33:50 -05:00
alberduris
81d6841b4b GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
Aymeric Augustin
6be7cdda66 Move source code inside a src subdirectory.
This prevents transformers from being importable simply because the CWD
is the root of the git repository, while not being importable from other
directories. That led to inconsistent behavior, especially in examples.

Once you fetch this commit, in your dev environment, you must run:

    $ pip uninstall transformers
    $ pip install -e .
2019-12-22 14:15:13 +01:00