HuggingFace_transformer

Author	SHA1	Message	Date
Sylvain Gugger	afe5d42d8d	Black preview (#17217 ) * Black preview * Fixup too! * Fix check copies * Use the same version as the CI * Bump black	2022-05-12 16:25:55 -04:00
Santiago Castro	e3d1a8dabc	Add a missing space in a deprecation message (#15651 )	2022-02-15 19:12:30 -05:00
Lysandre Debut	7732d0fe7a	Upgrade black to version ~=22.0 (#15565 ) * Upgrade black to version ~=22.0 * Check copies * Fix code	2022-02-09 09:28:57 -05:00
Manuel R. Ciosici	7b83feb50a	Deprecates AdamW and adds `--optim` (#14744 ) * Add AdamW deprecation warning * Add --optim to Trainer * Update src/transformers/optimization.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/optimization.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/optimization.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/optimization.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py * fix style * fix * Regroup adamws together Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Change --adafactor to --optim adafactor * Use Enum for optimizer values * fixup! Change --adafactor to --optim adafactor * fixup! Change --adafactor to --optim adafactor * fixup! Change --adafactor to --optim adafactor * fixup! Use Enum for optimizer values * Improved documentation for --adafactor Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Add mention of no_deprecation_warning Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename OptimizerOptions to OptimizerNames * Use choices for --optim * Move optimizer selection code to a function and add a unit test * Change optimizer names * Rename method Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename method Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Remove TODO comment Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename variable Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename variable Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename function * Rename variable * Parameterize the tests for supported optimizers * Refactor * Attempt to make tests pass on CircleCI * Add a test with apex * rework to add apex to parameterized; add actual train test * fix import when torch is not available * fix optim_test_params when torch is not available * fix optim_test_params when torch is not available * re-org * small re-org * fix test_fused_adam_no_apex * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove .value from OptimizerNames * Rename optimizer strings s\|--adam_\|--adamw_\| * Also rename Enum options * small fix * Fix instantiation of OptimizerNames. Remove redundant test * Use ExplicitEnum instead of Enum * Add unit test with string optimizer * Change optimizer default to string value Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-01-13 08:14:51 -08:00
Sylvain Gugger	b5e2b183af	Doc styler examples (#14953 ) * Fix bad examples * Add black formatting to style_doc * Use first nonempty line * Put it at the right place * Don't add spaces to empty lines * Better templates * Deal with triple quotes in docstrings * Result of style_doc * Enable mdx treatment and fix code examples in MDXs * Result of doc styler on doc source files * Last fixes * Break copy from	2021-12-27 19:07:46 -05:00
Stas Bekman	133c5e40c4	[doc] consistent True/False/None default format (#14951 ) * [doc] consistent True/False/None default format * Update src/transformers/models/xlnet/modeling_xlnet.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-27 14:31:40 -08:00
Sylvain Gugger	87e6e4fe5c	Doc styler v2 (#14950 ) * New doc styler * Fix issue with args at the start * Code sample fixes * Style code examples in MDX * Fix more patterns * Typo * Typo * More patterns * Do without black for now * Get more info in error * Docstring style * Re-enable check * Quality * Fix add_end_docstring decorator * Fix docstring	2021-12-27 16:31:21 -05:00
Sylvain Gugger	27b3031de2	Mass conversion of documentation from rst to Markdown (#14866 ) * Convert docstrings of all configurations and tokenizers * Processors and fixes * Last modeling files and fixes to models * Pipeline modules * Utils files * Data submodule * All the other files * Style * Missing examples * Style again * Fix copies * Say bye bye to rst docstrings forever	2021-12-21 15:06:33 -05:00
Zed	0062058399	Fix the value error typo of AdamW's betas' valid values checking (#14780 ) * Fix the value error typo of AdamW's betas value check * error fixed	2021-12-21 09:44:09 -05:00
Patrick von Platen	91f3dfbfdd	[Adafactor] Fix adafactor (#14713 ) * correct changes * add comment	2021-12-12 13:31:46 +01:00
Nishant Prabhu	225de5ccbb	Replace assert statement with if condition and ValueError (#13263 )	2021-08-25 12:14:03 -04:00
Stas Bekman	d6ea91c96a	fix pt-1.9.0 `add_` deprecation (#12217 ) * fix pt-1.9.0 add_ deprecation * add () for clarity * Trigger CI * require_version(torch	2021-06-17 08:53:59 -07:00
Stas Bekman	1ed2ebf60d	[style] consistent nn. and nn.functional (#12124 ) * consistent nn. and nn.functional * fix glitch * fix glitch #2	2021-06-14 09:44:28 -07:00
Stas Bekman	ff7c81687a	[optim] implement AdafactorSchedule (#12123 ) * implement AdafactorSchedule * typo * fix * Update src/transformers/optimization.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-14 09:43:48 -07:00
Eldar Kurtic	cf409e5594	Fix docstring typo (#11611 )	2021-05-06 17:09:28 +05:30
Josh	c301c26370	Fix Adafactor documentation (recommend correct settings) (#10526 ) * Update optimization.py Fix documentation to reflect optimal settings for Adafactor * update and expand on the recommendations * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * flip scale_parameter to True for the 2nd recommendatoin Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-03-31 21:03:38 -07:00
Sylvain Gugger	acc3bd9d2a	Enforce string-formatting with f-strings (#10980 ) * First third * Styling and fix mistake * Quality * All the rest * Treat %s and %d * typo * Missing ) * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-03-31 10:00:27 -04:00
CeShine Lee	8672bcda1f	Adafactor: avoid updating group["lr"] attributes (#9751 ) This affects Adafactor with relative_step=False and scale_parameter=True. Updating group["lr"] makes the result of ._get_lr() depends on the previous call, i.e., on the scale of other parameters. This isn't supposed to happen.	2021-02-01 08:07:33 -05:00
Sylvain Gugger	490b39e614	Seq2seq trainer (#9241 ) * Add label smoothing in Trainer * Add options for scheduler and Adafactor in Trainer * Put Seq2SeqTrainer in the main lib * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments and adapt scripts * Documentation * Move test not using script to tests folder Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-22 11:33:44 -05:00
Sylvain Gugger	08f534d2da	Doc styling (#8067 ) * Important files * Styling them all * Revert "Styling them all" This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e. * Syling them for realsies * Fix syntax error * Fix benchmark_utils * More fixes * Fix modeling auto and script * Remove new line * Fixes * More fixes * Fix more files * Style * Add FSMT * More fixes * More fixes * More fixes * More fixes * Fixes * More fixes * More fixes * Last fixes * Make sphinx happy	2020-10-26 18:26:02 -04:00
Sylvain Gugger	d2f9cb838e	Fix in Adafactor docstrings (#6845 )	2020-08-31 10:52:47 -04:00
Lysandre Debut	41aa2b4ef1	Adafactor docs (#6765 )	2020-08-27 05:16:50 -04:00
Nikolai Yakovenko	971d1802d0	Add AdaFactor optimizer from fairseq (#6722 ) * AdaFactor optimizer ported from fairseq. Tested for T5 finetuning and MLM -- reduced memory consumption compared to ADAM. * update PR fixes, add basic test * bug -- incorrect params in test * bugfix -- import Adafactor into test * bugfix -- removed accidental T5 include * resetting T5 to master * bugfix -- include Adafactor in __init__ * longer loop for adafactor test * remove double error class declare * lint * black * isort * Update src/transformers/optimization.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * single docstring * Cleanup docstring Co-authored-by: Nikolai Y <nikolai.yakovenko@point72.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-27 04:58:13 -04:00
Lysandre Debut	77abd1e79f	Centralize logging (#6434 ) * Logging * Style * hf_logging > utils.logging * Address @thomwolf's comments * Update test * Update src/transformers/benchmark/benchmark_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Revert bad change Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-26 11:10:36 -04:00
Stas Bekman	39c3b1d9de	[sched] polynomial_decay_schedule use default power=1.0 (#6473 )	2020-08-17 08:33:12 -04:00
Stas Bekman	ece0903e11	lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361 ) * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * [model_cards] electra-base-turkish-cased-ner (#6350) * for electra-base-turkish-cased-ner * Add metadata Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Temporarily de-activate TPU CI * Update modeling_tf_utils.py (#6372) fix typo: ckeckpoint->checkpoint * the test now works again (#6371) * correct pl link in readme (#6364) * refactor almost identical tests (#6339) * refactor almost identical tests * important to add a clear assert error message * make the assert error even more descriptive than the original bt * Small docfile fixes (#6328) * Patch models (#6326) * TFAlbertFor{TokenClassification, MultipleChoice} * Patch models * BERT and TF BERT info s * Update check_repo * Ci GitHub caching (#6382) * Cache Github Actions CI * Remove useless file * Colab button (#6389) * Add colab button * Add colab link for tutorials * Fix links for open in colab (#6391) * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove dup (leftover from merge) * convert the test into the new refactored format * stick to using the current_step as is, without ++ Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Alexander Measure <ameasure@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-11 17:56:41 -04:00
Ninnart Fuengfusin	24c5a6e351	Update optimization.py (#6261 )	2020-08-05 07:34:57 -04:00
Sylvain Gugger	4ade7491f4	Fix examples titles and optimization doc page (#5408 )	2020-07-01 08:11:25 -04:00
Cola	eacea530c1	🚨 Remove warning of deprecation (#4477 ) Remove warning of deprecated overload of addcdiv_ Fix #4451	2020-05-20 16:48:29 -04:00
Julien Chaumond	3e0f062106	Fix addcmul_	2020-05-15 17:44:17 -04:00
Julien Chaumond	fc2a4c88ce	Fix: one more try	2020-05-15 17:38:48 -04:00
Julien Chaumond	55bda52555	Same fix for `addcmul_`	2020-05-15 17:23:48 -04:00
Julien Chaumond	ad02c961c6	Fix UserWarning: This overload of add_ is deprecated in pytorch==1.5.0	2020-05-15 17:09:11 -04:00
Julien Chaumond	83a41d39b3	💄 super	2020-01-15 18:33:50 -05:00
alberduris	81d6841b4b	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
alberduris	dd4df80f0b	Moved the encoded_prompts to correct device	2020-01-06 15:11:12 +01:00
Aymeric Augustin	6be7cdda66	Move source code inside a src subdirectory. This prevents transformers from being importable simply because the CWD is the root of the git repository, while not being importable from other directories. That led to inconsistent behavior, especially in examples. Once you fetch this commit, in your dev environment, you must run: $ pip uninstall transformers $ pip install -e .	2019-12-22 14:15:13 +01:00

37 Commits