Stas Bekman
2df34f4aba
[trainer] deepspeed integration (#9211)
* deepspeed integration
* style
* add test
* ds wants to do its own backward
* fp16 assert
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
* for clarity extract what args are being passed to deepspeed
* introduce the concept of self.wrapped_model
* s/self.wrapped_model/self.model_wrapped/
* complete transition to self.wrapped_model / self.model
* fix
* doc
* give ds its own init
* add custom overrides, handle bs correctly
* fix test
* clean up model_init logic, fix small bug
* complete fix
* collapse --deepspeed_config into --deepspeed
* style
* start adding doc notes
* style
* implement hf2ds optimizer and scheduler configuration remapping
* oops
* call get_num_training_steps absolutely when needed
* workaround broken auto-formatter
* deepspeed_config arg is no longer needed - fixed in deepspeed master
* use hf's fp16 args in config
* clean
* start on the docs
* rebase cleanup
* finish up --fp16
* clarify the supported stages
* big refactor thanks to discovering deepspeed.init_distributed
* cleanup
* revert fp16 part
* add checkpoint-support
* move init ds into integrations
* extend docs
* cleanup
* unfix docs
* clean up old code
* imports
* move docs
* fix logic
* make it clear which file it's referring to
* document nodes/gpus
* style
* wrong format
* style
* deepspeed handles gradient clipping
* easier to read
* major doc rewrite
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* docs
* switch to AdamW optimizer
* style
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* clarify doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 19:05:18 -08:00