Stas Bekman
580dd87c55
[Deepspeed] add support for bf16 mode (#14569)
* [WIP] add support for bf16 mode
* prep for bf16
* prep for bf16
* fix; zero2/bf16 is ok
* check bf16 is available
* test fixes
* enable zero3_bf16
* config files
* docs
* split stage_dtype; merge back to non-dtype-specific config file
* fix doc
* cleanup
* cleanup
* bfloat16 => bf16 to match the PR changes
* s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
* test fixes/skipping
* move
* fix
* Update docs/source/main_classes/deepspeed.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* backticks
* cleanup
* cleanup
* cleanup
* new version
* add note about grad accum in bf16
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-11 17:53:53 -08:00