HuggingFace_transformer/src/transformers
Sylvain Gugger b70f441b72 Smp grad accum (#10488)
* Fix gradient accumulation for SM Model Parallelism

* Style and divide loss by grad accum steps
2021-03-03 12:13:29 -05:00
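
The commit message mentions dividing the loss by the number of gradient accumulation steps. The following is a minimal sketch of why that scaling matters. It is not the code changed in this commit. The function names `grad_mse` and `accumulated_grad`, the scalar model y = w * x, and the sample data are all assumptions made for illustration. With equal-sized micro-batches, scaling each micro-batch loss by 1 / accum_steps makes the accumulated gradient match the gradient of the mean loss over the full batch.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, accum_steps):
    """Accumulate gradients over `accum_steps` equal-sized micro-batches.

    Dividing each micro-batch loss by `accum_steps` scales its gradient by
    the same factor, so the running total is an average, not a sum.
    Assumes len(xs) is divisible by accum_steps.
    """
    micro = len(xs) // accum_steps
    total = 0.0
    for step in range(accum_steps):
        lo, hi = step * micro, (step + 1) * micro
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / accum_steps
    return total


# Hypothetical data: four samples split into two micro-batches of two.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
full = grad_mse(0.5, xs, ys)
accum = accumulated_grad(0.5, xs, ys, accum_steps=2)
```

Without the division, the accumulated gradient would be `accum_steps` times larger than the full-batch gradient, which effectively multiplies the learning rate by the same factor.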