[trainer] add tf32-mode control (#14606)
* [trainer] add --tf32 support * it's pt>=.17 * it's pt>=.17 * flip the default to True * add experimental note * simplify logic * style * switch to 3-state logic * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * re-style code Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
This commit is contained in:
@@ -358,8 +358,13 @@ Like all cases with reduced precision this may or may not be satisfactory for yo
|
||||
|
||||
If you're already using fp16 or bf16 mixed precision it may help with the throughput as well.
|
||||
|
||||
You can enable this mode in the 🤗 Trainer with `--tf32`, or disable it with `--tf32 0` or `--no_tf32`.
|
||||
By default the PyTorch default is used.
|
||||
|
||||
Note: tf32 mode is internal to CUDA and can't be accessed directly via `tensor.to(dtype=torch.tf32)` as `torch.tf32` doesn't exit.
|
||||
|
||||
Note: you need `torch>=1.7` to enjoy this feature.
|
||||
|
||||
|
||||
### Gradient Checkpointing
|
||||
|
||||
|
||||
Reference in New Issue
Block a user