[debug] DebugUnderflowOverflow doesn't work with DP (#12816)

Stas Bekman
2021-07-21 09:36:02 -07:00
committed by GitHub
parent ac3cb660ca
commit cf0755aa6e
3 changed files with 15 additions and 4 deletions

@@ -24,7 +24,11 @@ Underflow and Overflow Detection
.. note::

    For multi-GPU training, this feature requires DDP (``torch.distributed.launch``).

.. note::

    This feature can be used with any ``nn.Module``-based model.

If you start getting ``loss=NaN`` or the model exhibits some other abnormal behavior due to ``inf`` or ``nan`` in
activations or weights, one needs to discover where the first underflow or overflow happens and what led to it. Luckily