[WIP] Enable reproducibility for distributed trainings (#16907)
* add seed worker and set_deterministic_seed_for_cuda function to enforce reproducability * change function name to enable determinism, add docstrings, reproducability support for tf * change function name to enable_determinism_for_distributed_training * revert changes in set_seed and call set_seed within enable_full_determinism * add one position argument for seed_worker function * add full_determinism flag in training args and call enable_full_determinism when it is true * add enable_full_determinism to documentation * apply make fixup after the last commit * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
This commit is contained in:
committed by
GitHub
parent
5229744b26
commit
c33f6046c3
@@ -22,6 +22,8 @@ Most of those are only useful if you are studying the code of the Trainer in the
|
||||
|
||||
[[autodoc]] IntervalStrategy
|
||||
|
||||
[[autodoc]] enable_full_determinism
|
||||
|
||||
[[autodoc]] set_seed
|
||||
|
||||
[[autodoc]] torch_distributed_zero_first
|
||||
|
||||
Reference in New Issue
Block a user