Refactoring Trainer, adds save_only_model arg and simplifying FSDP integration (#27652)

* add code changes

1. Refactor FSDP
2. Add `--save_only_model` option: When checkpointing, whether to only save the model, or also the optimizer, scheduler & rng state.
3. Bump up the minimum `accelerate` version to `0.21.0`

* quality

* fix quality?

* Revert "fix quality?"

This reverts commit 149330a6abc078827be274db84c8a2d26a76eba1.

* fix fsdp doc strings

* fix quality

* Update src/transformers/training_args.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* please fix the quality issue 😅

* Apply suggestions from code review

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* address comment

* simplify conditional check as per the comment

* update documentation

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
This commit is contained in:
Sourab Mangrulkar
2023-11-24 11:40:52 +05:30
committed by GitHub
parent b8db265bc6
commit a761d6e9a0
8 changed files with 163 additions and 173 deletions

View File

@@ -96,7 +96,7 @@ if stale_egg_info.exists():
# 2. once modified, run: `make deps_table_update` to update src/transformers/dependency_versions_table.py
_deps = [
"Pillow>=10.0.1,<=15.0",
"accelerate>=0.20.3",
"accelerate>=0.21.0",
"av==9.2.0", # Latest version of PyAV (10.0.0) has issues with audio stream.
"beautifulsoup4",
"codecarbon==1.2.0",