New features for CodeParrot training script (#16851)

* add tflops logging and fix grad accumulation

* add accelerate tracking and checkpointing

* scale loss of last batch correctly

* fix typo

* compress loss computation

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add resume from checkpoint argument

* add load_state accelerate from checkpoint, register lr scheduler and add tflops function

* reformat code

* reformat code

* add condition on path for resume checkpoint

* combine if conditions

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add source for tflops formula

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
This commit is contained in:
Loubna Ben Allal
2022-04-21 18:43:46 +02:00
committed by GitHub
parent eef2422e96
commit d91841315a
3 changed files with 69 additions and 19 deletions
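
The commit message mentions TFLOPs logging and a source for the formula, but the formula itself does not appear in the hunk shown here. As a rough sketch only, assuming the estimate is the widely used one from the Megatron-LM scaling paper (Narayanan et al., 2021, Equation 3), throughput could be computed as follows. The function name `estimate_tflops` and all of its parameters are hypothetical and are not taken from the script.

```python
def estimate_tflops(
    elapsed_time: float,
    batch_size: int,
    seq_length: int,
    n_layer: int,
    n_embd: int,
    vocab_size: int,
    num_processes: int = 1,
    gradient_checkpointing: bool = False,
) -> float:
    """Estimate per-process TFLOPs for one optimizer step of a GPT-style model.

    Hypothetical helper: `batch_size` is assumed to be the global number of
    sequences per optimizer step (per-device batch * processes * accumulation
    steps) and `elapsed_time` the wall-clock seconds that step took.
    """
    # Forward + backward costs about 3x a forward pass; gradient
    # checkpointing recomputes the forward pass, giving about 4x.
    checkpoint_factor = 4 if gradient_checkpointing else 3
    factor = 24 * checkpoint_factor * batch_size * seq_length * n_layer * n_embd**2
    flops_per_iteration = factor * (
        1.0
        + seq_length / (6.0 * n_embd)  # attention term
        + vocab_size / (16.0 * n_layer * n_embd)  # output projection term
    )
    return flops_per_iteration / (elapsed_time * num_processes * 1e12)
```

With gradient checkpointing enabled the leading constant becomes 96, which matches the form of the equation in the paper; without it the same expression uses 72.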


@@ -49,6 +49,10 @@ class TrainingArguments:
        default=1024,
        metadata={"help": "Interval to save checkpoints. Measured as number of forward passes not training steps."},
    )
    resume_from_checkpoint: Optional[str] = field(
        default=None,
        metadata={"help": "States path if the training should continue from a checkpoint folder."},
    )
@dataclass
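
The hunk above only adds the `resume_from_checkpoint` field; the code that consumes it lives in the training script and is not shown here. A minimal sketch of how such an argument might be validated before resuming, assuming a path check like the "condition on path" the commit message mentions, is given below. `ResumeArguments` and `resolve_checkpoint` are hypothetical names used for illustration only.

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResumeArguments:
    """Hypothetical stand-in for the relevant part of TrainingArguments."""

    resume_from_checkpoint: Optional[str] = field(
        default=None,
        metadata={"help": "States path if the training should continue from a checkpoint folder."},
    )


def resolve_checkpoint(args: ResumeArguments) -> Optional[str]:
    """Return the checkpoint folder to resume from, or None to start fresh."""
    path = args.resume_from_checkpoint
    if not path:
        # Argument left at its default (or empty): train from scratch.
        return None
    if not os.path.isdir(path):
        # Fail early instead of silently restarting from step 0.
        raise FileNotFoundError(f"Checkpoint folder not found: {path}")
    return path
```

In a script built on Accelerate, the returned folder would typically be handed to `Accelerator.load_state`, which restores the model, optimizer, and any registered objects saved with `Accelerator.save_state`. Whether the script raises or falls back to a fresh run on a missing path is a design choice this sketch does not settle.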