HuggingFace_transformer

Files

Zach Mueller d9f733625c Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)

* Enable grad accum fix across all models + trainer fully in forward()

* handle peft case

* Account for DDP: need to run scale tests

* Use accelerator state

* Quality

* Guard

* Experiment w/ only fairseq fix

* Fairseq only

* Revert multiply_grads fix

* Mult by grad accum to fully bring back solution

* Style

* Good to go now

* Skip fx tests for now

* Bookmark

* Working now

2024-10-23 11:24:57 -04:00

__init__.py

Cohere Model Release (#29622)

2024-03-15 14:29:11 +01:00

test_modeling_cohere.py

Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)

2024-10-23 11:24:57 -04:00

test_tokenization_cohere.py

Skip tests properly (#31308)

2024-06-26 21:59:08 +01:00