HuggingFace_transformer/tests/utils/test_modeling_utils.py at e55983e2b90e10f5f82d5ebbd5a6735f002dbebb

Peter St. John bab40c6838 [core] support tensor-valued _extra_state values in from_pretrained (#38155)

Support tensor-valued _extra_state values

TransformerEngine uses the PyTorch get_extra_state/set_extra_state API to
store FP8 layer config information as a bytes tensor in the _extra_state
entry of the state dict. Recent changes to from_pretrained broke this
functionality, and loading a model that uses this API does not appear to
work. This PR fixes the save/load pretrained functions for extra state
entries that use a PyTorch tensor, and adds a (currently x-failing) test
for a dictionary extra state.

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
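
For readers unfamiliar with the extra-state hooks mentioned above, here is a minimal sketch of a module that stores a small config as a uint8 tensor under the _extra_state key of its state dict. The class name LayerWithExtraState, the config field fp8_enabled, and the JSON encoding are all hypothetical choices made for illustration; this is not TransformerEngine's actual serialization format, and it exercises only plain state_dict/load_state_dict, not from_pretrained.

```python
import json

import torch
from torch import nn


class LayerWithExtraState(nn.Module):
    """Hypothetical layer that keeps a small config outside its parameters."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        # Illustrative config only; real FP8 metadata would look different.
        self.config = {"fp8_enabled": False}

    def get_extra_state(self):
        # Called by state_dict(); the return value is stored under "_extra_state".
        payload = json.dumps(self.config).encode("utf-8")
        # bytearray gives torch.frombuffer a writable buffer; clone() detaches
        # the tensor from that temporary buffer.
        return torch.frombuffer(bytearray(payload), dtype=torch.uint8).clone()

    def set_extra_state(self, state):
        # Called by load_state_dict() with whatever get_extra_state() returned.
        self.config = json.loads(bytes(state.tolist()).decode("utf-8"))


src = LayerWithExtraState()
src.config = {"fp8_enabled": True}
state = src.state_dict()

dst = LayerWithExtraState()
dst.load_state_dict(state)
```

After the round trip, dst.config matches src.config, and the state dict carries the config as a tensor-valued entry next to linear.weight and linear.bias. A tensor-valued entry like this is the kind of extra state the commit message says the save/load path needs to handle.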

2025-05-28 15:38:42 +02:00

131 KiB
