HuggingFace_transformer/tests
efsotr 3ee72af6b6 Fix graph break in torch.compile when using FA2 with attention_mask=None and batch size > 1 (#37332)
* Fix graph break in torch.compile when using FA2 with attention_mask=None and batch size > 1

* fix code format

* add test; replace position_ids with query_states because position_ids.shape[0] is always 1

* add assert loss is not nan
2025-06-25 07:58:34 +00:00