Raushan Turganbay
8d6259b0b8
[refactor] set attention implementation (#38974)
* update
* fix some tests
* init from config, changes it in-place, add deepcopy in tests
* fix modernbert
* don't delete this config attr
* update
* style and copies
* skip tests in generation
* fix style
* accidentally removed flash-attn-3, revert
* docs
* forgot about flags set to False
* fix copies
* address a few comments
* fix copies
* custom code BC
2025-07-15 09:34:06 +02:00