Anton Vlasjuk
d95c864a25
🔴 🔴 🔴 [Attention] Refactor Attention Interface for Bart-based Models (#38108)
* starting attn refactor for encoder decoder models via bart (eager + sdpa)
* flash attention works, remove unnecessary code
* flex attention support for bart!, gotta check if the renaming is not too aggressive
* some comments
* skip flex grad test for standalone as done with the other test
* revert flex attn rename (for now), sdpa simplify, and todos
* more todos
* refactor mask creation for reuse
* modular attempt at biogpt
* first batch of other models
* fix attn dropout
* fix autoformer copies
* hubert
* another batch of models
* copies/style + last round of bart models --> whisper next?
* remove unnecessary _reshape function and remove copy to whisper
* add skip for decoder-only models out of enc-dec (same as in bart)
* bring back licences
* remove comment, added to pr read instead
* mostly docs
* disable sew flex attn as it's unclear attn mask for now
* oops
* test fixes for enc-dec
* torch fx fixes + try at flex attn
* skip on mbart
* some more fixes
* musicgen skip / delete old attn class logic + sdpa compose compile skip
* disable flex attn for musicgen, not worth the effort
* more fixes and style
* flex attention test for dropout and encoder decoder that don't have main input names
* informer fixes
* the weirdest thing I've encountered yet...
* style
* remove empty tensor attempt, found root cause in previous commits
* disable time series due to tests being very text centric on inputs
* add speech to text to be ignoring the other attns, also due to tests
* update docs
* remaining issues resolved ?
* update docs for current state --> nllb moe and pegasus x sdpa is questionable :D
* some models have not set the is_causal flag...
* change dtype in softmax to old behaviour + some modular fixes
* I hate it but it is what it is
* fixes from main for bart
* forgot this one
* some model fixes
* style
* current status
* marian works now
* fixing some copies
* some copy fixes + time series x informer
* last models possibly and fixes on style/copies
* some post merge fixes
* more fixes
* make attention interface callable and move warnings there
* style lol
* add comment to "unsupported"
* remove callable interface and change interface warnings + some copies
* fix
* ternary is ugly af, make it simpler
* how did that happen
* fix flex attn test
* failing the test
* no more fallback! fixing copies next
* style + attn fixed
* fixing copies and mask creation
* wrong copy
* fixup tests and disable flex attn for now
* fixup last tests?
2025-05-22 17:12:58 +02:00