Anton Vlasjuk
b275a41005
[GPT2] Add SDPA support (#31172)
* `gpt2` sdpa support
* fix (at least) one test, style, repo consistency
* fix sdpa mask in forward --> fixes generation
* test
* test2
* test3
* test4
* simplify shapes for attn mask creation and small comments
* hub fail test
* benchmarks
* flash attn 2 mask should not be inverted on enc-dec setup
* fix comment
* apply some suggestions from code review
- only save `_attn_implementation` once
- remove unnecessary comment
* change elif logic
* [run-slow] gpt2
* modify `test_gpt2_sample_max_time` to follow previous assertion patterns
2024-06-19 09:40:57 +02:00