Anton Vlasjuk
b275a41005
[GPT2] Add SDPA support (#31172)
* `gpt2` sdpa support
* fix (at least) one test, style, repo consistency
* fix sdpa mask in forward --> fixes generation
* test
* test2
* test3
* test4
* simplify shapes for attn mask creation and small comments
* hub fail test
* benchmarks
* flash attn 2 mask should not be inverted on enc-dec setup
* fix comment
* apply some suggestions from code review
- only save _attn_implementation once
- remove unnecessary comment
* change elif logic
* [run-slow] gpt2
* modify `test_gpt2_sample_max_time` to follow previous assertion patterns
2024-06-19 09:40:57 +02:00