HuggingFace_transformer/docs/source/en/model_doc
Younes Belkada ae9a344cce [Mistral] Add Flash Attention-2 support for mistral (#26464)
* add FA-2 support for mistral

* fixup

* add sliding windows

* fixing few nits

* v1 slicing cache - logits do not match

* add comment

* fix bugs

* more mem efficient

* add warning once

* add warning once

* oops

* fixup

* more comments

* copy

* add safety checker

* fixup

* Update src/transformers/models/mistral/modeling_mistral.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* copied from

* up

* raise when padding side is right

* fixup

* add doc + few minor changes

* fixup

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-03 13:44:46 +02:00