HuggingFace_transformer/docs/source
Nikos Karampatziakis ca59d6f77c Offloaded KV Cache (#31325)
* Initial implementation of OffloadedCache

* enable usage via cache_implementation

* Address feedback, add tests, remove legacy methods.

* Remove flash-attn, discover synchronization bugs, fix bugs

* Prevent usage in CPU-only mode

* Add a section about offloaded KV cache to the docs

* Fix typos in docs

* Clarifications and better explanation of streams
2024-08-01 14:42:07 +02:00