Sanchit Gandhi
a9701953ff
[whisper] static kv cache (#31166)
* make work with cache abstraction
* correct for static cache
* hacks for compile
* make fast
* fix
* fix pos ids
* generate
* fix sdpa
* fix sdpa cache pos
* fix fa2
* clean fa2
* integrate cache into generate
* make style
* copies
* more copies
* update eager
* update sdpa
* update fa2
* simplify
* use cache pos
* always compute cross-cache for debug
* avoid recompiles
Co-authored-by: Arthur Zucker <arthur@huggingface.co>
* fix fix
* fix fix fix
* more fix
* try encoder-decoder cache (too messy)
* revert encoder-decoder cache
* check cross-attn cache
* use enc-dec dataclass
* use richer enc-dec dataclass
* clean-up
* revert static cache changes
* small fixes
* revert to cpu flag
* fix copies
* add static slow test
* past k/v docstring
* more docstrings
* cache_position docstrings
* add to docs
* add enc-dec cache to docs
* make style
* fix after rebase
* fix beam
* style
* fix generation strategies
* fix most decoder-only tests
* style
* skip test
* more clean up
* small docstrings
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add todo
* only crop self-attn
* check cache in mixin
* style
* fix re-compile after rebase
* move `is_updated` logic to enc-dec wrapper
* revert back
* revert cache back
* finalise design
* fix
* fix fix
* style
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* deprecate
* updates
* final updates
* style
* style
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-07-02 13:24:15 +01:00