Sanchit Gandhi
a9701953ff
[whisper] static kv cache (#31166)
* make work with cache abstraction
* correct for static cache
* hacks for compile
* make fast
* fix
* fix pos ids
* generate
* fix sdpa
* fix sdpa cache pos
* fix fa2
* clean fa2
* integrate cache into generate
* make style
* copies
* more copies
* update eager
* update sdpa
* update fa2
* simplify
* use cache pos
* always compute cross-cache for debug
* avoid recompiles
Co-authored-by: Arthur Zucker <arthur@huggingface.co>
* fix fix
* fix fix fix
* more fix
* try encoder-decoder cache (too messy)
* revert encoder-decoder cache
* check cross-attn cache
* use enc-dec dataclass
* use richer enc-dec dataclass
* clean-up
* revert static cache changes
* small fixes
* revert to cpu flag
* fix copies
* add static slow test
* past k/v docstring
* more docstrings
* cache_position docstrings
* add to docs
* add enc-dec cache to docs
* make style
* fix after rebase
* fix beam
* style
* fix generation strategies
* fix most decoder-only tests
* style
* skip test
* more clean up
* small docstrings
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add todo
* only crop self-attn
* check cache in mixin
* style
* fix re-compile after rebase
* move `is_updated` logic to enc-dec wrapper
* revert back
* revert cache back
* finalise design
* fix
* fix fix
* style
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* deprecate
* updates
* final updates
* style
* style
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-07-02 13:24:15 +01:00