Raushan Turganbay
a30c865f99
Cache: new Cache format in decoder-only models (#31421)
* draft bart with new cache
* add cache for decoder-only models
* revert utils
* modify docstring
* revert bart
* minor fixes
* fix copies (not related)
* revert tests
* remove enc-dec related code
* remove bloom
* remove opt (enc-dec)
* update docstring
* git, codegen, gpt_neo, gpt_neox, gptj
* clean up
* copied from statements
* revert
* tmp
* update warning msg
* forgot git
* add more flags
* run-slow git,codegen,gpt_neo,gpt_neox,gptj
* add cache flag to VLMs
* remove files
* style
* video LLMs also need a flag
* style
* llava will go in another PR
* style
* [run-slow] codegen, falcon, git, gpt_neo, gpt_neox, gptj, idefics
* Update src/transformers/models/gpt_neo/modeling_gpt_neo.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* copy from
* deprecate until v4.45 and warn if not training
* nit
* fix test
* test static cache
* add more tests and fix models
* fix copies
* return sliding window mask
* run slow tests & fix + codestyle
* one more falcon fix for alibi
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-07 10:02:16 +05:00