Arthur
2c47618c1a
🚨All attention refactor🚨 (#35235)
* refactor LlamaAttention
* minimal changes
* fix llama
* update
* modular gemmas
* modular nits
* modular updates
* nits
* simplify
* gpt2
* more modular and fixes
* granite
* modular modular modular
* nits
* update
* qwen2 + starcoder2
* mostly gemma2
* Update image_processing_auto.py
* fix
* Update modular_starcoder2.py
* fix
* remove all copied from attentions
* remove gcv
* make fix-copies
* oups
* oups2.0
* fix some modulars + all copied from
* should be good now
* revert unwanted changes
* Update modeling_decision_transformer.py
* finish cleanup
* Update modeling_olmo.py
* consistency
* re-add gradient checkpointing attribute
* fix
* style
* make config necessary
* bis
* bis
* Update modeling_my_new_model2.py
* is_causal attr
* fix
* remove past kv return from decoder layer
* fix
* default rope config
* correctly fix rope config
* fix bias
* fix gpt2 attention output
* fix test
* fix inits
* fix default sdpa
* fix default sdpa implementation
* harmonize classes
* fix mistral
* fix sliding window models
* mixtral
* be more explicit
* style
* fix
* several fixes
* Update modeling_dbrx.py
* fix test
* olmo + phi
* rotary
* style
* phi
* phi again
* again
* kwargs
* Update test_modeling_common.py
* skip fx tracing tests
* Update modeling_utils.py
* gemma 2
* again
* Update modeling_recurrent_gemma.py
* gemma2
* granite
* style
* starcoder
* Update sdpa_attention.py
* switch args
* Update modeling_mllama.py
* fix
* cache type tests
* gpt2
* Update test_modeling_common.py
* fix
* consistency
* fix shape with encoder
* should be the last one
* tests non model
* most comments
* small oupsi
* be more explicit in modulars
* more explicit modulars
* CIs! it works locally
* add kwargs to _flash_attention_forward
---------
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2024-12-18 16:53:39 +01:00