Cyril Vallez
4ded9a4113
🚨🚨 Fix and simplify attention implementation dispatch and subconfigs handling (#39423)
* first try
* Update modeling_utils.py
* Update modeling_utils.py
* big refactor
* Update modeling_utils.py
* style
* docstrings and simplify inner workings of configs
* remove all trace of _internal
* Update modeling_utils.py
* fix logic error
* Update modeling_utils.py
* recursive on config
* Update configuration_utils.py
* fix
* Update configuration_dpt.py
* Update configuration_utils.py
* Update configuration_utils.py
* Update modeling_idefics.py
* Update modeling_utils.py
* fix for old models
* more old models fixup
* Update modeling_utils.py
* Update configuration_utils.py
* Remove outdated test
* remove the deepcopy!! 🥵🥵
* Update test_modeling_gpt_bigcode.py
* fix qwen dispatch
* restrict to only models supporting it
* style
* switch name
* Update modeling_utils.py
* Update modeling_utils.py
* add tests!
* fix
* typo
* remove bad copies
* fix
* Update modeling_utils.py
* additional check
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* fix
* skip
2025-07-18 13:41:54 +02:00