Cyril Vallez
4ded9a4113
🚨🚨 Fix and simplify attention implementation dispatch and subconfigs handling (#39423)
* first try
* Update modeling_utils.py
* Update modeling_utils.py
* big refactor
* Update modeling_utils.py
* style
* docstrings and simplify inner workings of configs
* remove all trace of _internal
* Update modeling_utils.py
* fix logic error
* Update modeling_utils.py
* recursive on config
* Update configuration_utils.py
* fix
* Update configuration_dpt.py
* Update configuration_utils.py
* Update configuration_utils.py
* Update modeling_idefics.py
* Update modeling_utils.py
* fix for old models
* more old models fixup
* Update modeling_utils.py
* Update configuration_utils.py
* Remove outdated test
* remove the deepcopy!! 🥵🥵
* Update test_modeling_gpt_bigcode.py
* fix qwen dispatch
* restrict to only models supporting it
* style
* switch name
* Update modeling_utils.py
* Update modeling_utils.py
* add tests!
* fix
* typo
* remove bad copies
* fix
* Update modeling_utils.py
* additional check
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* fix
* skip
2025-07-18 13:41:54 +02:00