Multiple llama4 fixes (#37353)
* update for fixes
* more fixes
* fix dynamic cache?
* style
* fix both training and generating. Eager seems alright
* dynamic does not work
* fix most cases, use_cache or not, eager or not, no default cache (e.g. not training but you want to get cache states)
* should be final fixes
* fix more stuff no cat
* style
* fix
* style
* final style
* quality
* fix
* revert
@@ -244,6 +244,7 @@ SPECIAL_CASES_TO_ALLOW = {
        "output_router_logits",
        "router_aux_loss_coef",
        "router_jitter_noise",
        "cache_implementation",
    ],
    "Llama4VisionConfig": ["multi_modal_projector_bias", "norm_eps"],
}
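For readers unfamiliar with allow-lists of this shape, here is a minimal sketch of how one might be consulted. It assumes the dictionary maps a config class name to attribute names that are exempt from an "unused attribute" check. The key `ExampleTextConfig` and both helper functions are hypothetical: the hunk does not show which entry the first list belongs to, and the repository's actual checking logic is not part of this diff.

```python
# Hypothetical allow-list modeled on the hunk above. "ExampleTextConfig" is a
# placeholder key: the hunk does not show which entry this list belongs to.
SPECIAL_CASES_TO_ALLOW = {
    "ExampleTextConfig": [
        "output_router_logits",
        "router_aux_loss_coef",
        "router_jitter_noise",
        "cache_implementation",
    ],
    "Llama4VisionConfig": ["multi_modal_projector_bias", "norm_eps"],
}


def is_allowed_unused(config_class_name: str, attribute: str) -> bool:
    """Return True if `attribute` is exempted for `config_class_name`."""
    # Classes without an entry have no exemptions (assumed behavior).
    return attribute in SPECIAL_CASES_TO_ALLOW.get(config_class_name, [])


def unexempted_attributes(config_class_name: str, unused: list[str]) -> list[str]:
    """Filter a list of unused attributes down to those not on the allow-list."""
    return [a for a in unused if not is_allowed_unused(config_class_name, a)]
```

Under these assumptions, adding `"cache_implementation"` to a list means a check of this kind would stop reporting that attribute for the corresponding config class.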