Refactor Attention implementation for ViT-based models (#36545)
* Refactor vit attention
* Refactor ViT-based models
* 🚨🚨🚨 Fix prefix for DPT
* Update params order
* trigger tests
* Fix Dinov2 attention
* Fix DPT attention impl propagation for backbone config
* Common test fix: config is modif. inplace - avoid it
* view->reshape
* Fixup
* Fixup
* Enable IJepa FA2
* Add FA2 in corresponding model docs
Committed by GitHub
Parent: 730d2a52e7
Commit: 66291778dd
@@ -2098,7 +2098,9 @@ class PreTrainedModel(nn.Module, ModuleUtilsMixin, GenerationMixin, PushToHubMix
                     if not isinstance(requested_attn_implementation, dict)
                     else requested_attn_implementation.get(key, None)
                 )
-                sub_config._attn_implementation_internal = curr_attn_implementation
+                # For models with backbone sub-config might be not initialized
+                if sub_config is not None:
+                    sub_config._attn_implementation_internal = curr_attn_implementation
 
         if use_flash_attention_2:
             logger.warning_once(
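
The guard added in this hunk can be illustrated outside the library with a small, self-contained sketch. Everything below is an assumption made for illustration: the helper name `propagate_attn_implementation`, the `SimpleNamespace` stand-ins for config objects, and the loop over `sub_configs` (the hunk shows only the loop body, not the loop itself). It is not the actual `PreTrainedModel` code; it only mirrors the logic visible in the diff, where a sub-config that is `None` (for example, an uninitialized backbone config) is skipped instead of raising `AttributeError`.

```python
from types import SimpleNamespace


def propagate_attn_implementation(config, requested_attn_implementation):
    """Hypothetical helper mirroring the logic in the hunk above.

    Copies the requested attention implementation onto every sub-config,
    skipping sub-configs that have not been initialized (i.e. are None).
    """
    for key in config.sub_configs:
        sub_config = getattr(config, key, None)
        # A plain string applies to every sub-config; a dict is looked up per key.
        curr_attn_implementation = (
            requested_attn_implementation
            if not isinstance(requested_attn_implementation, dict)
            else requested_attn_implementation.get(key, None)
        )
        # Without this check, a None sub-config would raise AttributeError.
        if sub_config is not None:
            sub_config._attn_implementation_internal = curr_attn_implementation


# Stand-in config objects (assumed shapes, not real library classes).
config = SimpleNamespace(
    sub_configs=["backbone_config", "text_config"],
    backbone_config=None,            # uninitialized backbone sub-config
    text_config=SimpleNamespace(),   # initialized sub-config
)

propagate_attn_implementation(config, "sdpa")
```

After the call, `config.text_config._attn_implementation_internal` holds `"sdpa"`, while `config.backbone_config` stays `None` and no exception is raised. Passing a dict such as `{"text_config": "eager"}` instead of a string selects the implementation per sub-config key.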