* feat: add support for sdpa and gradient checkpointing
* fix: ruff format
* fix: config sdpa
* fix: sdpa layer naming convention
* fix: update test_eager_matches_sdpa_inference to handle vision_hidden_states
* test: skip incompatible tests and fix loading issue with sdpa
- Updated tests to skip cases flash and dynamic compile.
- Minor adjustment to ensure correct loading of model with sdpa for dispatch test.
* style: apply Ruff formatting
* ruff fix again after rebase
* [run-slow] sam
* [run-slow] sam
* refactor: Address review comments and improve sub-config handling in SAM model tests
- Added attributes for sub_configs as per PR #34410.
- Enabled tests for configs, ensuring the composite model (SAM) has several sub-configs in the main config.
- Added class attribute _is_composite=True to the tester class
- test_sdpa_can_dispatch_composite_models added
* [run-slow] sam
* style: ruff
* [run-slow] sam
* style: ruff again ...
* [run-slow] sam
* feat: multidevice for resnet
* feat: yes! resnet
* fix: compare all elements in tuple
* feat: support for regnet
* feat: support for convnextv2
* feat: support for bit
* feat: support for cvt
* feat: add support for focalnet
* feat: support for yolos
* feat: support for glpn
* feat: support for imagegpt
* feat: support for levit
* feat: support for mgp_str
* feat: support for mobilnet_v1
* feat: support for mobilnet_v2
* feat: support for mobilevit
* feat: support for mobilevitv2
* feat: support for poolformer
* fix: copies
* fix: code quality check
* update: upstream changes from main
* fix: consistency check
* feat: support for sam
* feat: support for switchformer
* feat: support for swin
* feat: support for swinv2
* feat: support for timesformer
* feat: suport for trocr
* feat: support for upernet
* fix: check copies
* update: rerun CI
* update: rerun again, maybe
* update: one more rerun
---------
Co-authored-by: Jacky Lee <jackylee328@gmail.com>
* Changed type hinting for all attention inputs to 'Optional[Tuple[torch.FloatTensor,...]] = None'
* Fixed the ruff formatting issue
* fixed type hinting for all hidden_states to 'Optional[Tuple[torch.FloatTensor, ...]] = None'
* Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py
* test fail update
* fixed type hinting for these 15 scripts modeling_xlnet.py,modeling_tf_xlnet.py,modeling_led.py,modeling_tf_led.py,modleing_rwkv.py,modeling_dpt.py,modeling_tf_cvt.py,modeling_clip.py,modeling_flax_clip.py,modeling_tf_clip.py,modeling_longformer.py,modeling_tf_longformer.py,modeling_siglip.py,modeling_clap.py,modeling_git.py
* Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py
* test fail update
* Removed the myvenv file
* Fixed type hinting for these 8 scripts modeling_tvlt.py,modeling_sam.py,modeling_tf_sam.py,modeling_tvp.py,modeling_rag.py,modeling_tf_rag.py,modeling_tf_xlm.py,modeling_xlm.py
* fix
* more fixes
* fix other models
* fix long t5
* use `gradient_checkpointing_func` instead
* fix copies
* set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* replace it with `is_gradient_checkpointing_set`
* remove default
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add type hints for MGP STR model
* Add missing type hints for plbart model
* Add type hints for Pix2struct model
* Add missing type hints to Rag model and tweak the docstring
* Add missing type hints to Sam model
* Add missing type hints to Swin2sr model
* Fix a type hint for Pix2StructTextModel
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Fix typo on Rag model docstring
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Fix linter
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Preliminary work on some models
* Fix test load missing and make sure nonpersistent buffers are tested
* Always ignore nonpersistent buffers if in state_dict
* Treat models
* More models
* Treat remaining models
* Fix quality
* Fix tests
* Remove draft
* This test is not needed anymore
* Fix copies
* Fix last test
* Newly added models
* Fix last tests
* Address review comments
* First test
* Add info for all models
* style
* Repo consistency
* Fix last model and cleanup prints
* Repo consistency
* Use consistent function for detecting tied weights
* First commit
* Add auto-translation with GPT-4
* make fixup
* Add a functional layernorm for TF
* Add all the auxiliary imports etc.
* Add the extra processor and tests
* rebase to main
* Add all the needed fixes to the GPT code
* make fixup
* Make convolutions channels-last so they run on CPU
* make fixup
* Fix final issues
* Fix other models affected by test change
* Clarify comment on the sparse_prompt_embeddings check
* Refactor functional_layernorm, use shape_list in place of .shape in some places
* Remove deprecated torch-alike code
* Update tests/models/sam/test_modeling_tf_sam.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/sam/test_modeling_tf_sam.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Refactor processor with common methods and separated private methods
* make fixup
* Quietly delete the file that didn't do anything (sorry Sylvain)
* Refactor the processor tests into one file
* make fixup
* Clean up some unnecessary indirection
* Fix TF mask postprocessing
* Add more processor equivalence tests
* Refactor generate_crop_boxes to use framework-neutral np code
* Make the serving output correctly conditional
* Fix error message line length
* Use dict keys rather than indices internally in both TF and PT SAM call/forward
* Return dicts internally in the call/forward methods
* Revert changes to common tests and just override check_pt_tf_outputs
* Revert changes to other model tests
* Clarify comments for functional layernorm
* Add missing transpose from PT code
* Removed unused copied from in PT code
* Remove overrides for tests that don't exist in TF
* Fix transpose and update tests for PT and TF to check pred_masks
* Add training flag
* Update tests to use TF checkpoints
* Update index.mdx
* Add missing cross-test decorator
* Remove optional extra asterisks
* Revert return_dict changes in PT code
* Update src/transformers/models/sam/modeling_tf_sam.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Remove None return annotations on init methods
* Update tests/models/sam/test_processor_sam.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fix input_boxes shapes
* make fixup
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>