Eduardo Pacheco
b752ad3019
Adding grounding dino (#26087)
* Fixed typo when converting weights to GroundingDINO vision backbone
* Final modifications on modeling
* Removed unnecessary class
* Fixed convert structure
* Added image processing
* make fixup partially completed
* Now text_backbone_config has its own class
* Modified convert script
* Removed unnecessary config attribute
* Added new function to generate sub sentence mask
* Renamed parameters with gamma in the name as it's currently not allowed
* Removed tokenization and image_processing scripts since we'll map from existing models
* Fixed some issues with configuration
* Just some modifications on conversion script
* Other modifications
* Copied deformable detr
* First commit
* Added bert to model
* Bert validated
* Created Text and Fusion layers for Encoder
* Adapted Encoder layer
* Fixed typos
* Adjusted Encoder
* Converted encoder to hf
* Modified Decoder Layer
* Modified main decoder class
* Removed copy comments
* Fixed forward from GroundingDINOModel and GroundingDINODecoder
* Added all necessary layers, configurations and forward logic up to GroundingDINOModel
* Added all layers to conversion
* Fixed outputs for GroundingDINOModel and GroundingDINOForObjectDetection
* Fixed mask input to encoders and fixed nn.MultiheadAttention batch first and attn output
* Fixed forward from GroundingDINOTextEnhancerLayer
* Fixed output bug with GroundingDINODeformableLayer
* Fixed bugs that prevented GroundingDINOForObjectDetection from running its forward method
* Fixed attentions to be passed correctly
* Passing temperature arg when creating Sine position embedding
* Removed copy comments
* Added temperature argument for position embedding
* Fixed typo when converting weights to GroundingDINO vision backbone
* Final modifications on modeling
* Removed unnecessary class
* Fixed convert structure
* Added image processing
* make fixup partially completed
* Now text_backbone_config has its own class
* Modified convert script
* Removed unnecessary config attribute
* Added new function to generate sub sentence mask
* Renamed parameters with gamma in the name as it's currently not allowed
* Removed tokenization and image_processing scripts since we'll map from existing models
* Fixed some issues with configuration
* Just some modifications on conversion script
* Other modifications
* Fix style
* Improve fixup
* Improve conversion script
* Improve conversion script
* Add GroundingDINOProcessor
* More improvements
* Return token type ids
* something
* Fix more tests
* More improvements
* More cleanup
* More improvements
* Fixed tests, improved modeling and config
* More improvements and fixing tests
* Improved tests and modeling
* Improved tests and added image processor
* Improved tests inference
* More improvements
* More test improvements
* Fixed last test
* Improved docstrings and comments
* Fix style
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Better naming
* Better naming
* Added Copied statement
* Added Copied statement
* Moved param init from GroundingDINOBiMultiHeadAttention
* Better naming
* Fixing clamp style
* Better naming
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/configuration_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/convert_grounding_dino_to_hf.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Improving conversion script
* Improved config
* Improved naming
* Improved naming again
* Improved grounding-dino.md
* Moved grounding dino to multimodal
* Update src/transformers/models/grounding_dino/convert_grounding_dino_to_hf.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Fixed docstrings and style
* Fix docstrings
* Remove timm attributes
* Reorder imports
* More improvements
* Add Grounding DINO to pipeline
* Remove model from check_repo
* Added grounded post_process to GroundingDINOProcessor
* Fixed style
* Fixed GroundingDINOTextPrenetConfig docstrings
* Aligned inputs.keys() when both image and text are passed with model_input_names
* Added tests for GroundingDINOImageProcessor and GroundingDINOProcessor
* Testing post_process_grounded_object_detection from GroundingDINOProcessor at test_inference_object_detection_head
* Fixed order
* Marked test with require_torch
* Temporarily changed repo_id
* More improvements
* Fix style
* Final improvements
* Improve annotators
* Fix style
* Add is_torch_available
* Remove type hints
* vocab_tokens as one liner
* Removed print statements
* Renamed GroundingDINOTextPrenetConfig to GroundingDINOTextConfig
* remove unnecessary comments
* Removed unnecessary tests on conversion script
* Renamed GroundingDINO to camel case GroundingDino
* Fixed GroundingDinoProcessor docstrings
* loading MSDA kernels in the modeling file
* Fix copies
* Replace nn.multiheadattention
* Replace nn.multiheadattention
* Fixed inputs for GroundingDinoMultiheadAttention & order of modules
* Fixed processing to avoid messing with inputs
* Added more tips for GroundingDino
* Make style
* Changing name to align with SAM
* Replace final nn.multiheadattention
* Fix model tests
* Update year, remove GenerationTesterMixin
* Address comments
* Address more comments
* Rename TextPrenet to TextModel
* Rename hidden_states
* Address more comments
* Address more comments
* Address comment
* Address more comments
* Address merge
* Address comment
* Address comment
* Address comment
* Make style
* Added layer norm eps to layer norms
* Address more comments
* More fixes
* Fixed equivalence
* Make fixup
* Remove print statements
* Address comments
* Address comments
* Address comments
* Address comments
* Address comments
* Address comments
* Add comment
* Address comment
* Remove overwriting of test
* Fix bbox_embed
* Improve decoder_bbox_embed_share
* Simplify outputs
* Updated post_process_grounded_object_detection
* Renamed sources to feature_maps
* Improved tests for Grounding Dino ImageProcessor and Processor
* Fixed test requirements and imports
* Fixed image_processing
* Fixed processor tests
* Fixed imports for image processing tests
* Fix copies
* Updated modeling
* Fix style
* Moved functions to correct position
* Fixed copy issues
* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* Keeping consistency custom cuda kernels for MSDA
* Make GroundingDinoProcessor logic clearer
* Updated Grounding DINO checkpoints
* Changed tests to correct structure
* Updated gpu-cpu equivalence test
* fix copies
* Update src/transformers/models/grounding_dino/processing_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/processing_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/configuration_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fixed errors and style
* Fix copies
* Removed inheritance from PreTrainedModel from GroundingDinoTextModel
* Fixed GroundingDinoTextModel
* Fixed type of default backbone config
* Fixed missing methods for GroundingDinoTextModel and added timm support for GroundingDinoConvEncoder
* Addressed comments
* Addressed batched image processing tests
* Addressed zero shot test comment
* Addressed tip comment
* Removed GroundingDinoTextModel from check_repo
* Removed inplace masking
* Addressed comments
* Addressed comments
* Addressed comments
* Fix copies
* Fixing timm test
* Fixed batching equivalence test
* Update docs/source/en/model_doc/grounding-dino.md
Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com>
* Update docs/source/en/model_doc/grounding-dino.md
Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com>
* Update docs/source/en/model_doc/grounding-dino.md
Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com>
* Addressed more comments
* Added a new comment
* Reduced image size
* Addressed more comments
* Nits
* Nits
* Changed the way text_config is initialized
* Update src/transformers/models/grounding_dino/processing_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com>
2024-04-11 08:32:16 +01:00