Eduardo Pacheco
b752ad3019
Adding grounding dino (#26087)
* Fixed typo when converting weigths to GroundingDINO vision backbone
* Final modifications on modeling
* Removed unnecessary class
* Fixed convert structure
* Added image processing
* make fixup partially completed
* Now text_backbone_config has its own class
* Modified convert script
* Removed unnecessary config attribute
* Added new function to generate sub sentence mask
* Renamed parameters with gamma in the name as it's currently not allowed
* Removed tokenization and image_processing scripts since we'll map from existing models
* Fixed some issues with configuration
* Just some modifications on conversion script
* Other modifications
* Copied deformable detr
* First commit
* Added bert to model
* Bert validated
* Created Text and Fusion layers for Encoder
* Adapted Encoder layer
* Fixed typos
* Adjusted Encoder
* Converted encoder to hf
* Modified Decoder Layer
* Modified main decoder class
* Removed copy comments
* Fixed forward from GroundingDINOModel and GroundingDINODecoder
* Added all necessary layers, configurations and forward logic up to GroundingDINOModel
* Added all layers to convertion
* Fixed outputs for GroundingDINOModel and GroundingDINOForObjectDetection
* Fixed mask input to encoders and fixed nn.MultiheadAttention batch first and attn output
* Fixed forward from GroundingDINOTextEnhancerLayer
* Fixed output bug with GroundingDINODeformableLayer
* Fixed bugs that prevent GroundingDINOForObjectDetection to run forward method
* Fixed attentions to be passed correctly
* Passing temperature arg when creating Sine position embedding
* Removed copy comments
* Added temperature argument for position embedding
* Fixed typo when converting weigths to GroundingDINO vision backbone
* Final modifications on modeling
* Removed unnecessary class
* Fixed convert structure
* Added image processing
* make fixup partially completed
* Now text_backbone_config has its own class
* Modified convert script
* Removed unnecessary config attribute
* Added new function to generate sub sentence mask
* Renamed parameters with gamma in the name as it's currently not allowed
* Removed tokenization and image_processing scripts since we'll map from existing models
* Fixed some issues with configuration
* Just some modifications on conversion script
* Other modifications
* Fix style
* Improve fixup
* Improve conversion script
* Improve conversion script
* Add GroundingDINOProcessor
* More improvements
* Return token type ids
* something
* Fix more tests
* More improvements
* More cleanup
* More improvements
* Fixed tests, improved modeling and config
* More improvements and fixing tests
* Improved tests and modeling
* Improved tests and added image processor
* Improved tests inference
* More improvements
* More test improvements
* Fixed last test
* Improved docstrings and comments
* Fix style
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Better naming
* Better naming
* Added Copied statement
* Added Copied statement
* Moved param init from GroundingDINOBiMultiHeadAttention
* Better naming
* Fixing clamp style
* Better naming
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/configuration_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/convert_grounding_dino_to_hf.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Improving conversion script
* Improved config
* Improved naming
* Improved naming again
* Improved grouding-dino.md
* Moved grounding dino to multimodal
* Update src/transformers/models/grounding_dino/convert_grounding_dino_to_hf.py
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
* Fixed docstrings and style
* Fix docstrings
* Remove timm attributes
* Reorder imports
* More improvements
* Add Grounding DINO to pipeline
* Remove model from check_repo
* Added grounded post_process to GroundingDINOProcessor
* Fixed style
* Fixed GroundingDINOTextPrenetConfig docstrings
* Aligned inputs.keys() when both image and text are passed with model_input_names
* Added tests for GroundingDINOImageProcessor and GroundingDINOProcessor
* Testing post_process_grounded_object_detection from GroundingDINOProcessor at test_inference_object_detection_head
* Fixed order
* Marked test with require_torch
* Temporarily changed repo_id
* More improvements
* Fix style
* Final improvements
* Improve annotators
* Fix style
* Add is_torch_available
* Remove type hints
* vocab_tokens as one liner
* Removed print statements
* Renamed GroundingDINOTextPrenetConfig to GroundingDINOTextConfig
* remove unnecessary comments
* Removed unnecessary tests on conversion script
* Renamed GroundingDINO to camel case GroundingDino
* Fixed GroundingDinoProcessor docstrings
* loading MSDA kernels in the modeling file
* Fix copies
* Replace nn.multiheadattention
* Replace nn.multiheadattention
* Fixed inputs for GroundingDinoMultiheadAttention & order of modules
* Fixed processing to avoid messing with inputs
* Added more tips for GroundingDino
* Make style
* Chaning name to align with SAM
* Replace final nn.multiheadattention
* Fix model tests
* Update year, remove GenerationTesterMixin
* Address comments
* Address more comments
* Rename TextPrenet to TextModel
* Rename hidden_states
* Address more comments
* Address more comments
* Address comment
* Address more comments
* Address merge
* Address comment
* Address comment
* Address comment
* Make style
* Added layer norm eps to layer norms
* Address more comments
* More fixes
* Fixed equivalence
* Make fixup
* Remove print statements
* Address comments
* Address comments
* Address comments
* Address comments
* Address comments
* Address comments
* Add comment
* Address comment
* Remove overwriting of test
* Fix bbox_embed
* Improve decoder_bbox_embed_share
* Simplify outputs
* Updated post_process_grounded_object_detection
* Renamed sources to feature_maps
* Improved tests for Grounding Dino ImageProcessor and Processor
* Fixed test requirements and imports
* Fixed image_processing
* Fixed processor tests
* Fixed imports for image processing tests
* Fix copies
* Updated modeling
* Fix style
* Moved functions to correct position
* Fixed copy issues
* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* Keeping consistency custom cuda kernels for MSDA
* Make GroundingDinoProcessor logic clearer
* Updated Grounding DINO checkpoints
* Changed tests to correct structure
* Updated gpu-cpu equivalence test
* fix copies
* Update src/transformers/models/grounding_dino/processing_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/processing_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/grounding_dino/configuration_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fixed erros and style
* Fix copies
* Removed inheritance from PreTrainedModel from GroundingDinoTextModel
* Fixed GroundingDinoTextModel
* Fixed type of default backbone config
* Fixed missing methods for GroundingDinoTextModel and Added timm support for GroundingDinoConvEncoder
* Addressed comments
* Addressed batched image processing tests
* Addressed zero shot test comment
* Addressed tip comment
* Removed GroundingDinoTextModel from check_repo
* Removed inplace masking
* Addressed comments
* Addressed comments
* Addressed comments
* Fix copies
* Fixing timm test
* Fixed batching equivalence test
* Update docs/source/en/model_doc/grounding-dino.md
Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com>
* Update docs/source/en/model_doc/grounding-dino.md
Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com>
* Update docs/source/en/model_doc/grounding-dino.md
Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com>
* Addressed more comments
* Added a new comment
* Reduced image size
* Addressed more comments
* Nits
* Nits
* Changed the way text_config is initialized
* Update src/transformers/models/grounding_dino/processing_grounding_dino.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com>
2024-04-11 08:32:16 +01:00
..
2022-02-23 15:46:28 -05:00
2023-10-09 11:04:57 +02:00
2024-04-09 12:01:47 +05:30
2024-03-13 17:44:35 +00:00
2024-03-19 14:43:02 +00:00
2024-03-13 22:03:02 +05:30
2024-04-10 12:45:07 +05:00
2024-04-11 08:32:16 +01:00
2023-03-02 12:08:43 -05:00
2024-02-29 03:56:16 +01:00
2024-04-09 11:04:18 +01:00
2024-04-09 17:10:29 +02:00
2023-12-07 10:00:08 +01:00
2024-02-16 08:16:58 +01:00
2024-03-25 10:33:38 +01:00
2023-06-26 09:58:14 -04:00
2024-03-30 22:19:17 -04:00
2024-04-03 09:25:01 +02:00
2020-01-06 15:11:12 +01:00
2023-12-20 18:33:17 +00:00
2024-03-06 10:57:04 +00:00
2023-11-15 14:10:39 +01:00
2024-03-15 14:18:41 +00:00
2023-06-15 07:30:24 -04:00
2024-03-15 14:18:41 +00:00
2024-02-20 16:20:20 +01:00
2024-03-15 14:18:41 +00:00
2023-11-10 15:35:27 +00:00
2024-04-09 13:28:54 +02:00
2024-01-26 18:20:39 +00:00
2024-01-23 10:28:23 +01:00
2024-01-30 17:26:36 +00:00
2024-03-21 14:04:11 +00:00
2024-04-10 14:46:39 +02:00
2024-02-05 14:50:07 +00:00
2024-01-19 09:59:14 +00:00
2023-09-05 10:12:25 +02:00
2024-04-05 09:07:41 +02:00
2024-03-15 14:18:41 +00:00