* Partial TF port for ESM model
* Add ESM-TF tests
* Add the various imports for TF-ESM
* TF weight conversion almost ready
* Stop ignoring the decoder weights in PT
* Add tests and lots of fixes
* fix-copies
* Fix imports, add model docs
* Add get_vocab() to tokenizer
* Fix vocab links for pretrained files
* Allow multiple inputs with a sep
* Use EOS as SEP token because ESM vocab lacks SEP
* Correctly return special tokens mask from ESM tokenizer
* make fixup
* Stop testing unsupported embedding resizing
* Handle TF bias correctly
* Skip all models with slow tokenizers in the token classification test
* Fixing the batch/unbatcher of pipelines to accommodate the `None` being
passed around.
* Fixing pipeline bug caused by slow tokenizer being different.
* Update src/transformers/models/esm/modeling_tf_esm.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/esm/modeling_tf_esm.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/esm/modeling_tf_esm.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update set_input_embeddings and the copyright notices
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add support for non fast tf bert tokenizer
* add tests for non fast tf bert tokenizer
* fix fast bert tf tokenizer flag
* double tokenizers list on tf tokenizers test to avoid breaking zip on test output equivalence
* reformat code with black to comply with code quality checks
* trigger ci
* return None to avoid recursive call
* Give error
* Add test
* More tests
* Quality
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* First draft
* Fix more things
* Improve more things
* Remove some head models
* Fix more things
* Add missing layers
* Remove tokenizer
* Fix more things
* Fix copied from statements
* Make all tests pass
* Remove print statements
* Remove files
* Fix README and docs
* Add integration test and fix organization
* Add tips
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Make tests faster, improve docs
* Fix doc tests
* Add model to toctree
* Add docs
* Add note about creating new checkpoint
* Remove is_decoder
* Make tests smaller, add docs
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* implemented TFCvtModel and TFCvtForImageClassification and modified relevant files, added an exception in convert_tf_weight_name_to_pt_weight_name, added quick testing file to compare with pytorch model
* added docstring + testing file in transformers testing suite
* added test in testing file, modified docs to pass repo-consistency, passed formatting test
* refactoring + passing all tests
* small refactor, removing unwanted comments
* improved testing config
* corrected import error
* modified access to pretrained model archive list, to pass tf_test
* corrected import structure in init files
* modified testing for keras_fit with cpu
* correcting PR issues + Refactoring
* Refactoring: improving readability and reducing the number of permutations
* corrected momentum value + cls_token initialization
* removed from_pt as weights were added to the hub
* Update tests/models/cvt/test_modeling_tf_cvt.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* added test
* correct embedding init
* some changes in blenderbot (incomplete)
* update blenderbot (diff to be used as reference)
* update blenderbot_small
* update LED
* update marian
* update T5 and remove TFWrappedEmbeddings
* nullcontext() -> ContextManagers()
* fix embedding init
* Add `OPTForQuestionAnswering`
- added `OPTForQuestionAnswering` class based on `BloomForQuestionAnswering`
- added `OPTForQuestionAnswering` in common tests
- all common tests pass
- make fixup done
* added docstrings for OPTForQuestionAnswering
* Fix docstrings for OPTForQuestionAnswering
Ensures post_process_instance_segmentation and post_process_panoptic_segmentation methods return a tensor of shape (target_height, target_width) filled with -1 values if no segment with score > threshold is found.
* add sudachipy and jumanpp tokenizers for bert_japanese
* use ImportError instead of ModuleNotFoundError in SudachiTokenizer and JumanppTokenizer
* put test cases of test_tokenization_bert_japanese in one line
* add require_sudachi and require_jumanpp decorator for testing
* add sudachi and pyknp(jumanpp) to dependencies
* remove sudachi_dict_small and sudachi_dict_full from dependencies
* empty commit for ci
- Improves MaskFormer docs, corrects minor typos
- Restructures MaskFormerFeatureExtractor.post_process_panoptic_segmentation for better readability, adds target_sizes argument for optional resizing
- Adds post_process_semantic_segmentation and post_process_instance_segmentation methods.
- Adds a deprecation warning to post_process_segmentation method in favour of post_process_instance_segmentation
* add bloom for question answering
- attempt to add Bloom for question answering
- adapted from `GPTJForQuestionAnswering`
- Fixed `num_labels` to `2` for common tests
- Added a bit of docstring
- All common tests pass
* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* revert changes related to `num_labels`
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Rebase ESM PR and update all file formats
* Fix test relative imports
* Add __init__.py to the test dir
* Disable gradient checkpointing
* Remove references to TFESM... FOR NOW >:|
* Remove completed TODOs from tests
* Convert docstrings to mdx, fix-copies from BERT
* fix-copies for the README and index
* Update ESM's __init__.py to the modern format
* Add to _toctree.yml
* Ensure we correctly copy the pad_token_id from the original ESM model
* Tiny grammar nitpicks
* Make the layer norm after embeddings an optional flag
* Update the conversion script to handle other model classes
* Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py
* Break the copied from link from BertModel.forward to remove token_type_ids
* Remove debug array saves
* Begin ESM-2 porting
* Add a hacky workaround for the precision issue in original repo
* Code cleanup
* Remove unused checkpoint conversion code
* Fix copyright notices
* Get rid of all references to the TF weights conversion
* Remove token_type_ids from the tests
* Fix test code
* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add credit
* Remove _ args and __ kwargs in rotary embedding
* Assertively remove asserts
* Replace einsum with torch.outer()
* Fix docstring formatting
* Remove assertions in tokenization
* Add paper citation to ESMModel docstring
* Move vocab list to single line
* Remove ESMLayer from init
* Add Facebook copyrights
* Clean up RotaryEmbedding docstring
* Fix docstring formatting
* Fix docstring for config object
* Add explanation for new config methods
* make fix-copies
* Rename all the ESM- classes to Esm-
* Update conversion script to allow pushing to hub
* Update tests to point at my repo for now
* Set config properly for tests
* Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted
* make fixup
* Update expected values for slow tests
* make fixup
* Remove EsmForCausalLM for now
* Fix padding idx test
* Updated README and docs with ESM-1b and ESM-2 separately (#19221)
* Updated README and docs with ESM-1b and ESM-2 separately
* Update READMEs, longer entry with 3 citations
* make fix-copies
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Tom Sercu <tsercu@fb.com>
* chore: initial commit
* chore: adding util methods
yet to work on the nn.functional.interpolate port with align_corners=True
* chore: refactor the utils
* used tf.compat.v1.image.resize to align the F.interpolate function
* added type hints to the method signatures
* added references to the gists where one-to-one alignment of torch and tf has been shown
* chore: adding the layers
* chore: porting all the layers from torch to tf
This is the initial draft, nothing is tested yet.
* chore: aligning the layers with reference to tf clip
* chore: aligning the modules
* added demarcation comments
* added copied and adapted from comments
* chore: aligning with CLIP
* chore: wrangling the layers to keep it tf compatible
* chore: aligning the names of the layers for porting
* chore: style changes
* chore: adding docs and inits
* chore: adding tfp dependencies
the code is taken from TAPAS
* chore: initial commit for testing
* chore: aligning the vision embeddings with the vit implementation
* chore: changing model prefix
* chore: fixing the name of the model and the layer normalization test case
* chore: every test passes but the slow ones
* chore: fix style and integration test
* chore: moving comments below decorators
* chore: make fixup and fix-copies changes
* chore: adding the Vision and Text Model to check_repo
* chore: modifying the prefix name to align it with the torch implementation
* chore: fix typo in configuration
* chore: changing the name of the model variable
* chore: adding segmentation flag
* chore: gante's review
* chore: style refactor
* chore: amy review
* chore: adding shape_list to parts that have been copied from other snippets
* chore: init batchnorm with torch defaults
* chore: adding shape_list to pass the tests
* test fix: adding seed as 0
* set seed
* chore: changing the straight through trick to fix -ve dimensions
* chore: adding a dimension to the loss
* chore: adding reviewers and contributors names to the docs
* chore: added changes after review
* chore: code quality fixup
* chore: fixing the segmentation snippet
* chore: adding to the layer calls
* chore: changing int32 to int64 for inputs of serving
* chore: review changes
* chore: style changes
* chore: remove from_pt=True
* fix: repo consistency
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>