Commit Graph

1214 Commits

Author SHA1 Message Date
Stella Biderman
c02cd95c56 GPT-J-6B (#13022)
* Test GPTJ implementation

* Fixed conflicts

* Update __init__.py

* Update __init__.py

* change GPT_J to GPTJ

* fix missing imports and typos

* use einops for now
(need to change to torch ops later)

* Use torch ops instead of einsum

* remove einops deps

* Update configuration_auto.py

* Added GPT J

* Update gptj.rst

* Update __init__.py

* Update test_modeling_gptj.py

* Added GPT J

* Changed configs to match GPT2 instead of GPT Neo

* Removed non-existent sequence model

* Update configuration_auto.py

* Update configuration_auto.py

* Update configuration_auto.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Progress on updating configs to agree with GPT2

* Update modeling_gptj.py

* num_layers -> n_layer

* layer_norm_eps -> layer_norm_epsilon

* attention_layers -> num_hidden_layers

* Update modeling_gptj.py

* attention_pdrop -> attn_pdrop

* hidden_act -> activation_function

* Update configuration_gptj.py

* Update configuration_gptj.py

* Update configuration_gptj.py

* Update configuration_gptj.py

* Update configuration_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* fix layernorm and lm_head size
delete attn_type

* Update docs/source/model_doc/gptj.rst

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* removed claim that GPT J uses local attention

* Removed GPTJForSequenceClassification

* Update src/transformers/models/gptj/configuration_gptj.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Removed unsupported boilerplate

* Update tests/test_modeling_gptj.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update tests/test_modeling_gptj.py

Co-authored-by: Eric Hallahan <eric@hallahans.name>

* Update tests/test_modeling_gptj.py

Co-authored-by: Eric Hallahan <eric@hallahans.name>

* Update tests/test_modeling_gptj.py

Co-authored-by: Eric Hallahan <eric@hallahans.name>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update __init__.py

* Update configuration_gptj.py

* Update modeling_gptj.py

* Corrected indentation

* Remove stray backslash

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Update docs to match

* Remove tf loading

* Remove config.jax

* Remove stray `else:` statement

* Remove references to `load_tf_weights_in_gptj`

* Adapt tests to match output from GPT-J 6B

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Default `activation_function` to `gelu_new`

- Specify the approximate formulation of GELU to ensure parity with the default setting of `jax.nn.gelu()`

* Fix part of the config documentation

* Revert "Update configuration_auto.py"

This reverts commit e9860e9c043b6ebf57a0e705044e9ec9ba2263bb.

* Revert "Update configuration_auto.py"

This reverts commit cfaaae4c4dc70f1fbe9abd60fc8bd0b863b8c011.

* Revert "Update configuration_auto.py"

This reverts commit 687788954fd0cfbc567fa1202d56a4ff9271944f.

* Revert "Update configuration_auto.py"

This reverts commit 194d024ea87d4fcef0dcb08e57f52c47511a9fc6.

* Hyphenate GPT-J

* Undid sorting of the models alphabetically

* Reverting previous commit

* fix style and quality issues

* Update docs/source/model_doc/gptj.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/configuration_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/configuration_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/configuration_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Replaced GPTJ-specific code with generic code

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Made the code always use rotary positional encodings

* Update index.rst

* Fix documentation

* Combine attention classes

- Condense all attention operations into `GPTJAttention`
- Replicate GPT-2 and improve code clarity by renaming `GPTJAttention.attn_pdrop` and `GPTJAttention.resid_pdrop` to `GPTJAttention.attn_dropout` and `GPTJAttention.resid_dropout`

* Removed `config.rotary_dim` from tests

* Update test_modeling_gptj.py

* Update test_modeling_gptj.py

* Fix formatting

* Removed depreciated argument `layer_id` to `GPTJAttention`

* Update modeling_gptj.py

* Update modeling_gptj.py

* Fix code quality

* Restore model functionality

* Save `lm_head.weight` in checkpoints

* Fix crashes when loading with reduced precision

* refactor self._attn(...)` and rename layer weights"

* make sure logits are in fp32 for sampling

* improve docs

* Add `GPTJForCausalLM` to `TextGenerationPipeline` whitelist

* Added GPT-J to the README

* Fix doc/readme consistency

* Add rough parallelization support

- Remove unused imports and variables
- Clean up docstrings
- Port experimental parallelization code from GPT-2 into GPT-J

* Clean up loose ends

* Fix index.rst

Co-authored-by: kurumuz <kurumuz1@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Eric Hallahan <eric@hallahans.name>
Co-authored-by: Leo Gao <54557097+leogao2@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: your_github_username <your_github_email>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-08-31 17:53:02 +02:00
Matt
854260ca44 TF/Numpy variants for all DataCollator classes (#13105)
* Adding a TF variant of the DataCollatorForTokenClassification to get feedback

* Added a Numpy variant and a post_init check to fail early if a missing import is found

* Fixed call to Numpy variant

* Added a couple more of the collators

* Update src/transformers/data/data_collator.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fixes, style pass, finished DataCollatorForSeqToSeq

* Added all the LanguageModeling DataCollators, except SOP and PermutationLanguageModeling

* Adding DataCollatorForPermutationLanguageModeling

* Style pass

* Add missing `__call__` for PLM

* Remove `post_init` checks for frameworks because the imports inside them were making us fail code quality checks

* Remove unused imports

* First attempt at some TF tests

* A second attempt to make any of those tests actually work

* TF tests, round three

* TF tests, round four

* TF tests, round five

* TF tests, all enabled!

* Style pass

* Merging tests into `test_data_collator.py`

* Merging tests into `test_data_collator.py`

* Fixing up test imports

* Fixing up test imports

* Trying shuffling the conditionals around

* Commenting out non-functional old tests

* Completed all tests for all three frameworks

* Style pass

* Fixed test typo

* Style pass

* Move standard `__call__` method to mixin

* Rearranged imports for `test_data_collator`

* Fix data collator typo "torch" -> "pt"

* Fixed the most embarrassingly obvious bug

* Update src/transformers/data/data_collator.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Renaming mixin

* Updating docs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dalton Walker <dalton_walker@icloud.com>
Co-authored-by: Andrew Romans <andrew.romans@hotmail.com>
2021-08-31 13:06:48 +01:00
Sylvain Gugger
74b3344fbc Clean up test file 2021-08-31 07:06:49 -04:00
Kamal Raj
3efcfeab67 Deberta_v2 tf (#13120)
* Deberta_v2 tf

* added new line at the end of file, make style

* +V2, typo

* remove never executed branch of code

* rm cmnt and fixed typo in url filter

* cleanup according to review comments

* added #Copied from
2021-08-31 06:32:47 -04:00
tucan9389
41c559415a Add GPT2ForTokenClassification (#13290)
* Add GPT2ForTokenClassification

* Fix dropout exception for GPT2 NER

* Remove sequence label in test

* Change TokenClassifierOutput to TokenClassifierOutputWithPast

* Fix for black formatter

* Remove dummy

* Update docs for GPT2ForTokenClassification

* Fix check_inits ci fail

* Update dummy_pt_objects after make fix-copies

* Remove TokenClassifierOutputWithPast

* Fix tuple input issue

Co-authored-by: danielsejong55@gmail.com <danielsejong55@gmail.com>
2021-08-31 12:19:04 +02:00
Sylvain Gugger
8b2de0e483 Tests fetcher tests (#13340)
* Incorporate tests dependencies in tests_fetcher

* Harder modif

* Debug

* Loop through all files

* Last modules

* Remove debug statement
2021-08-31 03:57:01 -04:00
Olatunji Ruwase
42f359d015 Use DS callable API to allow hf_scheduler + ds_optimizer (#13216)
* Use DS callable API to allow hf_scheduler + ds_optimizer

* Preserve backward-compatibility

* Restore backward compatibility

* Tweak arg positioning

* Tweak arg positioning

* bump the required version

* Undo indent

* Update src/transformers/trainer.py

* style

Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-08-30 10:01:06 -07:00
Laura Hanu
35236b870e Add missing module __spec__ (#13321)
* added missing __spec__ to _LazyModule

* test __spec__ is not None after module import

* changed module_spec arg to be optional in _LazyModule

* fix style issue

* added module spec test to test_file_utils
2021-08-30 12:39:05 -04:00
Sylvain Gugger
c4ecd234f2 Fix AutoTokenizer when no fast tokenizer is available (#13336)
* Fix AutoTokenizer when a tokenizer has no fast version

* Add test
2021-08-30 11:55:18 -04:00
Kamal Raj
98e409abb3 albert flax (#13294)
* albert flax

* year -> 2021

* docstring updated for flax

* removed head_mask

* removed from_pt

* removed passing attention_mask to embedding layer
2021-08-30 17:29:27 +02:00
Kamal Raj
774760e6f3 distilbert-flax (#13324)
* distilbert-flax

* added missing self

* docs fix

* removed tied kernal extra init

* updated docs

* x -> hidden states

* removed head_mask

* removed from_pt, +FLAX

* updated year
2021-08-30 14:16:18 +02:00
NielsRogge
b6ddb08a66 Add LayoutLMv2 + LayoutXLM (#12604)
* First commit

* Make style

* Fix dummy objects

* Add Detectron2 config

* Add LayoutLMv2 pooler

* More improvements, add documentation

* More improvements

* Add model tests

* Add clarification regarding image input

* Improve integration test

* Fix bug

* Fix another bug

* Fix another bug

* Fix another bug

* More improvements

* Make more tests pass

* Make more tests pass

* Improve integration test

* Remove gradient checkpointing and add head masking

* Add integration test

* Add LayoutLMv2ForSequenceClassification to the tests

* Add LayoutLMv2ForQuestionAnswering

* More improvements

* More improvements

* Small improvements

* Fix _LazyModule

* Fix fast tokenizer

* Move sync_batch_norm to a separate method

* Replace dummies by requires_backends

* Move calculation of visual bounding boxes to separate method + update README

* Add models to main init

* First draft

* More improvements

* More improvements

* More improvements

* More improvements

* More improvements

* Remove is_split_into_words

* More improvements

* Simply tesseract - no use of pandas anymore

* Add LayoutLMv2Processor

* Update is_pytesseract_available

* Fix bugs

* Improve feature extractor

* Fix bug

* Add print statement

* Add truncation of bounding boxes

* Add tests for LayoutLMv2FeatureExtractor and LayoutLMv2Tokenizer

* Improve tokenizer tests

* Make more tokenizer tests pass

* Make more tests pass, add integration tests

* Finish integration tests

* More improvements

* More improvements - update API of the tokenizer

* More improvements

* Remove support for VQA training

* Remove some files

* Improve feature extractor

* Improve documentation and one more tokenizer test

* Make quality and small docs improvements

* Add batched tests for LayoutLMv2Processor, remove fast tokenizer

* Add truncation of labels

* Apply suggestions from code review

* Improve processor tests

* Fix failing tests and add suggestion from code review

* Fix tokenizer test

* Add detectron2 CI job

* Simplify CI job

* Comment out non-detectron2 jobs and specify number of processes

* Add pip install torchvision

* Add durations to see which tests are slow

* Fix tokenizer test and make model tests smaller

* Frist draft

* Use setattr

* Possible fix

* Proposal with configuration

* First draft of fast tokenizer

* More improvements

* Enable fast tokenizer tests

* Make more tests pass

* Make more tests pass

* More improvements

* Addd padding to fast tokenizer

* Mkae more tests pass

* Make more tests pass

* Make all tests pass for fast tokenizer

* Make fast tokenizer support overflowing boxes and labels

* Add support for overflowing_labels to slow tokenizer

* Add support for fast tokenizer to the processor

* Update processor tests for both slow and fast tokenizers

* Add head models to model mappings

* Make style & quality

* Remove Detectron2 config file

* Add configurable option to label all subwords

* Fix test

* Skip visual segment embeddings in test

* Use ResNet-18 backbone in tests instead of ResNet-101

* Proposal

* Re-enable all jobs on CI

* Fix installation of tesseract

* Fix failing test

* Fix index table

* Add LayoutXLM doc page, first draft of code examples

* Improve documentation a lot

* Update expected boxes for Tesseract 4.0.0 beta

* Use offsets to create labels instead of checking if they start with ##

* Update expected boxes for Tesseract 4.1.1

* Fix conflict

* Make variable names cleaner, add docstring, add link to notebooks

* Revert "Fix conflict"

This reverts commit a9b46ce9afe47ebfcfe7b45e6a121d49e74ef2c5.

* Revert to make integration test pass

* Apply suggestions from @LysandreJik's review

* Address @patrickvonplaten's comments

* Remove fixtures DocVQA in favor of dataset on the hub

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-08-30 12:35:42 +02:00
Patrick von Platen
a75db353c4 [Slow tests] Disable Wav2Vec2 pretraining test for now (#13303)
* fix_torch_device_generate_test

* remove @

* wav2vec2 pretraining

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-08-30 06:03:02 -04:00
Patrick von Platen
4362ee298a correct (#13304) 2021-08-30 06:02:08 -04:00
Anton Lozhkov
b6f332ecaf Add Wav2Vec2 & Hubert ForSequenceClassification (#13153)
* Add hubert classifier + tests

* Add hubert classifier + tests

* Dummies for all classification tests

* Wav2Vec2 classifier + ER test

* Fix hubert integration tests

* Add hubert IC

* Pass tests for all classification tasks on Hubert

* Pass all tests + copies

* Move models to the SUPERB org
2021-08-27 20:52:51 +03:00
Patrick von Platen
2bef3433e5 [Flax] Correct all return tensors to numpy (#13307)
* fix_torch_device_generate_test

* remove @

* finish find and replace
2021-08-27 17:38:34 +02:00
Nicolas Patry
8aa67fc192 Fixing mbart50 with return_tensors argument too. (#13301)
* Fixing mbart50 with `return_tensors` argument too.

* Adding mbart50 tokenization tests.
2021-08-27 17:22:06 +02:00
Nicolas Patry
b89a964d3f Moving zero-shot-classification pipeline to new testing. (#13299)
* Moving `zero-shot-classification` pipeline to new testing.

* Cleaning up old mixins.

* Fixing tests
`sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english` is
corrupted in PT.

* Adding warning.
2021-08-27 15:46:11 +02:00
NielsRogge
cc27ac1a87 Fix BeitForMaskedImageModeling (#13275)
* First pass

* Fix docs of bool_masked_pos

* Add integration script

* Fix docstring

* Add integration test for BeitForMaskedImageModeling

* Remove file

* Fix docs
2021-08-27 09:09:57 -04:00
Nicolas Patry
a3f96f366a Moving translation pipeline to new testing scheme. (#13297)
* Moving `translation` pipeline to new testing scheme.

* Update tokenization mbart tests.
2021-08-27 12:26:17 +02:00
Nicolas Patry
45a8eb66bb Moving token-classification pipeline to new testing. (#13286)
* Moving `token-classification` pipeline to new testing.

* Fix tests.
2021-08-27 11:24:56 +02:00
Nicolas Patry
a6e36558ef Moving text-generation pipeline to new testing framework. (#13285)
* Moving `text-generation` pipeline to new testing framework.

* Keep check_model_type but log instead of raise Exception.

* warning -> error.
2021-08-26 17:30:03 +02:00
Nicolas Patry
662b143b71 Hotfixing master tests. (#13282) 2021-08-26 10:09:53 -04:00
Nicolas Patry
59c378d069 Moving text2text-generation to new pipeline testing mecanism. (#13281) 2021-08-26 16:09:48 +02:00
Nicolas Patry
0ebda5382b Moving table-question-answering pipeline to new testing. (#13280) 2021-08-26 09:09:57 -04:00
Nicolas Patry
879fe8fa75 Moving summarization pipeline to new testing format. (#13279)
* Moving `summarization` pipeline to new testing format.

* Remove generate_kwargs from __init__ args.
2021-08-26 14:47:11 +02:00
Nicolas Patry
55fb88d369 Moving question_answering tests to the new testing scheme. Had to tweak a little some ModelTesterConfig for pipelines. (#13277)
* Moving question_answering tests to the new testing scheme. Had to tweak
a little some ModelTesterConfig for pipelines.

* Removing commented code.
2021-08-26 12:37:55 +02:00
Nicolas Patry
6b586ed18c Move image-classification pipeline to new testing (#13272)
- Enforce `test_small_models_{tf,pt}` methods to exist (enforce checking
actual values in small tests)
- Add support for non RGB image for the pipeline.
2021-08-26 05:52:49 -04:00
Stas Bekman
40d60e1536 fix tokenizer_class_from_name for models with - in the name (#13251)
* fix tokenizer_class_from_name

* Update src/transformers/models/auto/tokenization_auto.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* add test

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-08-26 04:29:14 -04:00
Nicolas Patry
83bfdbdd75 Migrating conversational pipeline tests to new testing format (#13114)
* New test format for conversational.

* Putting back old mixin.

* Re-enabling auto tests with LazyLoading.

* Feature extraction tests.

* Remove feature-extraction.

* Feature extraction with feature_extractor (No pun intended).

* Update check_model_type for fill-mask.
2021-08-26 03:50:43 -04:00
Lysandre Debut
72eefb34a9 Add require flax to test (#13260) 2021-08-25 12:56:25 -04:00
Lysandre Debut
3bbe68f837 Hubert test fix (#13261) 2021-08-25 18:41:26 +02:00
Stas Bekman
5c6eca71a9 fix AutoModel.from_pretrained(..., torch_dtype=...) (#13209)
* fix AutoModel.from_pretrained(..., torch_dtype=...)

* fix to_diff_dict

* add better test

* torch is not always available when a model has self.torch_dtype
2021-08-24 11:43:41 +02:00
Yih-Dar
2e20c0f34a Make Flax GPT2 working with cross attention (#13008)
* make flax gpt2 working with cross attention

* Remove encoder->decoder projection layer

* A draft (incomplete) for FlaxEncoderDecoderModel

* Add the method from_encoder_decoder_pretrained + the docstrings

* Fix the mistakes of using EncoderDecoderModel

* Fix style

* Add FlaxEncoderDecoderModel to the library

* Fix cyclic imports

* Add FlaxEncoderDecoderModel to modeling_flax_auto.py

* Remove question comments

* add tests for FlaxEncoderDecoderModel

* add flax_encoder_decoder to the lists of ignored entries in check_repo.py

* fix missing required positional arguments

* Remove **kwargs when creating FlaxEncoderDecoderModel in from_encoder_decoder_pretrained()

Also fix generation eos/pad tokens issue

* Fix: Use sequences from the generated_output

* Change a check from assert to raise ValueError

* Fix examples and token ids issues

* Fix missing all_cross_attentions when outputting tuple in modeling_gpt2

* Remove the changes in configuration docstrings.

* allow for bert 2 gpt2

* make fix-copies

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Change remaining examples to bert2gpt2

* Change the test to Bert2GPT2

* Fix examples

* Fix import

* Fix unpack bug

* Rename to FlaxEncoderDecoderModelTest and change the test to bert2gpt2

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix: NotImplentedError -> NotImplementedError

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* up

* finalize

Co-authored-by: ydshieh <ydshieh@user.noreply>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-08-23 17:57:29 +02:00
SaulLu
7223844df9 Change how "additional_special_tokens" argument in the ".from_pretrained" method of the tokenizer is taken into account (#13056)
* add test

* add change in PretrainedTokenizerBase

* change Luke

* deactivate

* add the possibility to add additional special tokens for M2M100

* format

* add special test for canine

* proposed changes for mbart

* proposed changes for mbart50

* proposed changes for byt5

* proposed changes for canine

* proposed changes for t5

* test fast and slow

* remove comment

* remove comment

* add fast version for all tests

* replace break by continue

* add more comments

* add check to avoid duplicates

* remove comment

* format

* proposed change for wave2vec2

* reverse changes mbart

* uncomment

* format
2021-08-23 14:35:18 +02:00
Philipp Schmid
f689743e74 SageMaker: Fix sagemaker DDP & metric logs (#13181)
* Barrier -> barrier

* added logger for metrics

* removed stream handler in trainer

* moved handler

* removed streamhandler from trainer

* updated test image and instance type added datasets version to test

* Update tests/sagemaker/scripts/pytorch/requirements.txt

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-08-23 10:18:07 +02:00
NielsRogge
8679bd7144 Add min and max question length options to TapasTokenizer (#12803)
* Add min and max question length option to the tokenizer

* Add corresponding test
2021-08-23 03:44:42 -04:00
NielsRogge
588e6caa15 Overwrite get_clean_sequence as this was causing a bottleneck (#13183) 2021-08-23 03:41:35 -04:00
Allan Lin
91ff480e26 Update namespaces inside torch.utils.data to the latest. (#13167)
* Update torch.utils.data namespaces to the latest.

* Format

* Update Dataloader.

* Style
2021-08-19 14:29:51 +02:00
Patrick von Platen
ecfa7eb260 [AutoFeatureExtractor] Fix loading of local folders if config.json exists (#13166)
* up

* up
2021-08-18 16:18:13 +02:00
Ori Ram
439a43b6b4 Add splinter (#12955)
* splinter template

* initialize splinter classes

* Splinter Tokenizer

* splinter.rst

* tokenization fixes

* Documentation & some minor variable name changes

* bug fix (added back question_token_id to config) + variable names

* Minor bug fixes + variable name changes

* Fix Splinter references after merge with new transformers

* changes after running make style & quality

* Fix documentation unindent

* Fix doc indentation in tokenization_splinter

* Fix also SplinterTokenizerFast

* Add Splinter to index.rst and README

* Fixdouble whitespace from index.rst

* Fixed index.rst with 'make fix-copies'

* Update docs/source/model_doc/splinter.rst

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/model_doc/splinter.rst

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/model_doc/splinter.rst

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/model_doc/splinter.rst

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/models/splinter/__init__.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Added "copied from BERT" comments

* Removing unnexessary code from modeling_splinter

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/splinter/configuration_splinter.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Remove references to TF modeling from splinter

* Update src/transformers/models/splinter/modeling_splinter.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove unnecessary check

* Update src/transformers/models/splinter/modeling_splinter.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add differences between Splinter and Bert tokenizers

* Update src/transformers/models/splinter/modeling_splinter.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/splinter/tokenization_splinter_fast.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove unnecessary check

* Doc formatting

* Update src/transformers/models/splinter/tokenization_splinter.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/splinter/tokenization_splinter.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* bug fix: remove load_tf_weights attribute

* Some minor quality changes

* Update docs/source/model_doc/splinter.rst

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/splinter/configuration_splinter.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Change FullyConnectedLayer to SplinterFullyConnectedLayer

* Variable naming

* Reove gather_positions function

* Remove ClassificationHead as it's outdated

* Update src/transformers/models/splinter/modeling_splinter.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Remove hardcoded 102 token id

* Minor style change

* Added "tau" organization to all model identifiers & URLS

* Added tau to the tests as well

* Copy-from comments

* Removed all unnecessary classes (e.g. SplinterForMaskedLM)

* Running make fix-copies

* Bug fix: Further removed unnecessary classes

* Add Splinter to AutoTokenization

* Add an integration test for Splinter

* Removed initialize_new_qass from config - It will be done through different checkpoints

* Removed `initialize_new_qass` from documentation as well

* Added new checkpoint names (`tau/splinter-base-qass` and same for large) in the code

* Minor change to test

* SplinterTokenizer now doesn't abstract from BertTokenizer

* SplinterTokenizerFast also dosn't abstract from Bert

* style and quality

* bug fix: import ing torch in tests only if it's available

* Auto mappings

* Changed copyrights in Splinter's files

* Update src/transformers/models/splinter/configuration_splinter.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: yuvalkirstain <kirstain.yuval@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-08-17 08:29:01 -04:00
Nicolas Patry
d58926ab1d Moving fill-mask pipeline to new testing scheme (#12943)
* Fill mask pipelines test updates.

* Model eval !!

* Adding slow test with actual values.

* Making all tests pass (skipping quite a bit.)

* Doc styling.

* Better doc cleanup.

* Making an explicit test with no pad token tokenizer.

* Typo.
2021-08-13 12:04:18 +02:00
Sylvain Gugger
9a498c37a2 Rely on huggingface_hub for common tools (#13100)
* Remove hf_api module and use hugginface_hub

* Style

* Fix to test_fetcher

* Quality
2021-08-12 14:59:02 +02:00
Patrick von Platen
6900dded49 [Flax/JAX] Run jitted tests at every commit (#13090)
* up

* up

* up
2021-08-12 14:49:46 +02:00
Sylvain Gugger
ea8ffe36d3 Proper import for unittest.mock.patch (#13085) 2021-08-12 11:23:00 +02:00
Kamal Raj
d329b63369 Deberta tf (#12972)
* TFDeberta

moved weights to build and fixed name scope

added missing ,

bug fixes to enable graph mode execution

updated setup.py

fixing typo

fix imports

embedding mask fix

added layer names avoid autmatic incremental names

+XSoftmax

cleanup

added names to layer

disable keras_serializable
Distangled attention output shape hidden_size==None
using symbolic inputs

test for Deberta tf

make style

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

removed tensorflow-probability

removed blank line

* removed tf experimental api
+torch_gather tf implementation from @Rocketknight1

* layername DeBERTa --> deberta

* copyright fix

* added docs for TFDeberta & make style

* layer_name change to fix load from pt model

* layer_name change as pt model

* SequenceClassification layername change,
to same as pt model

* switched to keras built-in LayerNormalization

* added `TFDeberta` prefix most layer classes

* updated to tf.Tensor in the docstring
2021-08-12 05:01:26 -04:00
Sylvain Gugger
0454e4bd8b Fix ModelOutput instantiation form dictionaries (#13067)
* Fix ModelOutput instantiation form dictionaries

* Style
2021-08-10 12:20:04 +02:00
Lysandre Debut
6f5ab9daf1 Add MBART to models exportable with ONNX (#13049)
* Add MBART to models exportable with ONNX

* unittest mock

* Add tests

* Misc fixes
2021-08-09 08:56:04 -04:00
Lysandre Debut
1bf38611a4 Put smaller ALBERT model (#13028) 2021-08-06 12:41:33 -04:00
Michael Benayoun
dc420b0eb1 T5 with past ONNX export (#13014)
T5 with past ONNX export, and more explicit past_key_values inputs and outputs names for ONNX model

Authored-by: Michael Benayoun <michael@huggingface.co>
2021-08-06 15:46:26 +02:00