Commit Graph

1722 Commits

Author SHA1 Message Date
Minh Chien Vu
9c9db751e2 add Bigbird ONNX config (#16427)
* add Bigbird ONNX config
2022-04-12 20:46:06 +02:00
Sanchit Gandhi
a960406722 [FlaxWav2Vec2Model] Fix bug in attention mask (#16725)
* [FlaxWav2Vec2Model] Fix bug in attention mask

* more fixes

* add (Flax)SpeechEncoderDecoderModel PT-FX cross-test
2022-04-12 19:48:24 +02:00
Joao Gante
d7f7f29f29 TF: remove set_tensor_by_indices_to_value (#16729) 2022-04-12 17:51:47 +01:00
Nicolas Patry
a192f61e08 Change the chunk_iter function to handle (#16730)
* Change the chunk_iter function to handle

the subtle cases where the last chunk gets ignored since all the
data is in the `left_strided` data.

We need to remove the right striding on the previous item.

* Remove commented line.
2022-04-12 18:25:02 +02:00
Yih-Dar
dce33f2150 Improve PT/TF equivalence test (#16557)
* add error message

* Use names in the error message

* allow ModelOutput

* rename to check_pt_tf_outputs and move outside

* fix style

* skip past_key_values in a better way

* Add comments

* improve code for label/loss

* make the logic clear by moving the ignore keys out

* fix _postprocessing_to_ignore

* fix _postprocessing_to_ignore: create new outputs from the remaining fields

* ignore past_key_values in TFGPT2 models for now

* make check_pt_tf_outputs better regarding names

* move check_pt_tf_models outside

* rename methods

* remove test_pt_tf_model_equivalence in TFCLIPModelTest

* Reduce TFViTMAEModelTest.test_pt_tf_model_equivalence

* move prepare_pt_inputs_from_tf_inputs outside check_pt_tf_models

* Fix quality

* Clean-up TFLxmertModelTester.test_pt_tf_model_equivalence

* Fix quality

* fix

* fix style

* Clean-up TFLEDModelTest.test_pt_tf_model_equivalence

* Fix quality

* add docstring

* improve comment

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 22:19:12 +02:00
Yih-Dar
c04619ecf3 Enable more test_torchscript (#16679)
* update _create_and_check_torchscript

* Enable test_torchscript

* clear_class_registry

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 18:23:35 +02:00
Yih-Dar
3918d6a9d6 Reduce memory leak in _create_and_check_torchscript (#16691)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 18:22:28 +02:00
Yih-Dar
2109afae71 Rename the method test_torchscript (#16693)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 18:21:45 +02:00
Joao Gante
b0bf3011c1 Generate: min length can't be larger than max length (#16668)
* min length must be smaller than max length

* Update min_length in tests
2022-04-11 11:55:30 +01:00
SaulLu
1025a9b742 add a warning in SpmConverter for sentencepiece's model using the byte fallback feature (#16629)
* update proto sentencepiece model

* Revert "update proto sentencepiece model"

This reverts commit b07f671747fec35773d0b3d4788b8b15aefa0229.

* add check

* add test

* Revert "Revert "update proto sentencepiece model""

This reverts commit 46108257b8927b73627ec8f4f3eed53a95fc700d.

* test for log level

* test for log level 2

* warning at the warning level

* clean

* format

* add explanation in docstring
2022-04-11 11:06:10 +02:00
NielsRogge
4ef0abb738 Add TAPEX (#16473)
* Add TapexTokenizer

* Improve docstrings and provide option to provide answer

* Remove option for pretokenized inputs

* Add TAPEX to README

* Fix copies

* Remove option for pretokenized inputs

* Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification.

* - Draft a README file for running the script and introducing some background.
- Remove unused code lines in tabfact script.
- Disable the deafult `pad_to_max_length` option which is memory-consuming.

* * Support `as_target_tokenizer` function for TapexTokenizer.
* Fix the do_lower_case behaviour of TapexTokenizer.
* Add unit tests for target scenarios and cased/uncased scenarios for both source and target.

* * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function.
* Fix typos in tapex example README.

* * fix the evaluation script - remove the property `task_name`

* * Make the label space more clear for tabfact tasks

* * Using a new fine-tuning script for tapex-base on tabfact.

* * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case
* Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql

* * Remove the default tokenizer_name option.
* Provide evaluation command.

* * Support for WikiTableQuestion dataset.

* Fix a typo in README.

* * Fix the datasets's key name in WikiTableQuestions

* Run make fixup and move test to folder

* Fix quality

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply some more suggestions from code review

* Improve docstrings

* Overwrite failing test

* Improve comment in example scripts

* Fix rebase

* Add TAPEX to Auto mapping

* Add TAPEX to auto config mappings

* Put TAPEX higher than BART in auto mapping

* Add TAPEX to doc tests

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
Co-authored-by: SivilTaram <qianlxc@outlook.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-08 10:57:51 +02:00
Francesco Saverio Zuppichini
af14c61973 RegNet (#16188)
* base model done

* make style

* done

* added files

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Trigger doc build

* resolved conversations

* resolved conversations

* seer models

* minor changes

* minor changes

* make fixup

* glob variables

* minor changes

* fix copies

* config when possibile

* resolved conflicts

* resolved conflicts

* resolved conflicts

* CI

* conversion script for 10b param

* fixed for 10b model

* minor updates in the doc + make style

* removed unused code

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* removed unused code

* removed unused code

* updated modeling_utils from main

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-04-07 21:58:00 +02:00
Sylvain Gugger
389f66151d Remove parent/child tests in auto model tests (#16653) 2022-04-07 11:05:10 -04:00
Joao Gante
3f43d824b9 TF generate refactor - Beam Search (#16374)
* refactor TF beam search

* refactored generate can now properly use attention masks

* add force bos/eos logit processors
2022-04-06 18:19:34 +01:00
Sanchit Gandhi
8d57c424e0 [FlaxSpeechEncoderDecoderModel] More Rigorous PT-Flax Equivalence Tests (#16589) 2022-04-06 15:33:32 +02:00
Matt
4354005291 Adding new train_step logic to make things less confusing for users (#15994)
* Adding new train_step logic to make things less confusing for users

* DO NOT ASK WHY WE NEED THAT SUBCLASS

* Metrics now working, at least for single-output models with type annotations!

* Updates and TODOs for the new train_step

* Make fixup

* Temporary test workaround until T5 has types

* Temporary test workaround until T5 has types

* I think this actually works! Needs a lot of tests though

* MAke style/quality

* Revert changes to T5 tests

* Deleting the aforementioned unmentionable subclass

* Deleting the aforementioned unmentionable subclass

* Adding a Keras API test

* Style fixes

* Removing unneeded TODO and comments

* Update test_step too

* Stop trying to compute metrics with the dummy_loss, patch up test

* Make style

* make fixup

* Docstring cleanup

* make fixup

* make fixup

* Stop expanding 1D input tensors when using dummy loss

* Adjust T5 test given the new compile()

* make fixup

* Skipping test for convnext

* Removing old T5-specific Keras test now that we have a common one

* make fixup

* make fixup

* Only skip convnext test on CPU

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Avoiding TF import issues

* make fixup

* Update compile() to support TF 2.3

* Skipping model.fit() on template classes for now

* Skipping model.fit() on template class tests for now

* Replace ad-hoc solution with find_labels

* make fixup

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-05 14:23:27 +01:00
Yih-Dar
765bafb8e4 Fix CI: test_inference_for_pretraining in ViTMAEModelTest (#16591)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-05 10:00:03 +02:00
Joao Gante
dad5ca83b2 TF: Finalize unpack_inputs-related changes (#16499)
* Add unpack_inputs to remaining models

* removed kwargs to `call()` in TF models

* fix TF T5 tests
2022-04-04 16:37:33 +01:00
SaulLu
be9474bd35 add a test checking the format of convert_tokens_to_string's output (#16540)
* add new tests

* add comment to overridden tests
2022-04-04 16:57:24 +02:00
Sylvain Gugger
3951b9f390 Add utility to find model labels (#16526)
* Add utility to find model labels

* Use it in the Trainer

* Update src/transformers/utils/generic.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Quality

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-04-04 10:06:57 -04:00
Yih-Dar
2199382dfd Use random_attention_mask for TF tests (#16517)
* use random_attention_mask for TF tests

* Fix for TFCLIP test (for now).

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-01 16:53:07 +02:00
Jim Rohrer
9de70f213e Add ONNX export for BeiT (#16498)
* Add beit onnx conversion support

* Updated docs

* Added cross reference to ViT ONNX config
2022-04-01 10:52:42 +02:00
Francesco Saverio Zuppichini
c4deb7b3ae Feature Extractor accepts segmentation_maps (#15964)
* feature extractor accepts

* resolved conversations

* added examples in test for ADE20K

* num_classes -> num_labels

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolving conversations

* resolving conversations

* removed ADE

* CI

* minor changes in conversion script

* reduce_labels in feature extractor

* minor changes

* correct preprocess for instace segmentation maps

* minor changes

* minor changes

* CI

* debugging

* better padding

* going to update labels inside the model

* going to update labels inside the model

* minor changes

* tests

* removed changes in feature_extractor_utils

* conversation

* conversation

* example in feature extractor

* more docstring in modeling

* test

* make style

* doc

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-30 18:46:51 +02:00
Yih-Dar
2b483230a1 Raise diff tolerance value for TFViTMAEModelTest (#16483)
* Raise diff tolerance value

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-29 22:12:27 +02:00
Sander Land
d7c8ce57d4 Avoid accessing .dataset of a DataLoader in Trainer (#16451)
* Avoid accessing .dataset of a dataloader

* style

* fix

* cleaning up, reverting some misunderstandings

* black

* add train_dataset argument to get_train_dataloader, and fix other instances of length checks

* flake8

* address comments

* fix bug

* cleanup

* add test

* Update tests/trainer/test_trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* under torch

* merge

* stylistic suggestion

Co-authored-by: Sander Land <sander@chatdesk.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-29 15:00:18 -04:00
Sayak Paul
5b40a37bc4 Add TF ViT MAE (#16255)
* ported TFViTMAEIntermediate and TFViTMAEOutput.

* added TFViTMAEModel and TFViTMAEDecoder.

* feat: added a noise argument in the implementation for reproducibility.

* feat: vit mae models with an additional noise argument for reproducibility.

Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-29 18:24:15 +01:00
Joao Gante
7a9ef8181c TF: properly handle kwargs in encoder_decoder architectures (#16465)
* properly handle kwargs in encoder_decoder architectures

* make fixup
2022-03-29 18:17:47 +01:00
Yih-Dar
86cff21cf6 Fix some TF GPT-J CI testings (#16454)
* Fix for test_mixed_precision

* Fix test_saved_model_creation by using shape_list instead of shape

* skit test_model_from_pretrained on GPU for now to avoid GPU OOM

* skip test_gptj_sample_max_time for now

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-29 18:04:20 +02:00
Yih-Dar
aebca696af Fix missing output_attentions in PT/Flax equivalence test (#16271)
* fix - set output_attentions to True

* Update tests/test_modeling_flax_common.py

* update for has_attentions

* overwrite check_outputs in FlaxBigBirdModelTest

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2022-03-29 17:51:48 +02:00
NielsRogge
979b039c89 Add DPT (#15991)
* First draft

* More improvements

* Add fusion blocks

* Make conversion script work for dpt_large

* Make conversion script work

* Improve implementation

* Improve conversion script

* Add DPTForSemanticSegmentation

* Make conversion work for semantic segmentation

* Add tests

* Remove print statements

* First draft

* Redesign neck

* Improve tests

* Improve implementation some more

* Make neck output list of tensors

* Improve neck and feature extractor

* Fix integration tests

* Make more tests pass

* Make all tests pass

* Add missing config archive map

* Add in_index attribute to make heads accept list of tensors

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply some more suggestions

* Add copied from statements

* Remove assert

* Apply suggestions from code review

* Apply suggestions from code review

* Remove DPTInterpolate in favor of nn.Upsample

* Add comments

* Apply suggestions from code review

* Apply suggestions from code review

* Add proposed design

* Update design

* Add DPTReassembleLayer

* Add DPTFeatureFusionStage

* Apply more suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Fix rebase

* Update in_index and out_indices

* Fix conversion script

* Fix code quality

* Add model to toctree and use DepthEstimatorOutput

* Fix rebase

* Fix code examples

* Improve code

* Fix copied from statements

* Apply suggestions from code review

* Remove compute_loss method

* Apply suggestions from code review

* Fix documentation tests file

* Remove test.py file

* Improve doc example

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
2022-03-28 16:28:10 +02:00
Sanchit Gandhi
7ca4633555 [FlaxSpeechEncoderDecoderModel] Ensure Input and Output Word Embeddings Are **Not** Tied (#16444)
* [FlaxSpeechEncoderDecoderModel] Ensure Input and Output Word Embeddings Are **Not** Tied

* rebase
2022-03-28 14:14:10 +02:00
Jaesun Park
e0ac72b7bd Fix PerceiverMLP and test (#16405)
Co-authored-by: Jaesun Park <jaesun.park1@navercorp.com>
2022-03-28 14:06:48 +02:00
Sanchit Gandhi
925fc57b70 [Flax] Improve Robustness of Back-Prop Tests (#16418)
* [Flax] Improve Robustness of Back-Prop Tests

* check equality of logits/outputs

* make fixup
2022-03-28 11:56:54 +02:00
Daniel Stancl
ed2ee373d0 Add TF implementation of GPT-J (#15623)
* Initial commit

* Add TFGPTJModel

* Fix a forward pass

* Add TFGPTJCausalLM

* Add TFGPTJForSequenceClassification

* Add TFGPTJForQuestionAnswering

* Fix docs

* Deal with TF dynamic shapes

* Add Loss parents to models

* Adjust split and merge heads to handle 4 and 5-dim tensors

* Update outputs for @tooslow tests
2022-03-25 19:27:19 +00:00
Sanchit Gandhi
e231c72906 [FlaxSpeechEncoderDecoder] Fix feature extractor gradient test (#16407) 2022-03-25 17:46:53 +01:00
lewtun
a97f3150c4 Add ONNX support for Blenderbot and BlenderbotSmall (#15875)
* Add ONNX support for Blenderbot

* Add BlenderbotSmall ONNX configuration

* Update serialization table
2022-03-25 17:04:43 +01:00
Sylvain Gugger
b473617d63 Checkpoint sharding (#16343)
* Sharded checkpoint support

* Handle distant sharded checkpoints

* Add tests

* TODO is done

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix docstring

* Add example and format

* Address review comments

* More review comments

* End of merge

* Revert unintentional change

* VsCode what did you do?

* Style

* Changes

* Address final comments

* Quality

* Moar tests

* Move import beneath is_pt_available

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-03-25 11:59:25 -04:00
Sylvain Gugger
cae394c8fa Adapt import to new structure 2022-03-24 14:40:05 -04:00
Yih-Dar
f571dc20ac Update PT Flax equivalence tests in PT test file (#16280)
* update PT/Flax equivalence tests on PT side

* overwrite check_outputs in BigBirdModelTest

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-24 14:45:30 +01:00
Yih-Dar
2a27c80063 Fix BigBirdModelTester (#16310)
* fix

* update the expected value in test_fast_integration

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-24 13:43:52 +01:00
Edward Beeching
aff9bc405a Decision transformer gym (#15845)
* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Add type hints for Pegasus (#16324)

* Funnel type hints (#16323)

* add pt funnel type hints

* add tf funnel type hints

* Add type hints for ProphetNet PyTorch (#16272)

* [GLPN] Improve docs (#16331)

* Add link to notebook

* Add link

* Fix bug

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Added type hints for Pytorch Marian calls (#16200)

* Added type hinting for forward functions in pytorch marian

* typo correction

* Removed type hints on functions from BART per Suraj Patil request

* fix import pb

* fix typo

* corrected tuple call

* ran black

* after fix-copies
Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List

* Fixing copies to roformer and pegasus

Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* done (#16340)

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add type annotations for Rembert/Splinter and copies (#16338)

* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`

Co-authored-by: matt <rocketknight1@gmail.com>

* [Bug template] Shift responsibilities for long-range (#16344)

* Fix code repetition in serialization guide (#16346)

* Adopt framework-specific blocks for content (#16342)

*  refactor code samples with framework-specific blocks

*  update training.mdx

* 🖍 apply feedback

* Updates the default branch from master to main (#16326)

* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Updated copies, config auto, and readme files.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com>
Co-authored-by: Adam Montgomerie <adam@avanssion.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-03-23 16:18:43 -04:00
Sylvain Gugger
c595b6e6a9 Make Transformers use cache files when hf.co is down (#16362)
* Make Transformers use cache files when hf.co is down

* Fix tests

* Was there a random circleCI failure?

* Isolate patches

* Style

* Comment out the failure since it doesn't fail anymore

* Better comment
2022-03-23 15:56:49 -04:00
Joao Gante
9e8c37dc82 TF - Fix interchangeable past/past_key_values and revert output variable name in GPT2 (#16332)
* revert tf gpt2

* add test for unpack_inputs and fix test case

* add changes to vision encoder decoder
2022-03-23 18:41:18 +00:00
Sylvain Gugger
4975002df5 Reorganize file utils (#16264)
* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wront move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit
2022-03-23 10:26:33 -04:00
Lysandre Debut
eca77f4719 Updates the default branch from master to main (#16326)
* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 03:46:59 -04:00
NielsRogge
0c55d47cde Add GLPN (#16199)
* First draft

* Fix logits calculation

* Improve tests

* Add copied from statements

* Fix base_model_prefix

* Improve implementation, upload new models

* Update design

* Fix integration test

* Add model to README and toctree

* Add document image

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add decoder_hidden_size attribute

* Update design of decoder

* Add DepthEstimatorOutput class

* Rename in_index to head_in_index and add feature extractor tests

* Apply suggestions from code review

* Apply suggestions from code review

* Update pretrained model name and add to doc tests

* Remove test.py script

* Update copied from statements and clean up

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-22 08:51:13 +01:00
Yih-Dar
f466936476 Add has_attentions to TFModelTesterMixin as done on PyTorch side (#16259)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-19 11:44:17 +01:00
Yih-Dar
75c666b4a8 Aggressive PT/TF equivalence test on PT side (#16250)
* Aggressive PT/TF equivalence test on PT side

* Ugly fix for `TFTapasForQuestionAnswering`

* apply review suggestions

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-18 18:51:24 +01:00
Yih-Dar
d481b6414d Make Flax pt-flax equivalence test more aggressive (#15841)
* Make test_equivalence_pt_to_flax more aggressive

* Make test_equivalence_flax_to_pt more aggressive

* don't use to_tuple

* clean-up

* fix missing test cases + testing on GPU

* fix conversion

* fix `ValueError: assignment destination is read-only`

* Add type checking

* commit to revert later

* Fix

* fix

* fix device

* better naming

* clean-up

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-18 18:15:36 +01:00
Suraj Patil
b25b92ac4f update jax version and re-enable some tests (#16254) 2022-03-18 16:45:39 +01:00