Commit Graph

1995 Commits

Author SHA1 Message Date
Yih-Dar
280db2e39c Fix test_dbmdz_english by updating expected values (#18482)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-05 16:49:54 +02:00
Sylvain Gugger
5cd4032368 Use new huggingface_hub tools for download models (#18438)
* Draft new cached_file

* Initial draft for config and model

* Small fixes

* Fix first batch of tests

* Look in cache when internet is down

* Fix last tests

* Bad black, not fixing all quality errors

* Make diff less

* Implement change for TF and Flax models

* Add tokenizer and feature extractor

* For compatibility with main

* Add utils to move the cache and auto-do it at first use.

* Quality

* Deal with empty commit shas

* Deal with empty etag

* Address review comments
2022-08-05 10:12:40 -04:00
Sylvain Gugger
70fa1a8d26 Fix pipeline tests (#18487)
* Fix pipeline tests

* Make sure all pipelines tests run with init changes
2022-08-05 09:14:51 -04:00
Yih-Dar
893122f666 Add TF prefix to TF-Res test class (#18481)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-05 13:59:55 +02:00
Nicolas Patry
586dcf6b21 Fixing issue where generic model types wouldn't load properly with the pipeline (#18392)
* Adding a better error message when the model is improperly configured

within transformers.

* Update src/transformers/pipelines/__init__.py

* Black version.

* Overriding task aliases so that tokenizer+feature_extractor

values are correct.

* Fixing task aliases by overriding their names early

* X.

* Fixing feature-extraction.

* black again.

* Normalizing `translation` too.

* Fixing last few corner cases.

translation need to use its non normalized name (translation_XX_to_YY,
so that the task_specific_params are correctly overloaded).
This can be removed and cleaned up in a later PR.

`speech-encode-decoder` actually REQUIRES to pass a `tokenizer` manually
so the error needs to be discarded when the `tokenizer` is already
there.

* doc-builder fix.

* Fixing the real issue.

* Removing dead code.

* Do not import the actual config classes.
2022-08-05 08:45:07 +02:00
Yih-Dar
14928921e2 Add TF_MODEL_FOR_SEMANTIC_SEGMENTATION_MAPPING (#18469)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-04 20:41:15 +02:00
NielsRogge
f9a0008d2d Add VideoMAE (#17821)
* First draft

* Add VideoMAEForVideoClassification

* Improve conversion script

* Add VideoMAEForPreTraining

* Add VideoMAEFeatureExtractor

* Improve VideoMAEFeatureExtractor

* Improve docs

* Add first draft of model tests

* Improve VideoMAEForPreTraining

* Fix base_model_prefix

* Make model take pixel_values of shape (B, T, C, H, W)

* Add loss computation of VideoMAEForPreTraining

* Improve tests

* Improve model testsé

* Make all tests pass

* Add VideoMAE to main README

* Add tests for VideoMAEFeatureExtractor

* Add integration test

* Improve conversion script

* Rename patch embedding class

* Remove VideoMAELayer from init

* Update design of patch embeddings

* Improve comments

* Improve conversion script

* Improve conversion script

* Add conversion of pretrained model

* Add loss verification of pretrained model

* Add loss verification of unnormalized targets

* Add integration test for pretraining model

* Apply suggestions from code review

* Fix bug to make feature extractor resize only shorter edge

* Address more comments

* Improve normalization of videos

* Add doc examples

* Move constants to dedicated script

* Remove scripts

* Transfer checkpoints, fix docs

* Update script

* Update image mean and std

* Fix doc tests

* Set return_tensors to NumPy by default

* Revert the previous change

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-04 18:02:55 +02:00
nlpcat
fc1d841b2d change shape to support dynamic batch input in tf.function XLA generate for tf serving (#18372)
* change shape to support dynamic batch input in tf.generate

* add tests

Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>
2022-08-04 11:26:11 +01:00
Sayak Paul
be41eaf55f fix: keras fit tests for segformer tf and minor refactors. (#18412)
* fix: keras fit tests for segformer tf and minor refactors.

* refactor: test_keras_fit to make it simpler using the existing one.

* fix: styling issues.
2022-08-03 16:39:54 +01:00
Daniel Suess
8fb7c908c8 Fix failing tests for XLA generation in TF (#18298)
* Fix failing test_xla_generate_slow tests

* Fix failing speech-to-text xla_generate tests
2022-08-03 09:45:15 -04:00
Gary Miguel
9d7b70bcd7 support ONNX export of XDropout in deberta{,_v2} and sew_d (#17502)
* support ONNX export of XDropout in deberta{,_v2}

* black

* copy to sew_d

* add test

* isort

* use pytest.mark.filterwarnings

* review comments
2022-08-03 06:33:44 -04:00
David
042f420364 Update pipeline word heuristic to work with whitespace in token offsets (#18402)
* Update pipeline word heuristic to work with whitespace in token offsets

This change checks for whitespace in the input string at either the
character preceding the token or in the first character of the token.
This works with tokenizers that return offsets excluding whitespace
between words or with offsets including whitespace.

fixes #18111

starting

* Use smaller model, ensure expected tokenization

* Re-run CI (please squash)
2022-08-02 15:31:01 -04:00
Sylvain Gugger
01db72abd4 Rewrite push_to_hub to use upload_files (#18366)
* Rewrite push_to_hub to use upload_files

* Adapt the doc a bit

* Address review comments and clean doc
2022-08-01 12:07:30 -04:00
Ikuya Yamada
62098b9348 Adding fine-tuning models to LUKE (#18353)
* add LUKE models for downstream tasks

* add new LUKE models to docs

* fix typos

* remove commented lines

* exclude None items from tuple return values
2022-08-01 11:09:47 -04:00
Michael Benayoun
4e2f4a92dd [FX] Symbolic trace for Bloom (#18356)
* Bloom model can now be traced

* Bloom traced model can be torch scripted and serialized

* Bloom can be traced with variable keyword arguments

* Enable XLNet support

* Disable XLNet for now
2022-07-29 16:12:27 +02:00
Sylvain Gugger
986526a0e4 Replace as_target context managers by direct calls (#18325)
* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py

Co-authored-by: amyeroberts <amy@huggingface.co>

* Style

Co-authored-by: amyeroberts <amy@huggingface.co>
2022-07-29 08:09:09 -04:00
Ritik Nandwal
e87ac9d18b Add swin transformer v2 (#17469)
* Add files generated using transformer-cli add-new-model-like command

* Add changes for swinv2 attention and forward method

* Add fixes

* Add modifications for weight conversion and remaining args in swin model

* Add changes for patchmerging

* Add changes for SwinV2selfattention

* Update conversion script

* Add final fixes for the swin_v2 model

* Add changes for conversion script for pretrained window size case

* Add pretrained window size value from config in SwinV2Encoder class

* Make fixup

* Add swinv2 to models_not_in_readme to utils/check_copies.py

* Modify Swinv2v2 to Swin Transformer V2

* Remove copied from, to run make fixup command

* Add updates to swinv2tf from main branch

* Add pretrained_window_size to config, to make tests pass

* Add modified weights from nandwalritik profile for swinv2

* Update model weights from swinv2 from nandwalritik profile

* Add fix for build_pr_documentation CI fix

* Add fixes for weight conversion

* Add change to make input with padding work

* Add fixes for test cases

* Add few changes from swin to swinv2 to pass test cases

* Remove tests for tensorflow as swinv2 for TF is not added yet

* Overide test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet

* Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now.

* Update docs url for swinv2 in README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Undo changes for check_repo

* Update url in readme.md

* Remove overrided function to test pt_tf_model_equivalence

* Remove TF model imports for Swinv2 as its not implemented in this PR

* Add changes for index.mdx

* Add swinv2 papers link,abstract and contributors details

* Rename cpb_mlp to continous_position_bias_mlp

* Add tips for swinv2 model

* Update src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update import order in src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add copyright statements in weights conversion script.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Remove Swinv2 from models_not_in_readme

* Reformat code

* Remove TF implementation file for swinv2

* Update start docstring.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add changes for docstring

* Update orgname for weights to microsoft

* Remove to_2tuple function

* Add copied from statements wherever applicable

* Add copied from to Swinv2ForMaskedImageModelling class

* Reformat code.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add unittest.skip(with reason.) for test_inputs_embeds test case.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add updates for test_modeling_swinv2.py

* Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function

* Add continuous_position_bias_mlp parameter to conversion script

* Add test for testing masked_image_modelling for swinv2

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add suggested changes

* Add copied from to forward methods of Swinv2Stage and Swinv2Encoder

* Add push_to_hub flag to weight conversion script

* Change order or Swinv2DropPath class

* Add id2label mapping for imagenet 21k

* Add updated url for SwinV2 functions and classes used in implementation

* Update input_feature dimensions format, mentioned in comments.

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Add suggested changes for modeling_swin2.py

* Update docs

* Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient.

* Fix indentation.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add changes for making Nit objects in code style

* Add suggested changes

* Add suggested changes for test_modelling_swinv2

* make fix-copies

* Update docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-27 11:14:47 -04:00
Alara Dirik
9caf68a638 Owlvit test fixes (#18303)
* fix owlvit test assertion errors

* fix gpu test error

* remove redundant lines

* fix styling
2022-07-27 17:26:27 +03:00
Mikkel Denker
70e7d1d656 Fixes torch jit tracing for LayoutLMv2 model (re-open) (#18313)
* Fixes torch jit tracing for LayoutLMv2 model.
Pytorch seems to reuse memory for input_shape which caused a mismatch in shapes later in the forward pass.

* Fixed code quality

* avoid unneeded allocation of vector for shape
2022-07-27 06:38:40 -04:00
gilad19
2b09650885 Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) (#17924)
* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* provide classifier only text hidden states

* add test_for_token_classification

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add test_for_token_classification

Co-authored-by: gfuchs <gfuchs@ebay.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-07-26 10:11:32 +02:00
Joao Gante
7e44226fc7 Generate: deprecate default max_length (#18018) 2022-07-23 18:02:03 +01:00
amyeroberts
8e8384663d Update serving code to enable saved_model=True (#18153)
* Add serving_output and serving methods to some vision models

* Add serving outputs for DeiT

* Don't convert hidden states - differing shapes

* Make saveable

* Fix up

* Make swin saveable

* Add in tests

* Fix funnel tests (can't convert to tensor)

* Fix numpy call

* Tidy up a bit

* Add in hidden states - resnet

* Remove numpy

* Fix failing tests - tensor shape and skipping tests

* Remove duplicated function

* PR comments - formatting and var names

* PR comments
Add suggestions made by Joao Gante:
* Use tf.shape instead of shape_list
* Use @tooslow decorator on tests
* Simplify some of the logic

* PR comments
Address Yih-Dar Sheih comments - making tensor names consistent and make types float

* Types consistent with docs; disable test on swin (slow)

* CI trigger

* Change input_features to float32

* Add serving_output for segformer

* Fixup

Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
2022-07-22 18:05:38 +01:00
Joao Gante
1fc4b2a132 TF: use the correct config with (...)EncoderDecoder models (#18097) 2022-07-22 13:31:45 +01:00
Alara Dirik
12d66b4701 Add OWL-ViT model for zero-shot object detection (#17938)
* add owlvit model skeleton

* add class and box predictor heads

* convert modified flax clip to pytorch

* fix box and class predictors

* add OwlViTImageTextEmbedder

* convert class and box head checkpoints

* convert image text embedder checkpoints

* add object detection head

* fix bugs

* update conversion script

* update conversion script

* fix q,v,k,out weight conversion conversion

* add owlvit object detection output

* fix bug in image embedder

* fix bugs in text embedder

* fix positional embeddings

* fix bug in inference mode vision pooling

* update docs, init tokenizer and processor files

* support batch processing

* add OwlViTProcessor

* remove merge conflicts

* readd owlvit imports

* fix bug in OwlViTProcessor imports

* fix bugs in processor

* update docs

* fix bugs in processor

* update owlvit docs

* add OwlViTFeatureExtractor

* style changes, add postprocess method to feature extractor

* add feature extractor and processor tests

* add object detection tests

* update conversion script

* update config paths

* update config paths

* fix configuration paths and bugs

* fix bugs in OwlViT tests

* add import checks to processor

* fix docs and minor issues

* fix docs and minor issues

* fix bugs and issues

* fix bugs and issues

* fix bugs and issues

* fix bugs and issues

* update docs and examples

* fix bugs and issues

* update conversion script, fix positional embeddings

* process 2D input ids, update tests

* fix style and quality issues

* update docs

* update docs and imports

* update OWL-ViT index.md

* fix bug in OwlViT feature ext tests

* fix code examples, return_dict by default

* return_dict by default

* minor fixes, add tests to processor

* small fixes

* add output_attentions arg to main model

* fix bugs

* remove output_hidden_states arg from main model

* update self.config variables

* add option to return last_hidden_states

* fix bug in config variables

* fix copied from statements

* fix small issues and bugs

* fix bugs

* fix bugs, support greyscale images

* run fixup

* update repo name

* merge OwlViTImageTextEmbedder with obj detection head

* fix merge conflict

* fix merge conflict

* make fixup

* fix bugs

* fix bugs

* add additional processor test
2022-07-22 13:35:32 +03:00
Sayak Paul
561b9a8c00 [SegFormer] TensorFlow port (#17910)
* add: segformer utils and img. classification.

* add: segmentation layer.

* feat: working implementation of segformer.

* chore: remove unused variable.

* add test, remaining modifications.

* remove: unnecessary files.

* add: rest of the files.

Co-authored-by: matt <rocketknight1@gmail.com>

* chore: remove ModuleList comment.

* chore: apply make style.

* chore: apply make fixup-copies.

* add  to check_repo.py

* add decode head to IGNORE_NON_TESTED

* chore: run make style.

* chore: PR comments.

* chore: minor changes to model doc.

* tests: reduction across samples.

* add a note on the space.

* sort importats.

* fix: reduction in loss computation.

* chore: align loss function with that of NER.

* chore: correct utils/documentation_tests.txt

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* chore: simplify the interpolation of logits in loss computation.

* chore: return transposed logits when return_dict=False.

* chore: add link to the tf fine-tuning repo.

* address pr comments.

* address niels's comments.

* remove from_pt=True since tf weights are in.

* remove comment from pt model.

* address niels's comments.

Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-07-21 18:22:37 +01:00
Yih-Dar
9edff45362 skip some test_multi_gpu_data_parallel_forward (#18188)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-20 15:54:44 +02:00
Raghavan
dcec4c4387 Adding OPTForSeqClassification class (#18123)
* Adding OPTForSeqClassification class

* Fix import issues

* Add documentation for optforseqclassification

* Remove checkout

* fix failing tests

* fix typo

* Fix code formatting

* Incorporating the PR feedbacks

* Incorporate PR Feedbacks

* Fix failing test and add new test for multi label setup

* Fix formatting issue

* Fix failing tests

* Fix formatting issues

* Fix failing tests

* Fix failing tests

* Fix failing tests

* Fix failing tests

* PR feedback
2022-07-20 10:14:21 +02:00
Joao Gante
ec6cd7633f TF: Add missing cast to GPT-J (#18201)
* Fix TF GPT-J tests

* add try/finally block
2022-07-19 15:58:42 +01:00
Sylvain Gugger
dc9147ff36 Custom pipeline (#18079)
* Initial work

* More work

* Add tests for custom pipelines on the Hub

* Protect import

* Make the test work for TF as well

* Last PyTorch specific bit

* Add documentation

* Style

* Title in toc

* Bad names!

* Update docs/source/en/add_new_pipeline.mdx

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Auto stash before merge of "custom_pipeline" and "origin/custom_pipeline"

* Address review comments

* Address more review comments

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-07-19 12:02:35 +02:00
Patrick von Platen
3bb6356d4d [From pretrained] Allow download from subfolder inside model repo (#18184)
* add first generation tutorial

* [from_pretrained] Allow loading models from subfolders

* remove gen file

* add doc strings

* allow download from subfolder

* add tests

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply comments

* correct doc string

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-19 11:53:53 +02:00
Younes Belkada
6a1b1bf7a6 BLOOM minor fixes small test (#18175)
* minor fixes

- add correct revision
- corrected dosctring for test
- removed a test

* contrib credits

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
2022-07-18 19:18:19 +02:00
Yih-Dar
6561fbcc6e Update TF(Vision)EncoderDecoderModel PT/TF equivalence tests (#18073)
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-18 15:29:14 +02:00
Yih-Dar
cb19c2afdc Fix expected loss values in some (m)T5 tests (#18177)
* fix expected loss values

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-18 15:26:21 +02:00
gcheron
8c14b342aa add ONNX support for LeVit (#18154)
Co-authored-by: Guilhem Chéron <guilhemc@authentifier.com>
2022-07-18 15:17:07 +02:00
Lysandre Debut
c1c79b0655 NLLB tokenizer (#18126)
* NLLB tokenizer

* Apply suggestions from code review - Thanks Stefan!

Co-authored-by: Stefan Schweter <stefan@schweter.it>

* Final touches

* Style :)

* Update docs/source/en/model_doc/nllb.mdx

Co-authored-by: Stefan Schweter <stefan@schweter.it>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* PR reviews

* Auto models

Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-18 08:12:34 -04:00
Nicolas Patry
ccc0897804 Adding support for device_map directly in pipeline(..) function. (#17902)
* Adding support for `device_map` directly in `pipeline(..)` function.

* Updating the docstring.

* Adding a better docstring

* Put back type hints.

* Blacked. (`make fixup` didn't work ??!!)
2022-07-15 15:54:26 +02:00
amyeroberts
8581a798c0 Add TF DeiT implementation (#17806)
* Initial TF DeiT implementation

* Fix copies naming issues

* Fix up + docs

* Properly same main layer

* Name layers properly

* Initial TF DeiT implementation

* Fix copies naming issues

* Fix up + docs

* Properly same main layer

* Name layers properly

* Fixup

* Fix import

* Fix import

* Fix import

* Fix weight loading for tests whilst not on hub

* Add doc tests and remove to_2tuple

* Add back to_2tuple
Removing to_2tuple results in many downstream changes needed because of the copies checks

* Incorporate updates in Improve vision models #17731 PR

* Don't hard code num_channels

* Copy PyTorch DeiT embeddings and remove pytorch operations with mask

* Fix patch embeddings & tidy up

* Update PixelShuffle to move logic into class layer

* Update doc strings - remove PT references

* Use NHWC format in internal layers

* Fix up

* Use linear activation layer

* Remove unused import

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Move dataclass to top of file

* Remove from_pt now weights on hub

* Fixup

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
2022-07-13 18:04:08 +01:00
Wei
7ea6ccc2b3 Enable torchdynamo with torch_tensorrt(fx path) (#17765)
* enable fx2trt

* Update perf_train_gpu_one.mdx

* Update perf_train_gpu_one.mdx

* add lib check

* update

* format

* update

* fix import check

* fix isort

* improve doc

* refactor ctx manager

* fix isort

* black format

* isort fix

* fix format

* update args

* update black

* cleanups

* Update perf_train_gpu_one.mdx

* code refactor

* code refactor to init

* remove redundancy

* isort

* replace self.args with args

Co-authored-by: Stas Bekman <stas@stason.org>
2022-07-13 12:43:28 -04:00
Sylvain Gugger
37aeb5787a Make sharded checkpoints work in offline mode (#18125)
* Make sharded checkpoints work in offline mode

* Add test
2022-07-13 12:43:08 -04:00
Joao Gante
20509ab0e0 TF: unpack_inputs decorator independent from main_input_name (#18110) 2022-07-13 10:43:41 +01:00
Joao Gante
981714efe1 CLI: reenable pt_to_tf test (#18108) 2022-07-12 13:38:05 +01:00
Sijun He
d4ebd4e112 speed up test (#18106) 2022-07-12 04:28:28 -04:00
jianan-gu
b7d8bd378c Enhance IPEX integration in Trainer (#18072)
* enhance ipex import

* refine codes

* refine style

* add link

* style

Co-authored-by: Stas Bekman <stas@stason.org>
2022-07-11 21:34:09 -07:00
Younes Belkada
a462fc9232 Bloom Optimize operations (#17866)
* fix tolerance for a bloom slow test

* enhance alibi padding

- get rid of for loops
- deals better with padded batched input
- avoid useless cpu/gpu communication when creating alibi

Co-authored-by: justheuristic <justheuristic@gmail.com>

* optimize attention mask

* fix scaled softmax limit values

* optimize building alibi tensor

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fix attention_mask shape when it's None

* minor fixes

- fix docstring + arg names

* remove colons in docstring

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* apply suggestion

* remove unsued arg

* refactor a bit

- use [:, None] for consistency

* refactor attention block

Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>

* quick fixes

* first attempt

* refactor attention block and fix all tests except "test_simple_generation"

- added comments to better explain attention block

* remove debug lines and add TODO comment

* change `torch.bmm` to `torch.baddbmm`
- fixes `test_simple_generation`but breaks `test_batch_generation_padd`

* styling

* all tests are passing now
- use `bmm`
- add explanation for `allow_fp16_reduced_precision_reduction`

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* styling

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fix support for accelerate

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove attn softmax in fp32

* refactor comments

* refactor a bit

- remove warning message
- remove print on test

* refer to pytorch t5

* change the slow tests

- do the tests in fp32
- remove some comments
- keep large comments

* update expected output for `test_simple_generation`
- we now test using fp32

* make style + change comments a bit

* fix dtype padd test

Co-authored-by: justheuristic <justheuristic@gmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-11 13:16:13 -04:00
Sylvain Gugger
5ff6f853d7 Mark slow test as such 2022-07-11 12:48:57 -04:00
Sylvain Gugger
6c8017a5c8 Fix image segmentation and object detection pipeline tests (#18100) 2022-07-11 12:41:56 -04:00
Sylvain Gugger
b0520f594c Skip failing tests 2022-07-11 10:16:54 -04:00
Yulv-git
95113d1365 Fix some typos. (#17560)
* Fix some typos.

Signed-off-by: Yulv-git <yulvchi@qq.com>

* Fix typo.

Signed-off-by: Yulv-git <yulvchi@qq.com>

* make fixup.
2022-07-11 05:00:13 -04:00
neverix
8b332a6a16 Make predict() close progress bars after finishing (#17952) (#18078)
* Make Trainer.predict call on_evaluate (#17952)

* Add on_predict

* Small fix

* Small and different fix

* Add tests
2022-07-08 16:44:24 -04:00
Patrick von Platen
2544c1434f [Generate Tests] Make sure no tokens are force-generated (#18053) 2022-07-07 15:08:34 +02:00