* Laying down building stone for more flexible ONNX export capabilities
* Ability to provide a map of config key to override before exporting.
* Makes it possible to export BART with/without past keys.
* Supports simple mathematical syntax for OnnxVariable.repeated
* Effectively apply value override from onnx config for model
* Supports export with additional features such as with-past for seq2seq
* Store the output path directly in the args for uniform usage across.
* Make BART_ONNX_CONFIG_* constants and fix imports.
* Support BERT model.
* Use tokenizer for more flexibility in defining the inputs of a model.
* Add TODO as remainder to provide the batch/sequence_length as CLI args
* Enable optimizations to be done on the model.
* Enable GPT2 + past
* Improve model validation with outputs containing nested structures
* Enable Roberta
* Enable Albert
* Albert requires opset >= 12
* BERT-like models requires opset >= 12
* Remove double printing.
* Enable XLM-Roberta
* Enable DistilBERT
* Disable optimization by default
* Fix missing setattr when applying optimizer_features
* Add value field to OnnxVariable to define constant input (not from tokenizers)
* Add T5 support.
* Simplify model type retrieval
* Example exporting token_classification pipeline for DistilBERT.
* Refactoring to package `transformers.onnx`
* Solve circular dependency & __main__
* Remove unnecessary imports in `__init__`
* Licences
* Use @Narsil's suggestion to forward the model's configuration to the ONNXConfig to avoid interpolation.
* Onnx export v2 fixes (#12388)
* Tiny fixes
Remove `convert_pytorch` from onnxruntime-less runtimes
Correct reference to model
* Style
* Fix Copied from
* LongFormer ONNX config.
* Removed optimizations
* Remvoe bad merge relicas.
* Remove unused constants.
* Remove some deleted constants from imports.
* Fix unittest to remove usage of PyTorch model for onnx.utils.
* Fix distilbert export
* Enable ONNX export test for supported model.
* Style.
* Fix lint.
* Enable all supported default models.
* GPT2 only has one output
* Fix bad property name when overriding config.
* Added unittests and docstrings.
* Disable with_past tests for now.
* Enable outputs validation for default export.
* Remove graph opt lvls.
* Last commit with on-going past commented.
* Style.
* Disabled `with_past` for now
* Remove unused imports.
* Remove framework argument
* Remove TFPreTrainedModel reference
* Add documentation
* Add onnxruntime tests to CircleCI
* Add test
* Rename `convert_pytorch` to `export`
* Use OrderedDict for dummy inputs
* WIP Wav2Vec2
* Revert "WIP Wav2Vec2"
This reverts commit f665efb04c92525c3530e589029f0ae7afdf603e.
* Style
* Use OrderedDict for I/O
* Style.
* Specify OrderedDict documentation.
* Style :)
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Squash all commits of modeling_detr_v7 branch into one
* Improve docs
* Fix tests
* Style
* Improve docs some more and fix most tests
* Fix slow tests of ViT, DeiT and DETR
* Improve replacement of batch norm
* Restructure timm backbone forward
* Make DetrForSegmentation support any timm backbone
* Fix name of output
* Address most comments by @LysandreJik
* Give better names for variables
* Conditional imports + timm in setup.py
* Address additional comments by @sgugger
* Make style, add require_timm and require_vision to testsé
* Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone
* Add png files to fixtures
* Fix type hint
* Add timm to workflows
* Add `BatchNorm2d` to the weight initialization
* Fix retain_grad test
* Replace model checkpoints by Facebook namespace
* Fix name of checkpoint in test
* Add user-friendly message when scipy is not available
* Address most comments by @patrickvonplaten
* Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner
* Better initialization
* Scipy is necessary to get sklearn metrics
* Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel
* Make style
* Improve docs and add 2 community notebooks
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* [TokenClassification] Label realignment for subword aggregation
Tentative to replace https://github.com/huggingface/transformers/pull/11622/files
- Added `AggregationStrategy`
- `ignore_subwords` and `grouped_entities` arguments are now fused
into `aggregation_strategy`. It makes more sense anyway because
`ignore_subwords=True` with `grouped_entities=False` did not have a
meaning anyway.
- Added 2 new ways to aggregate which are MAX, and AVERAGE
- AVERAGE requires a bit more information than the others, for now this
case is slightly specific, we should keep that in mind for future
changes.
- Testing has been modified to reflect new argument, and to check the
correct deprecation and the new aggregation_strategy.
- Put the testing argument and testing results for aggregation_strategy,
close together, so that readers can understand what is supposed to
happen.
- `aggregate` is now only tested on a small model as it does not mean
anything to test it globally for all models.
- Previous tests are unchanged in desired output.
- Added a new test case that showcases better the difference between the
FIRST, MAX and AVERAGE strategies.
* Wrong framework.
* Addressing three issues.
1- Tags might not follow B-, I- convention, so any tag should work now
(assumed as B-TAG)
2- Fixed an issue with average that leads to a substantial code change.
3- The testing suite was not checking for the "index" key for "none"
strategy. This is now fixed.
The issue is that "O" could not be chosen by AVERAGE strategy because
those tokens were filtered out beforehand, so their relative scores were
not counted in the average. Now filtering on
ignore_labels will happen at the very end of the pipeline fixing
that issue.
It's a bit hard to make sure this stays like that because we do
not have a end-to-end test for that behavior
* Formatting.
* Adding formatting to code + cleaner handling of B-, I- tags.
Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
* Typo.
Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
* Initial support for upload to hub
* push -> upload
* Fixes + examples
* Fix torchhub test
* Torchhub test I hate you
* push_model_to_hub -> push_to_hub
* Apply mixin to other pretrained models
* Remove ABC inheritance
* Add tests
* Typo
* Run tests
* Install git-lfs
* Change approach
* Add push_to_hub to all
* Staging test suite
* Typo
* Maybe like this?
* More deps
* Cache
* Adapt name
* Quality
* MOAR tests
* Put it in testing_utils
* Docs + torchhub last hope
* Styling
* Wrong method
* Typos
* Update src/transformers/file_utils.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Address review comments
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [WIP] Enabling multilingual models for translation pipelines.
* decoder_input_ids -> forced_bos_token_id
* Improve docstring.
* Rebase
* Fixing 2 bugs
- Type token_ids coming from `_parse_and_tokenize`
- Wrong index from tgt_lang.
* Fixing black version.
* Adding tests for _build_translation_inputs and add them for all
tokenizers.
* Mbart actually puts the lang code at the end.
* Fixing m2m100.
* Adding TF support to `deep_round`.
* Update src/transformers/pipelines/text2text_generation.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Adding one line comment.
* Fixing M2M100 `_build_translation_input_ids`, and fix the call site.
* Fixing tests + deep_round -> nested_simplify
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* update
* make init_deepspeed support config dict
* fix docstring formatting
* clean up trainer's comments
* add new tests
* fix type
* composit argparse doesn't work
* style
* add a new test, rename others
* document new functionality
* complete tests, add docs
* style
* correct level
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add new methods to the doc
* must tell DS we are using a non-native optimizer
* add protection against cpu_offload + HF optimizer combo
* fix the cli overrides
* sync docs + tests
* restore AdamW
* better docs
* need new version
* no longer needed
* remove outdate information
* refactor duplicated code
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Tests run on Docker
Co-authored-by: Morgan <funtowiczmo@gmail.com>
* Comments from code review
* Reply to itself
* Dependencies
Co-authored-by: Morgan <funtowiczmo@gmail.com>
* Add check-ops script
* Finish to implement check_tf_ops and start the test
* Make the test mandatory only for BERT
* Update tf_ops folder
* Remove useless classes
* Add the ONNX test for GPT2 and BART
* Add a onnxruntime slow test + better opset flexibility
* Fix test + apply style
* fix tests
* Switch min opset from 12 to 10
* Update src/transformers/file_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Fix GPT2
* Remove extra shape_list usage
* Fix GPT2
* Address Morgan's comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Don't import libs to check they are available
* Don't import integrations at init
* Add importlib_metdata to deps
* Remove old vars references
* Avoid syntax error
* Adapt testing utils
* Try to appease torchhub
* Add dependency
* Remove more private variables
* Fix typo
* Another typo
* Refine the tf availability test
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Test PyTorch scatter
* Set to slow + minify
* Calm flake8 down
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Add add_pooling_layer argument to TapasModel
Fix comments by @sgugger and @patrickvonplaten
* Fix issue in docs + fix style and quality
* Clean up conversion script and add task parameter to TapasConfig
* Revert the task parameter of TapasConfig
Some minor fixes
* Improve conversion script and add test for absolute position embeddings
* Improve conversion script and add test for absolute position embeddings
* Fix bug with reset_position_index_per_cell arg of the conversion cli
* Add notebooks to the examples directory and fix style and quality
* Apply suggestions from code review
* Move from `nielsr/` to `google/` namespace
* Apply Sylvain's comments
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* initial commit
* [cli] lfs commands
* Fix FileSlice
* Tweak to FileSlice
* [hf_api] Backport filetype arg from `datasets`
cc @lhoestq
* Silm down the CI while i'm working
* Ok let's try this in CI
* Update config.yml
* Do not try this at home
* one more try
* Update lfs.py
* Revert "Tweak to FileSlice"
This reverts commit d7e32c4b3500400486411e85a2b74e57fb6b52f5.
* Update test_hf_api.py
* Update test_hf_api.py
* Update test_hf_api.py
* CI still green?
* make CI green again?
* Update test_hf_api.py
* make CI red again?
* Update test_hf_api.py
* add CI style back
* Fix CI?
* oh my
* doc + switch back to real staging endpoint
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
* Fix docblock + f-strings
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
* [testing utils] get_auto_remove_tmp_dir default change
Now that I have been using `get_auto_remove_tmp_dir default change` for a while, I realized that the defaults aren't most optimal.
99% of the time we want the tmp dir to be empty at the beginning of the test - so changing the default to `before=True` - this shouldn't impact any tests since this feature is used only during debug.
* simplify things
* update docs
* fix doc layout
* style
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* better 3-state doc
* style
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* s/tmp/temporary/ + style
* correct the statement
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add a multi-gpu job for all example tests
* run only ported tests
* rename
* explain why env is re-activated on each step
* mark all unported/checked tests with @require_torch_non_multigpu_but_fix_me
* style
* Apply suggestions from code review
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* make it possible to invoke testconf.py in both test suites without crashing on having the same option added
* perl -pi -e 's|--make_reports|--make-reports|' to be consistent with other opts
* add `pytest --make-reports` to all CIs (and artifacts)
* fix
* first draft
* show design proposition for new generate method
* up
* make better readable
* make first version
* gpt2 tests pass
* make beam search for gpt2 work
* add first encoder-decoder code
* delete typo
* make t5 work
* save indermediate
* make bart work with beam search
* finish beam search bart / t5
* add default kwargs
* make more tests pass
* fix no bad words sampler
* some fixes and tests for all distribution processors
* fix test
* fix rag slow tests
* merge to master
* add nograd to generate
* make all slow tests pass
* speed up generate
* fix edge case bug
* small fix
* correct typo
* add type hints and docstrings
* fix typos in tests
* add beam search tests
* add tests for beam scorer
* fix test rag
* finish beam search tests
* move generation tests in seperate file
* fix generation tests
* more tests
* add aggressive generation tests
* fix tests
* add gpt2 sample test
* add more docstring
* add more docs
* finish doc strings
* apply some more of sylvains and sams comments
* fix some typos
* make fix copies
* apply lysandres and sylvains comments
* final corrections on examples
* small fix for reformer
* move the helper code into testing_utils
* port test_trainer_distributed to work with pytest
* improve docs
* simplify notes
* doc
* doc
* style
* doc
* further improvements
* torch might not be available
* real fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* better reports
* a whole bunch of reports in their own files
* clean up
* improvements
* github artifacts experiment
* style
* complete the report generator with multiple improvements/fixes
* fix
* save all reports under one dir to easy upload
* can remove temp failing tests
* doc fix
* some cleanup
* Important files
* Styling them all
* Revert "Styling them all"
This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
* Syling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy
* WIP refactoring pipeline tests - switching to fast tokenizers
* fix dialog pipeline and fill-mask
* refactoring pipeline tests backbone
* make large tests slow
* fix tests (tf Bart inactive for now)
* fix doc...
* clean up for merge
* fixing tests - remove bart from summarization until there is TF
* fix quality and RAG
* Add new translation pipeline tests - fix JAX tests
* only slow for dialog
* Fixing the missing TF-BART imports in modeling_tf_auto
* spin out pipeline tests in separate CI job
* adding pipeline test to CI YAML
* add slow pipeline tests
* speed up tf and pt join test to avoid redoing all the standalone pt and tf tests
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update src/transformers/pipelines.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/pipelines.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add require_torch and require_tf in is_pt_tf_cross_test
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* WIP flax bert
* Initial commit Bert Jax/Flax implementation.
* Embeddings working and equivalent to PyTorch.
* Move embeddings in its own module BertEmbeddings
* Added jax.jit annotation on forward call
* BertEncoder on par with PyTorch ! :D
* Add BertPooler on par with PyTorch !!
* Working Jax+Flax implementation of BertModel with < 1e-5 differences on the last layer.
* Fix pooled output to take only the first token of the sequence.
* Refactoring to use BertConfig from transformers.
* Renamed FXBertModel to FlaxBertModel
* Model is now initialized in FlaxBertModel constructor and reused.
* WIP JaxPreTrainedModel
* Cleaning up the code of FlaxBertModel
* Added ability to load Flax model saved through save_pretrained()
* Added ability to convert Pytorch Bert model to FlaxBert
* FlaxBert can now load every Pytorch Bert model with on-the-fly conversion
* Fix hardcoded shape values in conversion scripts.
* Improve the way we handle LayerNorm conversion from PyTorch to Flax.
* Added positional embeddings as parameter of BertModel with default to np.arange.
* Let's roll FlaxRoberta !
* Fix missing position_ids parameters on predict for Bert
* Flax backend now supports batched inputs
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Make it possible to load msgpacked model on convert from pytorch in last resort.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Moved save_pretrained to Jax base class along with more constructor parameters.
* Use specialized, model dependent conversion functio.
* Expose `is_flax_available` in file_utils.
* Added unittest for Flax models.
* Added run_tests_flax to the CI.
* Introduce FlaxAutoModel
* Added more unittests
* Flax model reference the _MODEL_ARCHIVE_MAP from PyTorch model.
* Addressing review comments.
* Expose seed in both Bert and Roberta
* Fix typo suggested by @stefan-it
Co-Authored-By: Stefan Schweter <stefan@schweter.it>
* Attempt to make style
* Attempt to make style in tests too
* Added jax & jaxlib to the flax optional dependencies.
* Attempt to fix flake8 warnings ...
* Redo black again and again
* When black and flake8 fight each other for a space ... 💥💥💥
* Try removing trailing comma to make both black and flake happy!
* Fix invalid is_<framework>_available call, thanks @LysandreJik 🎉
* Fix another invalid import in flax_roberta test
* Bump and pin flax release to 0.1.0.
* Make flake8 happy, remove unused jax import
* Change the type of the catch for msgpack.
* Remove unused import.
* Put seed as optional constructor parameter.
* trigger ci again
* Fix too much parameters in BertAttention.
* Formatting.
* Simplify Flax unittests to avoid machine crashes.
* Fix invalid number of arguments when raising issue for an unknown model.
* Address @bastings comment in PR, moving jax.jit decorated outside of __call__
* Fix incorrect path to require_flax/require_pytorch functions.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Attempt to make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct rebasing of circle-ci dependencies
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix import sorting.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Again import sorting...
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Installing missing nlp dependency for flax unittests.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix laoding of model for Flax implementations.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* jit the inner function call to make JAX-compatible
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Format !
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Flake one more time 🎶
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Rewrites BERT in Flax to the new Linen API (#7211)
* Rewrite Flax HuggingFace PR to Linen
* Some fixes
* Fix tests
* Fix CI with change of name of nlp (#7054)
* nlp -> datasets
* More nlp -> datasets
* Woopsie
* More nlp -> datasets
* One last
* Expose `is_flax_available` in file_utils.
* Added run_tests_flax to the CI.
* Attempt to make style
* trigger ci again
* Fix import sorting.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Revert "Rewrites BERT in Flax to the new Linen API (#7211)"
This reverts commit 23703a5eb3364e26a1cbc3ee34b4710d86a674b0.
* Remove jnp.lax references
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Reintroduce Linen changes ...
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use jax native's gelu function.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Renaming BertModel to BertModule to highlight the fact this is the Flax Module object.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Rewrite FlaxAutoModel test to not rely on pretrained_model_archive_map
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove unused variable in BertModule.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove unused variable in BertModule again
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Attempt to have is_flax_available working again.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Introduce JAX TensorType
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Improve ImportError message when trying to convert to various TensorType format.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Makes Flax model jittable.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Ensure flax models are jittable in unittests.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Remove unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Ensure jax imports are guarded behind is_flax_available.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style again
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style again again
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style again again again
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Update src/transformers/file_utils.py
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
* Bump flax to it's latest version
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
* Bump jax version to at least 0.2.0
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Update the unittest to use TensorType.JAX
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* isort import in tests.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Match new flax parameters name "params"
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Add flax models to transformers __init__
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Attempt to address all CI related comments.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct circle.yml indent.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct circle.yml indent (2)
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove coverage from flax tests
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Addressing many naming suggestions from comments
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Simplify for loop logic to interate over layers in FlaxBertLayerCollection
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* use f-string syntax for formatting logs.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use config property from FlaxPreTrainedModel.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* use "cls_token" instead of "first_token" variable name.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* use "hidden_state" instead of "h" variable name.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct class reference in docstring to link to Flax related modules.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Added HF + Google Flax team copyright.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make Roberta independent from Bert
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Move activation functions to flax_utils.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Move activation functions to flax_utils for bert.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Added docstring for BERT
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Update import for Bert and Roberta tokenizers
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* fix-copies
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correct FlaxRobertaLayer to match PyTorch.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use the same store_artifact for flax unittest
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make sure gradient are disabled only locally for flax unittest using torch equivalence.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use relative imports
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* splitting fast and slow tokenizers [WIP]
* [WIP] splitting sentencepiece and tokenizers dependencies
* update dummy objects
* add name_or_path to models and tokenizers
* prefix added to file names
* prefix
* styling + quality
* spliting all the tokenizer files - sorting sentencepiece based ones
* update tokenizer version up to 0.9.0
* remove hard dependency on sentencepiece 🎉
* and removed hard dependency on tokenizers 🎉
* update conversion script
* update missing models
* fixing tests
* move test_tokenization_fast to main tokenization tests - fix bugs
* bump up tokenizers
* fix bert_generation
* update ad fix several tokenizers
* keep sentencepiece in deps for now
* fix funnel and deberta tests
* fix fsmt
* fix marian tests
* fix layoutlm
* fix squeezebert and gpt2
* fix T5 tokenization
* fix xlnet tests
* style
* fix mbart
* bump up tokenizers to 0.9.2
* fix model tests
* fix tf models
* fix seq2seq examples
* fix tests without sentencepiece
* fix slow => fast conversion without sentencepiece
* update auto and bert generation tests
* fix mbart tests
* fix auto and common test without tokenizers
* fix tests without tokenizers
* clean up tests lighten up when tokenizers + sentencepiece are both off
* style quality and tests fixing
* add sentencepiece to doc/examples reqs
* leave sentencepiece on for now
* style quality split hebert and fix pegasus
* WIP Herbert fast
* add sample_text_no_unicode and fix hebert tokenization
* skip FSMT example test for now
* fix style
* fix fsmt in example tests
* update following Lysandre and Sylvain's comments
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>