Nicolas Patry
c2d0ffec8c
Adding a new return_full_text parameter to TextGenerationPipeline. ( #9852 )
...
* Adding a new `return_full_text` parameter to TextGenerationPipeline.
For text generation, the input is sometimes used as prompting text.
In that context, prefixing `generated_text` with the actual input
forces the caller to take an extra step to remove it.
The proposed change adds a new parameter (for backward compatibility),
`return_full_text`, that enables the caller to prevent adding the prefix.
* Doc quality.
2021-01-29 10:27:32 +01:00
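The effect described in the commit above can be illustrated with a minimal sketch. The helper `format_generation` is hypothetical and is not the pipeline's actual code; it only assumes, as the commit message describes, that the prompt is prepended to `generated_text` unless `return_full_text` is false.

```python
def format_generation(prompt: str, continuation: str, return_full_text: bool = True) -> dict:
    """Hypothetical stand-in for the pipeline's output formatting step."""
    # With return_full_text=True (the backward-compatible default in this sketch),
    # the prompt is kept as a prefix; otherwise only the new text is returned.
    text = prompt + continuation if return_full_text else continuation
    return {"generated_text": text}


full = format_generation("Once upon a time", ", there was a model.")
only_new = format_generation("Once upon a time", ", there was a model.", return_full_text=False)
```

With the flag disabled, a caller that uses the input purely as a prompt no longer has to strip it from the result.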
abhishek thakur
bc109ae5b8
pin_memory -> dataloader_pin_memory ( #9874 )
2021-01-28 21:10:46 +01:00
abhishek thakur
80e4184fb0
on_log event should occur *after* the current log is written ( #9872 )
2021-01-28 19:11:04 +01:00
Sylvain Gugger
b4e559cfa1
Deprecate model_path in Trainer.train ( #9854 )
2021-01-28 08:32:46 -05:00
Funtowicz Morgan
2ee9f9b69e
Fix computation of attention_probs when head_mask is provided. ( #9853 )
...
* Fix computation of attention_probs when head_mask is provided.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Apply changes to the template
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-01-28 06:11:52 -05:00
Lysandre Debut
6cb0a6f01a
Partial local tokenizer load ( #9807 )
...
* Allow partial loading of a cached tokenizer
* Warning > Info
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Raise error if not local_files_only
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-28 03:29:12 -05:00
abhishek thakur
25fcb5c171
Pin memory in Trainer by default ( #9857 )
...
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-01-28 08:50:46 +01:00
Stefan Schweter
5ed5a54684
ADD BORT ( #9813 )
...
* tests: add integration tests for new Bort model
* bort: add conversion script from Gluonnlp to Transformers 🚀
* bort: minor cleanup (BORT -> Bort)
* add docs
* make fix-copies
* clean doc a bit
* correct docs
* Update docs/source/model_doc/bort.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/bort.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* correct dialogpt doc
* correct link
* Update docs/source/model_doc/bort.rst
* Update docs/source/model_doc/dialogpt.rst
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-27 21:25:11 +03:00
Stas Bekman
7c6d63298f
[trainer] fix --lr_scheduler_type choices ( #9800 )
...
* fix --lr_scheduler_type choices
* rewrite to fix for all enum-based cl args
* cleanup
* adjust test
* style
* Proposal that should work
* Remove needless code
* Fix test
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-01-27 10:12:15 -05:00
Sylvain Gugger
893120facc
Allow --arg Value for booleans in HfArgumentParser ( #9823 )
...
* Allow --arg Value for booleans in HfArgumentParser
* Update last test
* Better error message
2021-01-27 09:31:42 -05:00
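The `--arg Value` form for booleans mentioned in the commit above can be sketched with the standard library alone. This is an illustration using plain `argparse`, not a copy of `HfArgumentParser`; the helper name `string_to_bool` and the flag `--do_train` are chosen here for the example.

```python
import argparse


def string_to_bool(value):
    """Convert common true/false spellings to a bool (illustrative helper)."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("yes", "true", "t", "y", "1"):
        return True
    if lowered in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Expected a boolean value, got {value!r}.")


def build_parser():
    parser = argparse.ArgumentParser()
    # nargs="?" makes the value optional; const=True is used when the flag is given bare.
    parser.add_argument("--do_train", type=string_to_bool, nargs="?", const=True, default=False)
    return parser


parser = build_parser()
bare = parser.parse_args(["--do_train"]).do_train               # flag without a value
explicit = parser.parse_args(["--do_train", "False"]).do_train  # flag with an explicit value
omitted = parser.parse_args([]).do_train                        # flag not passed at all
```

The design choice in this sketch is that a bare flag and `--do_train True` behave the same, while `--do_train False` lets a script override a default that would otherwise be true.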
Sylvain Gugger
35d55b7b84
When resuming training from checkpoint, Trainer loads model ( #9818 )
...
* When resuming training from checkpoint, Trainer loads model
* Finish cleaning tests
* Address review comment
* Use global_step from state
2021-01-27 09:31:18 -05:00
Kiyoung Kim
20932e5520
Add tpu_zone and gcp_project in training_args_tf.py ( #9825 )
...
* add tpu_zone and gcp_project in training_args_tf.py
* make style
Co-authored-by: kykim <kykim>
2021-01-27 08:45:09 -05:00
Julien Plu
bd701ab1a0
Fix template ( #9840 )
2021-01-27 07:40:30 -05:00
Sylvain Gugger
c7b7bd9963
Add a flag for find_unused_parameters ( #9820 )
...
* Add a flag for find_unused_parameters
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Remove negation
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-01-27 06:18:06 -05:00
Julien Plu
4adbdce5ee
Clean TF Bert ( #9788 )
...
* Start cleaning BERT
* Clean BERT and all the models that depend on it
* Fix attribute name
* Apply style
* Apply Sylvain's comments
* Apply Lysandre's comments
* remove unused import
2021-01-27 11:28:11 +01:00
tomohideshibata
f0329ea516
Delete a needless duplicate condition ( #9826 )
...
Co-authored-by: Tomohide Shibata <tomshiba@yahoo-corp.jp>
2021-01-27 13:15:23 +03:00
Julien Plu
a1720694a5
Remove a TF usage warning and rework the documentation ( #9756 )
...
* Rework documentation
* Update the template
* Trigger CI
* Restore the warning but with the TF logger
* Update convbert doc
2021-01-27 10:45:42 +01:00
Nicolas Patry
285c6262a8
Adding a test to prevent late failure in the Table question answering ( #9808 )
...
pipeline.
- If the table is empty, the line that contains `answer[0]` will fail.
- This PR adds a check to avoid evaluating `answer[0]` in that case.
- Also adds an early check for the presence of `table` and `query` to
prevent late failure and give a better error message.
- Adds a few tests to make sure these errors are correctly raised.
2021-01-27 04:10:53 -05:00
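The early checks described in the commit above can be sketched as follows. Both function names and the error messages are hypothetical and are not taken from the pipeline's source; the sketch only shows the idea of failing fast on missing or empty input, and of guarding the `answer[0]` access.

```python
def validate_table_qa_inputs(table, query):
    """Hypothetical early validation: fail before any model code runs."""
    if table is None:
        raise ValueError("Keyword argument `table` cannot be None.")
    if query is None:
        raise ValueError("Keyword argument `query` cannot be None.")
    if len(table) == 0:
        raise ValueError("`table` is empty.")
    if len(query) == 0:
        raise ValueError("`query` is empty.")


def first_answer(answers):
    """Return the first answer, or None when there are none, instead of indexing blindly."""
    return answers[0] if answers else None


# Valid input passes silently; an empty table raises a clear error up front.
validate_table_qa_inputs({"city": ["Paris"], "country": ["France"]}, "Which country is Paris in?")
try:
    validate_table_qa_inputs({}, "Which country is Paris in?")
    empty_table_error = None
except ValueError as err:
    empty_table_error = str(err)
```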
Patrick von Platen
a46050d0f5
fix typo with mt5 init ( #9830 )
2021-01-27 04:09:56 -05:00
jncasey
f4bf0dea46
Fix auto-resume training from checkpoint ( #9822 )
...
* Fix auto-resume training from checkpoint
* style fixes
2021-01-27 03:48:18 -05:00
Patrick von Platen
d5b40d6693
[Setup.py] update jaxlib ( #9831 )
...
* update jaxlib
* Update setup.py
* update table
2021-01-27 11:34:21 +03:00
abhishek thakur
f617490e71
ConvBERT Model ( #9717 )
...
* finalize convbert
* finalize convbert
* fix
* fix
* fix
* push
* fix
* tf image patches
* fix torch model
* tf tests
* conversion
* everything aligned
* remove print
* tf tests
* fix tf
* make tf tests pass
* everything works
* fix init
* fix
* special treatment for sepconv1d
* style
* 🙏🏽
* add doc and cleanup
* add electra test again
* fix doc
* fix doc again
* fix doc again
* Update src/transformers/modeling_tf_pytorch_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/conv_bert/configuration_conv_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/conv_bert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/conv_bert/configuration_conv_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* conv_bert -> convbert
* more fixes from review
* add conversion script
* dont use pretrained embed
* unused config
* suggestions from julien
* some more fixes
* p -> param
* fix copyright
* fix doc
* Update src/transformers/models/convbert/configuration_convbert.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* comments from reviews
* fix-copies
* fix style
* revert shape_list
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-27 03:20:09 -05:00
Patrick von Platen
e575e06287
fix led not defined ( #9828 )
2021-01-27 10:43:14 +03:00
Tristan Deleu
eba418ac5d
Commit the last step on world_process_zero in WandbCallback ( #9805 )
...
* Commit the last step on world_process_zero in WandbCallback
* Use the environment variable WANDB_LOG_MODEL as a default value in WandbCallback
2021-01-26 13:21:26 -05:00
Derrick Blakely
8edc98bb70
Allow RAG to output decoder cross-attentions ( #9789 )
...
* get cross attns
* add cross-attns doc strings
* fix typo
* line length
* Apply suggestions from code review
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
2021-01-26 20:32:46 +03:00
Michael Glass
c37dcff764
Fixed parameter name for logits_processor ( #9790 )
2021-01-26 18:44:02 +03:00
Sylvain Gugger
0d0efd3a0e
Smdistributed trainer ( #9798 )
...
* Add a debug print
* Adapt Trainer to use smdistributed if available
* Forgotten parenthesis
* Real check for sagemaker
* Don't forget to define device...
* Woopsie, local_rank is defined differently
* Update since local_rank has the proper value
* Remove debug statement
* More robust check for smdistributed
* Quality
* Deal with key not present error
2021-01-26 10:28:21 -05:00
Nicolas Patry
781e4b1384
Adding skip_special_tokens=True to FillMaskPipeline ( #9783 )
...
* We most likely don't want special tokens in this output.
* Adding `skip_special_tokens=True` to FillMaskPipeline
- It's backward incompatible.
- It makes more sense for pipelines to remove references to
special_tokens (all of the other pipelines do that).
- Keeping special tokens makes it hard for users to actually remove them
because all models have different tokens (<s>, <cls>, [CLS], ....)
* Fixing `token_str` in the same vein, and actually fix the tests too !
2021-01-26 10:06:28 +01:00
Daniel Stancl
1867d9a8d7
Add head_mask/decoder_head_mask for TF BART models ( #9639 )
...
* Add head_mask/decoder_head_mask for TF BART models
* Add head_mask and decoder_head_mask input arguments for TF BART-based
models as a TF counterpart to the PR #9569
* Add test_headmasking functionality to tests/test_modeling_tf_common.py
* TODO: Add a test to verify that we can get a gradient back for
importance score computation
* Remove redundant #TODO note
Remove redundant #TODO note from tests/test_modeling_tf_common.py
* Fix assertions
* Make style
* Fix ...Model input args and adjust one new test
* Add back head_mask and decoder_head_mask to BART-based ...Model
after the last commit
* Remove head_mask and decoder_head_mask from input_dict
in TF test_train_pipeline_custom_model as these two have a different
shape than other input args (Necessary for passing this test)
* Revert adding global_rng in test_modeling_tf_common.py
2021-01-26 03:50:00 -05:00
Sylvain Gugger
af41da5097
Fix style
2021-01-25 12:40:58 -05:00
Sylvain Gugger
caf4abf768
Auto-resume training from checkpoint ( #9776 )
...
* Auto-resume training from checkpoint
* Update examples/text-classification/run_glue.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Roll out to other examples
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-25 12:03:51 -05:00
Lysandre Debut
0f443436fb
Actual fix ( #9787 )
2021-01-25 11:12:07 -05:00
Stas Bekman
fac7cfb16a
[fsmt] onnx triu workaround ( #9738 )
...
* onnx triu workaround
* style
* working this time
* add test
* more efficient version
2021-01-25 08:57:37 -05:00
Sorami Hisamoto
626116b7d7
Fix a typo in Trainer.hyperparameter_search docstring ( #9762 )
...
`compute_objectie` => `compute_objective`
2021-01-25 06:40:03 -05:00
Kai Fricke
d63ab61525
Use object store to pass trainer object to Ray Tune ( #9749 )
2021-01-25 05:01:55 -05:00
Maria Janina Sarol
6312fed47d
Fix TFTrainer prediction output ( #9662 )
...
* Fix TFTrainer prediction output
* Update trainer_tf.py
* Fix TFTrainer prediction output
* Fix evaluation_loss update in TFTrainer
* Fix TFTrainer prediction output
2021-01-25 10:27:12 +01:00
Stas Bekman
b7b7e5d049
token_type_ids isn't used ( #9736 )
2021-01-22 20:38:53 -08:00
Sylvain Gugger
82d46febeb
Add report_to training arguments to control the reporting integrations used ( #9735 )
2021-01-22 10:34:34 -05:00
Julien Plu
d7c31abf38
Fix some TF slow tests ( #9728 )
...
* Fix saved model tests + fix a graph issue in longformer
* Apply style
2021-01-22 14:50:46 +01:00
Sylvain Gugger
5f80c15ef5
Fix memory regression in Seq2Seq example ( #9713 )
...
* Fix memory regression in Seq2Seq example
* Fix test and properly deal with -100
* Easier condition with device safety
* Patch for MBartTokenizerFast
2021-01-21 12:05:46 -05:00
Julien Plu
a7dabfb3d1
Fix TF s2s models ( #9478 )
...
* Fix Seq2Seq models for serving
* Apply style
* Fix Longformer
* Fix mBart/Pegasus/Blenderbot
* Apply style
* Add a main intermediate layer
* Apply style
* Remove import
* Apply tf.function to Longformer
* Fix utils check_copy
* Update S2S template
* Fix BART + Blenderbot
* Fix BlenderbotSmall
* Fix BlenderbotSmall
* Fix BlenderbotSmall
* Fix MBart
* Fix Marian
* Fix Pegasus + template
* Apply style
* Fix common attributes test
* Forgot to fix the LED test
* Apply Patrick's comment on LED Decoder
2021-01-21 17:03:29 +01:00
Nicolas Patry
23e5a36ee6
Changing model default for TableQuestionAnsweringPipeline. ( #9729 )
...
* Changing model default for TableQuestionAnsweringPipeline.
- Discussion: https://discuss.huggingface.co/t/table-question-answering-is-not-an-available-task-under-pipeline/3284/6
* Updating slow tests that were out of sync.
2021-01-21 14:31:51 +01:00
Julien Plu
3f290e6c84
Fix mixed precision in TF models ( #9163 )
...
* Fix Gelu precision
* Fix gelu_fast
* Naming
* Fix usage and apply style
* add TF gelu approximate version
* add TF gelu approximate version
* add TF gelu approximate version
* Apply style
* Fix albert
* Remove the usage of the Activation layer
2021-01-21 07:00:11 -05:00
Suraj Patil
248fa1ae72
fix T5 head mask in model_parallel ( #9726 )
...
* fix head mask in model_parallel
* pass correct head mask
2021-01-21 12:16:14 +01:00
Patrick von Platen
ca422e3d7d
finish ( #9721 )
2021-01-21 05:17:13 -05:00
guillaume-be
fb36c273a2
Allow text generation for ProphetNetForCausalLM ( #9707 )
...
* Moved ProphetNetForCausalLM's parent initialization after config update
* Added unit tests for generation for ProphetNetForCausalLM
2021-01-21 11:13:38 +01:00
Muennighoff
6a346f0358
fix typo ( #9708 )
...
* fix typo
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-01-21 13:51:01 +05:30
Stas Bekman
4a20b7c450
[trainer] no --deepspeed and --sharded_ddp together ( #9712 )
...
* no --deepspeed and --sharded_ddp together
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-20 16:50:21 -08:00
Sylvain Gugger
3cd91e8162
Fix WANDB_DISABLED test ( #9703 )
...
* Fix WANDB_DISABLED test
* Remove duplicate import
* Make a test that actually works...
* Fix style
2021-01-20 12:30:24 -05:00
Sylvain Gugger
2a703773aa
Fix style
2021-01-20 12:17:40 -05:00