Sam Shleifer
9e89390ce1
[QOL] add signature for prepare_seq2seq_batch ( #7108 )
2020-09-14 20:33:08 -04:00
Kevin Canwen Xu
90cde2e938
Add Mirror Option for Downloads ( #6679 )
...
* Add Tuna Mirror for Downloads from China
* format fix
* Use preset instead of hardcoding URL
* Fix
* make style
* update the mirror option doc
* update the mirror
2020-09-14 23:50:22 +08:00
Sylvain Gugger
ccc8e30c8a
Clean up autoclass doc ( #7081 )
2020-09-14 09:26:41 -04:00
Stas Bekman
4d39148419
fix deprecation warnings ( #7033 )
...
* fix deprecation warnings
* remove tests/test_tokenization_common.py's test_padding_to_max_length
* revert test_padding_to_max_length
2020-09-14 07:51:19 -04:00
Sam Shleifer
54395d87a6
Update xsum length penalty to better values ( #7107 )
2020-09-13 20:48:47 -04:00
Sam Shleifer
0ec63afec2
fix bug in pegasus converter ( #7094 )
2020-09-13 15:11:47 -04:00
Suraj Patil
0a8c17d53c
[T5Tokenizer] remove prefix_tokens ( #7078 )
2020-09-11 14:18:45 -04:00
Sylvain Gugger
4cbd50e611
Compute loss method ( #7074 )
2020-09-11 12:06:31 -04:00
Sylvain Gugger
ae736163d0
Add tests and fix various bugs in ModelOutput ( #7073 )
...
* Add tests and fix various bugs in ModelOutput
* Update tests/test_model_output.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
2020-09-11 12:01:33 -04:00
Sylvain Gugger
e841b75dec
Automate the lists in auto-xxx docs ( #7061 )
...
* More readable dict
* More nlp -> datasets
* Revert "More nlp -> datasets"
This reverts commit 3cd1883d226c63c4a686fc1fed35f2cd586ebe45.
* Automate the lists in auto-xxx docs
* More readable dict
* Revert "More nlp -> datasets"
This reverts commit 3cd1883d226c63c4a686fc1fed35f2cd586ebe45.
* Automate the lists in auto-xxx docs
* nlp -> datasets
* Fix new key
2020-09-11 10:42:09 -04:00
Patrick von Platen
221d4c63a3
clean naming ( #7068 )
2020-09-11 09:57:53 +02:00
Stas Bekman
8fcbe486e1
these tests require non-multigpu env ( #7059 )
...
* these tests require non-multigpu env
* cleanup
* clarify
2020-09-10 18:52:55 -04:00
Sylvain Gugger
514486739c
Fix CI with change of name of nlp ( #7054 )
...
* nlp -> datasets
* More nlp -> datasets
* Woopsie
* More nlp -> datasets
* One last
2020-09-10 14:51:08 -04:00
Stas Bekman
df4594a9da
[xlm tok] config dict: fix str into int to match definition ( #7034 )
2020-09-10 19:31:01 +02:00
Julien Chaumond
d6c08b07a0
[AutoTokenizer] Correct error message
2020-09-10 17:19:01 +02:00
Ashwin Geet Dsa
66a5a6fda8
fix to ensure that returned tensors after the tokenization is Long ( #7039 )
...
* fix to ensure that returned tensors after the tokenization is Long
* fix to ensure that returned tensors after the tokenization is Long
Co-authored-by: Ashwin Geet Dsa <adsa@grvingt-6.nancy.grid5000.fr >
2020-09-10 11:04:03 -04:00
Sylvain Gugger
15a189049e
Add TF Funnel Transformer ( #7029 )
...
* Add TF Funnel Transformer
* Proper dummy input
* Formatting
* Update src/transformers/modeling_tf_funnel.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Address review comments
* One review comment forgotten
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
2020-09-10 10:41:56 -04:00
Patrick von Platen
7fd1febf38
Add "Leveraging Pretrained Checkpoints for Generation" Seq2Seq models. ( #6594 )
...
* add conversion script
* improve conversion script
* make style
* add tryout files
* fix
* update
* add causal bert
* better names
* add tokenizer file as well
* finish causal_bert
* fix small bugs
* improve generate
* change naming
* renaming
* renaming
* renaming
* remove leftover files
* clean files
* add fix tokenizer
* finalize
* correct slow test
* update docs
* small fixes
* fix link
* adapt check repo
* apply Sam's and Sylvain's recommendations
* fix import
* implement Lysandre's recommendations
* fix logger warn
2020-09-10 16:40:51 +02:00
Yu Liu
762cba3bda
Albert pretrain datasets/ datacollator ( #6168 )
...
* add dataset for albert pretrain
* datacollator for albert pretrain
* naming, comprehension, file reading change
* data cleaning is not needed after this modification
* delete prints
* fix a bug
* file structure change
* add tests for albert datacollator
* remove random seed
* add back len and get item function
* sample file for testing and test code added
* format change for black
* more format change
* Style
* var assignment issue resolve
* add back wrongly deleted DataCollatorWithPadding in init file
* Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr >
2020-09-10 07:56:29 -04:00
Johann C. Rocholl
49e9be0639
Fix confusing warnings during TF2 import from PyTorch ( #6623 )
...
1. Swapped missing_keys and unexpected_keys.
2. Copy&paste error caused these warnings to say "from TF 2.0" when it's actually "from PyTorch".
2020-09-10 05:31:59 -04:00
Stas Bekman
4ee1053dcf
add -y to bypass prompt for transformers-cli upload ( #7035 )
2020-09-10 04:58:29 -04:00
Lysandre Debut
15478c1287
Batch encode plus and overflowing tokens fails when a sequence has no overflowing tokens ( #6677 )
...
* Patch and test
* Fix tests
2020-09-09 06:55:17 -04:00
Henry Dashwood
9fd11bf1a8
replace torch.triu with onnx compatible code ( #6929 )
2020-09-09 04:56:40 -04:00
Julien Chaumond
ed71c21d6a
[from_pretrained] Allow tokenizer_type ≠ model_type ( #6995 )
2020-09-09 04:22:59 -04:00
Stas Bekman
03e363f9ae
[generation] consistently add eos tokens ( #6982 )
...
Currently beam search returns inconsistent outputs - if hypos have different lengths we get eos, if they are the same - we don't.
This PR makes the output consistent.
Also, why not replace:
```
if sent_lengths[i] < max_length:
    decoded[i, sent_lengths[i]] = eos_token_id
```
with:
```
decoded[i, sent_lengths[i]] = eos_token_id
```
Shouldn't eos always be there? If the data gets truncated, the caller needs to use a larger `max_length`.
Please correct me if my logic is flawed.
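The question above can be explored with a minimal sketch. It uses plain Python lists rather than tensors, and `finalize`, its parameters, and the token ids are hypothetical names chosen for illustration, not the library's generation code. Under these assumptions the output row is capped at `max_length`, so a hypothesis that already fills the row leaves no slot for eos and the length check is what prevents an out-of-range write:

```python
def finalize(hypotheses, max_length, eos_token_id, pad_token_id):
    """Pad hypotheses to a common width, appending EOS wherever it fits.

    Illustrative only: names and layout are assumptions, not library code.
    """
    sent_lengths = [min(len(h), max_length) for h in hypotheses]
    # Reserve one extra slot for EOS, but never exceed max_length.
    width = min(max(sent_lengths) + 1, max_length)
    decoded = [[pad_token_id] * width for _ in hypotheses]
    for i, hypo in enumerate(hypotheses):
        decoded[i][: sent_lengths[i]] = hypo[: sent_lengths[i]]
        # Guard: a hypothesis that already fills max_length has no room
        # for EOS; writing unconditionally would index past the row end.
        if sent_lengths[i] < max_length:
            decoded[i][sent_lengths[i]] = eos_token_id
    return decoded
```

In this sketch, hypotheses of equal and of different lengths both receive eos, which is the consistency the PR asks for; only a hypothesis truncated at `max_length` goes without it.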
2020-09-09 04:08:36 -04:00
Stas Bekman
d0963486c1
adding TRANSFORMERS_VERBOSITY env var ( #6961 )
...
* introduce TRANSFORMERS_VERBOSITY env var + test + test helpers
* cleanup
* remove helper function
2020-09-09 04:08:01 -04:00
Patrick von Platen
120176ea29
[Longformer] Fix longformer documentation ( #7016 )
...
* fix longformer
* allow position ids to not be initialized
2020-09-08 18:51:28 +02:00
Lysandre Debut
5c4eb4b1ac
Fixing FLOPS merge by checking if torch is available ( #7013 )
...
* Should check if `torch` is available
* fixed samples_count error, distributed_concat arguments
* style
* Import torch at beginning of file
Co-authored-by: TevenLeScao <teven.lescao@gmail.com >
2020-09-08 10:51:58 -04:00
Teven
01d340adfa
Floating-point operations logging in trainer ( #6768 )
...
* neFLOs calculation, logging, and reloading (#1 )
* testing distributed consecutive batches
* fixed AttributeError from DataParallel
* removed verbosity
* rotate with use_mtime=True
* removed print
* fixed interaction with gradient accumulation
* indent formatting
* distributed neflo counting
* fixed typo
* fixed typo
* mean distributed losses
* exporting log history
* moved a few functions
* floating_point_ops clarification for transformers with parameter-reuse
* code quality
* double import
* made flo estimation more task-agnostic
* only logging flos if computed
* code quality
* unused import
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Sylvain review
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* black
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-09-08 10:00:56 -04:00
Sylvain Gugger
d155b38d6e
Funnel transformer ( #6908 )
...
* Initial model
* Fix upsampling
* Add special cls token id and test
* Formatting
* Test and fix FunnelTokenizerFast
* Common tests
* Fix the check_repo script and document Funnel
* Doc fixes
* Add all models
* Write doc
* Fix test
* Initial model
* Fix upsampling
* Add special cls token id and test
* Formatting
* Test and fix FunnelTokenizerFast
* Common tests
* Fix the check_repo script and document Funnel
* Doc fixes
* Add all models
* Write doc
* Fix test
* Fix copyright
* Forgot some layers can be repeated
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
* Update src/transformers/modeling_funnel.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com >
* Slow integration test
* Make small integration test
* Formatting
* Add checkpoint and separate classification head
* Formatting
* Expand list, fix link and add in pretrained models
* Styling
* Add the model in all summaries
* Typo fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
Co-authored-by: Sam Shleifer <sshleifer@gmail.com >
2020-09-08 08:08:08 -04:00
Stuart Mesham
25afb4ea50
fixed trainer tr_loss memory leak ( #6999 )
...
* fixed trainer tr_loss memory leak
* detached returned training loss from computation graph in the Trainer class' training_step() method
* Revert "fixed trainer tr_loss memory leak"
This reverts commit 47226e4e
2020-09-08 08:07:33 -04:00
Stas Bekman
c18f5916a0
typo ( #7001 )
...
apologies for the tiny PRs, just sending those as I find them.
2020-09-08 01:22:20 -04:00
Jangwon Park
90ec78b514
Add missing arguments for BertWordPieceTokenizer ( #5810 )
2020-09-07 08:35:41 -04:00
Lysandre Debut
77cd0e13d2
Conversion scripts shouldn't have relative imports ( #6991 )
2020-09-07 08:31:06 -04:00
Stas Bekman
848fbe1e35
[gen utils] missing else case ( #6980 )
...
* [gen utils] missing else case
1. `else` is missing - I hit that case while porting a model. Probably needs to assert there?
2. also the comment on top seems to be outdated (just vocab_size is being set there)
* typo
2020-09-07 07:28:06 -04:00
tznurmin
f7e80721eb
Fixed the default number of attention heads in Reformer Configuration ( #6973 )
2020-09-07 12:12:22 +02:00
Stas Bekman
acfaad74ab
[docstring] missing arg ( #6933 )
...
* [docstring] missing arg
add the missing `tie_word_embeddings` entry
* cleanup
* Update src/transformers/configuration_reformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-09-07 05:36:16 -04:00
Stas Bekman
c3317e1f80
typo ( #6959 )
...
there is no var `decoder_input_ids`, but there is `input_ids` for decoder :)
2020-09-07 05:16:24 -04:00
Lysandre Debut
9ef9c39728
Cannot index None ( #6984 )
2020-09-07 04:56:08 -04:00
Sylvain Gugger
08de989a0a
Trainer with grad accum ( #6930 )
...
* Add warning for gradient accumulation
* Formatting
2020-09-07 04:54:00 -04:00
Boris Dayma
995a958dd1
feat: allow prefix for any generative model ( #5885 )
...
* feat: allow padding_text for any generative model
* docs(pipelines.py): correct typo
* Update src/transformers/pipelines.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com >
* feat: rename padding_text to prefix
* fix: cannot tokenize empty text
* fix: pass prefix arg to pipeline
* test: add prefix to text-generation pipeline
* style: fix style
* style: clean code and variable name more explicit
* set arg docstring to optional
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Sam Shleifer <sshleifer@gmail.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-09-07 03:03:45 -04:00
Stas Bekman
48ff6d5109
[doc] remove the implied defaults to :obj:`None`, s/True/:obj:`True`/, etc. ( #6956 )
...
* remove the implied defaults to :obj:`None`
* fix bug in the original
* replace to :obj:`True`, :obj:`False`
2020-09-04 18:22:25 -04:00
Stas Bekman
eff274d629
typo ( #6952 )
2020-09-04 16:14:37 -04:00
Stas Bekman
c5d43a872f
[docstring] misc arg doc corrections ( #6932 )
...
* correct bool types
fix docstring s/int/bool/
* fix description
* fix num_labels to match reality
2020-09-04 10:09:42 -04:00
Yih-Dar
a75e319819
Fix mixed precision issue in TF DistilBert ( #6915 )
...
* Remove hard-coded uses of float32 to fix mixed precision use in TF Distilbert
* fix style
* fix gelu dtype issue in TF Distilbert
* fix numeric overflow while using half precision
2020-09-04 14:29:57 +02:00
krfricke
0f360d3d1c
move wandb/comet logger init to train() to allow parallel logging ( #6850 )
...
* move wandb/comet logger init to train() to allow parallel logging
* Setup wandb/comet loggers on first call to log()
2020-09-03 11:49:14 -04:00
Antonio V Mendoza
ea2c6f1afc
Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models ( #5793 )
...
* added template files for LXMERT and completed the configuration_lxmert.py
* added modeling, tokenization, testing, and finishing touches for lxmert [yet to be tested]
* added model card for lxmert
* cleaning up lxmert code
* Update src/transformers/modeling_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/modeling_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* tested torch lxmert, changed documentation, updated outputs, and other small fixes
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* renaming, other small issues, did not change TF code in this commit
* added lxmert question answering model in pytorch
* added capability to edit number of qa labels for lxmert
* made answer optional for lxmert question answering
* add option to return hidden_states for lxmert
* changed default qa labels for lxmert
* changed config archive path
* squashing 3 commits: merged UI + testing improvements + more UI and testing
* changed some variable names for lxmert
* TF LXMERT
* Various fixes to LXMERT
* Final touches to LXMERT
* AutoTokenizer order
* Add LXMERT to index.rst and README.md
* Merge commit test fixes + Style update
* TensorFlow 2.3.0 sequential model changes variable names
Remove inherited test
* Update src/transformers/modeling_tf_pytorch_utils.py
* Update docs/source/model_doc/lxmert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update docs/source/model_doc/lxmert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* added suggestions
* Fixes
* Final fixes for TF model
* Fix docs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-09-03 04:02:25 -04:00
Sylvain Gugger
8f2723caf0
Output attention takes an s ( #6903 )
...
* Fix output_attention -> output_attentions
* Formatting
* One unsaved file
2020-09-02 08:11:45 -04:00
Yohei Tamura
485da7222f
fix error class instantiation ( #6634 )
2020-09-02 07:36:32 -04:00
Suraj Patil
4230d30f77
[pipelines] Text2TextGenerationPipeline ( #6744 )
...
* add Text2TextGenerationPipeline
* remove max length warning
* remove comments
* remove input_length
* fix typo
* add tests
* use TFAutoModelForSeq2SeqLM
* doc
* typo
* add the doc below TextGenerationPipeline
* doc nit
* style
* delete comment
2020-09-02 07:34:35 -04:00