Commit Graph

5446 Commits

Author SHA1 Message Date
Thomas Wolf
9aeacb58ba Adding Fast tokenizers for SentencePiece based tokenizers - Breaking: remove Transfo-XL fast tokenizer (#7141)
* [WIP] SP tokenizers

* fixing tests for T5

* WIP tokenizers

* serialization

* update T5

* WIP T5 tokenization

* slow to fast conversion script

* Refactoring to move tokenzier implementations inside transformers

* Adding gpt - refactoring - quality

* WIP adding several tokenizers to the fast world

* WIP Roberta - moving implementations

* update to dev4 switch file loading to in-memory loading

* Updating and fixing

* advancing on the tokenizers - updating do_lower_case

* style and quality

* moving forward with tokenizers conversion and tests

* MBart, T5

* dumping the fast version of transformer XL

* Adding to autotokenizers + style/quality

* update init and space_between_special_tokens

* style and quality

* bump up tokenizers version

* add protobuf

* fix pickle Bert JP with Mecab

* fix newly added tokenizers

* style and quality

* fix bert japanese

* fix funnel

* limite tokenizer warning to one occurence

* clean up file

* fix new tokenizers

* fast tokenizers deep tests

* WIP adding all the special fast tests on the new fast tokenizers

* quick fix

* adding more fast tokenizers in the fast tests

* all tokenizers in fast version tested

* Adding BertGenerationFast

* bump up setup.py for CI

* remove BertGenerationFast (too early)

* bump up tokenizers version

* Clean old docstrings

* Typo

* Update following Lysandre comments

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2020-10-08 11:32:16 +02:00
Piero Molino
4d04120c6d Replaced torch.load for loading the pretrained vocab of TransformerXL tokenizer to pickle.load (#6935)
* Replaced torch.load for loading the pretrained vocab of TransformerXL to pickle.load

* Replaced torch.save with pickle.dump when saving the vocabulary

* updating transformer-xl

* uploaded on S3 - compatibility

* fix tests

* style

* Address review comments

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-10-08 10:16:10 +02:00
Sam Shleifer
aba4e22944 [pseudolabels] cleanup markdown table (#7653) 2020-10-07 23:04:18 -04:00
Sam Shleifer
e3e6517355 Fix 3 failing slow bart/blender tests (#7652) 2020-10-07 22:05:03 -04:00
Sam Shleifer
960faaaf28 Blenderbot (#7418)
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-07 19:09:23 -04:00
Blaise Cruz
aee7967fc4 Added model cards for Tagalog BERT models (#7603) 2020-10-07 16:49:20 -04:00
Bobby Donchev
b1c06140f4 Create README.md for IsRoBERTa language model (#7640)
* Create README.md

* Update README.md

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-10-07 16:46:03 -04:00
Keshan
e10d389561 [Model card] SinhalaBERTo model. (#7558)
* [Model card] SinhalaBERTo model.

This is the model card for keshan/SinhalaBERTo model.

* Update model_cards/keshan/SinhalaBERTo/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-10-07 16:40:52 -04:00
Amine Abdaoui
167bce56f2 [model_card] bert-base-5lang-cased (#7573)
Co-authored-by: Amin <amin.geotrend@gmail.com>
2020-10-07 16:38:14 -04:00
Abed khooli
923dd4e5ef Create README.md (#7581) 2020-10-07 16:37:40 -04:00
dartrevan
85ead0fec4 Update README.md (#7590) 2020-10-07 16:37:10 -04:00
Ilias Chalkidis
c6b9c72eac Update README.md (#7629)
Minor changes: Add arxiv link + Layout improvement + fix typos
2020-10-07 16:36:08 -04:00
Abhilash Majumder
048b4bd2c6 Create Model Card For "abhilash1910/french-roberta" Model (#7544) 2020-10-07 16:35:28 -04:00
Julien Chaumond
c2e0d8ac52 [model_card] nikokons/gpt2-greek
by @nikkon3
2020-10-07 16:28:47 -04:00
Sam Shleifer
e2bb9abb6a [s2s] release pseudolabel links and instructions (#7639) 2020-10-07 11:20:44 -04:00
Sylvain Gugger
08ba4b4902 Trainer callbacks (#7596)
* Initial callback proposal

* Finish various callbacks

* Post-rebase conflicts

* Fix tests

* Don't use something that's not set

* Documentation

* Remove unwanted print.

* Document all models can work

* Add tests + small fixes

* Update docs/source/internal/trainer_utils.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Fix TF tests

* Real fix this time

* This one should work

* Fix typo

* Really fix typo

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-07 10:50:21 -04:00
Lysandre Debut
8fa0c956b3 Add GPT2 to sequence classification auto model (#7630) 2020-10-07 05:20:05 -04:00
Gabriele Picco
e084089eb9 Fix tokenizer UnboundLocalError when padding is set to PaddingStrategy.MAX_LENGTH (#7610)
* Fix UnboundLocalError when PaddingStrategy is MAX_LENGTH

* Fix UnboundLocalError for TruncationStrategy
2020-10-06 18:16:00 -04:00
Philipp
adfe6ace88 Fix wrong reference name/filename in docstring (#7616)
Resolves: #7613
2020-10-06 18:02:29 -04:00
Lysandre
f0d20ad328 Fix-copies 2020-10-06 23:44:03 +02:00
Lysandre Debut
5982431814 Add GPT2ForSequenceClassification based on DialogRPT (#7501)
* Add GPT2ForSequenceClassification based on DialogRPT

* Better documentation

* Code quality
2020-10-06 17:31:21 -04:00
Sam Shleifer
500be01c5d [s2s] save first batch to json for debugging purposes (#6810) 2020-10-06 16:11:56 -04:00
Sam Shleifer
2b574e7c60 [bart] fix config.classif_dropout (#7593) 2020-10-06 11:33:51 -04:00
Ahmed Elnaggar
aa6c3c14b4 typo fix (#7611)
It should be T5-3B not T5-3M.
2020-10-06 15:32:52 +02:00
Adrien David-Sivelle
98fb718577 Docker GPU Images: Add NVIDIA/apex to the cuda images with pytorch (#7598)
- Use cuda:10.2 image instead of 10.1 (to address version mismatch
  warning with pytorch)
- Use devel version that is built on the runtime and includes headers
  and development tools (was otherwise failing to build apex)
2020-10-06 15:23:32 +02:00
George Mihaila
4d541f516f fix return dicitonary labels from masked_lm_labels to labels (#7595) 2020-10-06 09:12:04 -04:00
cedspam
8d2c248df7 Update README.md (#7612) 2020-10-06 08:46:55 -04:00
Ilias Chalkidis
1c80b2c604 Create README.md (LEGAL-BERT Model card) (#7607)
* Create README.md

Model description for all LEGAL-BERT models, published as part of  "LEGAL-BERT: The Muppets straight out of Law School". Chalkidis et al., 2018, In Findings of EMNLP 2020

* Update model_cards/nlpaueb/legal-bert-base-uncased/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-10-06 08:46:17 -04:00
Siddharth Jain
eda27f4494 [TF generation] Fix typo (#7582)
* Fixing top_k and min_length assertions, and a typo fix

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-10-06 12:47:16 +02:00
Lysandre Debut
0257992e4a Fix squeezebert docs (#7587)
* Configuration

* Modeling

* Tokenization

* Obliterate the trailing spaces

* From underlines to long underlines
2020-10-06 06:22:04 -04:00
Ahmed Elnaggar
66c72082d0 Add ProtT5-XL-BFD model card (#7606)
* Add ProtT5-XL-BFD model card

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-10-06 12:19:21 +02:00
Stas Bekman
b21a30bdd8 [makefile] check only .py files (#7588)
* check only .py files

* better choice of words
2020-10-06 05:25:21 -04:00
Sam Shleifer
d5d2744aa7 Support T5 Distillation w/hidden state supervision (#7599) 2020-10-05 21:31:48 -04:00
Lysandre Debut
818c294fdd The toggle actually sticks (#7586) 2020-10-05 11:23:57 -04:00
Sylvain Gugger
03835af700 Documentation fixes (#7585) 2020-10-05 11:01:03 -04:00
Julien Plu
9cf7b23b9b Custom TF weights loading (#7422)
* First try

* Fix TF utils

* Handle authorized unexpected keys when loading weights

* Add several more authorized unexpected keys

* Apply style

* Fix test

* Address Patrick's comments.

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply style

* Make return_dict the default behavior and display a warning message

* Revert

* Replace wrong keyword

* Revert code

* Add forgot key

* Fix bug in loading PT models from a TF one.

* Fix sort

* Add a test for custom load weights in BERT

* Apply style

* Remove unused import

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-05 09:58:45 -04:00
Sylvain Gugger
d3adb985d1 Expand test to locate flakiness (#7580) 2020-10-05 09:45:47 -04:00
Sylvain Gugger
b2b7fc7814 Check and update model list in index.rst automatically (#7527)
* Check and update model list in index.rst automatically

* Check and update model list in index.rst automatically

* Adapt template
2020-10-05 09:40:45 -04:00
Sylvain Gugger
ca05c2a47d Fix post_init of some TrainingArguments (#7525) 2020-10-05 09:19:16 -04:00
Sylvain Gugger
3bd3d8b549 Add new dummy PT objects 2020-10-05 09:13:47 -04:00
Sylvain Gugger
28d183c90c Allow soft dependencies in the namespace with ImportErrors at use (#7537)
* PoC on RAG

* Format class name/obj name

* Better name in message

* PoC on one TF model

* Add PyTorch and TF dummy objects + script

* Treat scikit-learn

* Bad copy pastes

* Typo
2020-10-05 09:12:04 -04:00
Joshua H
1a00f46c74 Update Code example according to deprecation of AutoModeWithLMHead (#7555)
'The class `AutoModelWithLMHead` is deprecated and will be removed in a future version. Please use `AutoModelForCausalLM` for causal language models, `AutoModelForMaskedLM` for masked language models and `AutoModelForSeq2SeqLM` for encoder-decoder models.'
I dont know how to change the 'How to use this model directly from the 🤗/transformers library:' part since it is not part of the model-paper
2020-10-05 08:21:21 -04:00
Amine Abdaoui
0d79de7322 docs(pretrained_models): fix num parameters (#7575)
* docs(pretrained_models): fix num parameters

* fix(pretrained_models): correct typo

Co-authored-by: Amin <amin.geotrend@gmail.com>
2020-10-05 07:50:56 -04:00
Malte Pietsch
ba5ea66e30 Fix tokenization in SQuAD for RoBERTa, Longformer, BART (#7387)
* fix squad tokenization for roberta & co

* change to pure type based check

* sort imports
2020-10-05 06:34:13 -04:00
Sylvain Gugger
0270256b27 Allow nested tensors in predicted logits (#7542) 2020-10-05 06:33:15 -04:00
Cola
60de910e60 Add power argument for TF PolynomialDecay (#5732)
* 🚩 Add `power` argument for TF PolynomialDecay

* 🚩 Create default optimizer with power

* 🚩 Add argument to training args

* 🚨 Clean code format

* 🚨 Fix black warning

* 🚨 Fix code format
2020-10-05 05:16:29 -04:00
Lysandre Debut
41c3a3b98e Add Electra unexpected keys (#7569) 2020-10-05 04:49:39 -04:00
Nathan Cooper
071970feb8 [Model card] Java Code Summarizer model (#7568)
* Create README.md

* Update model_cards/ncoop57/bart-base-code-summarizer-java-v0/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-10-05 04:49:17 -04:00
Forrest Iandola
02ef825be2 SqueezeBERT architecture (#7083)
* configuration_squeezebert.py

thin wrapper around bert tokenizer

fix typos

wip sb model code

wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working

set up squeezebert to use BertModelOutput when returning results.

squeezebert documentation

formatting

allow head mask that is an array of [None, ..., None]

docs

docs cont'd

path to vocab

docs and pointers to cloud files (WIP)

line length and indentation

squeezebert model cards

formatting of model cards

untrack modeling_squeezebert_scratchpad.py

update aws paths to vocab and config files

get rid of stub of NSP code, and advise users to pretrain with mlm only

fix rebase issues

redo rebase of modeling_auto.py

fix issues with code formatting

more code format auto-fixes

move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert

tests for squeezebert modeling and tokenization

fix typo

move squeezebert before bert in modeling_auto.py to fix inheritance problem

disable test_head_masking, since squeezebert doesn't yet implement head masking

fix issues exposed by the test_modeling_squeezebert.py

fix an issue exposed by test_tokenization_squeezebert.py

fix issue exposed by test_modeling_squeezebert.py

auto generated code style improvement

issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()

update copyright

resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask

docs

add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli

autogenerated formatting tweaks

integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings

* tiny change to order of imports
2020-10-05 04:25:43 -04:00
Sylvain Gugger
e2c935f561 Cleanup documentation for BART, Marian, MBART and Pegasus (#7523)
* Cleanup documentation for BART, Marian, MBART and Pegasus

* Cleanup documentation for BART, Marian, MBART and Pegasus
2020-10-05 04:22:12 -04:00