Commit Graph

628 Commits

Author SHA1 Message Date
Lysandre
b43c78e5d3 Release: v2.11.0 2020-06-02 09:49:09 -04:00
Julien Chaumond
d4c2cb402d Kill model archive maps (#4636)
* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI
2020-06-02 09:39:33 -04:00
Patrick von Platen
47a551d17b [pipeline] Tokenizer should not add special tokens for text generation (#4686)
* allow to not add special tokens

* remove print
2020-06-02 11:03:46 +02:00
Funtowicz Morgan
f6d5046af1 Override get_vocab for fast tokenizer. (#4717) 2020-06-02 11:02:27 +02:00
Sylvain Gugger
7677936316 Make docstring match args (#4711) 2020-06-01 15:22:51 -04:00
Lysandre
6449c494d0 close #4685 2020-06-01 12:57:52 -04:00
Julien Chaumond
ec8717d5d8 [config] Ensure that id2label always takes precedence over num_labels 2020-06-01 16:54:55 +02:00
Julien Chaumond
751a1e0890 [config] Ensure that id2label always takes precedence over num_labels
Fixes bug reported in https://github.com/huggingface/transformers/issues/4669

See #3967 for context
2020-06-01 16:25:56 +02:00
Rens
ec62b7d953 Fix onnx export input names order (#4641)
* pass on tokenizer to pipeline

* order input names when convert to onnx

* update style

* remove unused imports

* make ordered inputs list needs to be mutable

* add test custom bert model

* remove unused imports
2020-06-01 16:12:48 +02:00
Patrick von Platen
0866669e75 [EncoderDecoder] Fix initialization and save/load bug (#4680)
* fix bug

* add more tests
2020-05-30 01:25:19 +02:00
Wei Fang
33b7532e69 Fix longformer attention mask type casting when using apex (#4574)
* Fix longformer attention mask casting when using apex

* remove extra type casting
2020-05-29 18:13:30 +02:00
Patrick von Platen
56ee2560be [Longformer] Better handling of global attention mask vs local attention mask (#4672)
* better api

* improve automatic setting of global attention mask

* fix longformer bug

* fix global attention mask in test

* fix global attn mask flatten

* fix slow tests

* update docstring

* update docs and make more robust

* improve attention mask
2020-05-29 17:58:42 +02:00
Simon Böhm
e2230ba77b Fix BERT example code for NSP and Multiple Choice (#3953)
Change the example code to use encode_plus since the token_type_id
wasn't being correctly set.
2020-05-29 11:55:55 -04:00
Zhangyx
3a5d1ea2a5 Fix two bugs: 1. Index of test data of SST-2. 2. Label index of MNLI data. (#4546) 2020-05-29 11:12:24 -04:00
Patrick von Platen
9c17256447 [Longformer] Multiple choice for longformer (#4645)
* add multiple choice for longformer

* add models to docs

* adapt docstring

* add test to longformer

* add longformer for mc in init and modeling auto

* fix tests
2020-05-29 13:46:08 +02:00
Iz Beltagy
91487cbb8e [Longformer] fix model name in examples (#4653)
* fix longformer model names in examples

* a better name for the notebook
2020-05-29 13:12:35 +02:00
flozi00
b5015a2a0f gpt2 typo (#4629)
* gpt2 typo

* Add files via upload
2020-05-28 16:44:43 -04:00
Anthony MOI
5e737018e1 Fix add_special_tokens on fast tokenizers (#4531) 2020-05-28 10:54:45 -04:00
Suraj Patil
e444648a30 LongformerForTokenClassification (#4638) 2020-05-28 12:48:18 +02:00
Iz Beltagy
ef03ae874f [Longformer] more models + model cards (#4628)
* adding freeze roberta models

* model cards

* lint
2020-05-28 11:11:05 +02:00
Patrick von Platen
96f57c9ccb [Benchmark] Memory benchmark utils (#4198)
* improve memory benchmarking

* correct typo

* fix current memory

* check torch memory allocated

* better pytorch function

* add total cached gpu memory

* add total gpu required

* improve torch gpu usage

* update memory usage

* finalize memory tracing

* save intermediate benchmark class

* fix conflict

* improve benchmark

* improve benchmark

* finalize

* make style

* improve benchmarking

* correct typo

* make train function more flexible

* fix csv save

* better repr of bytes

* better print

* fix __repr__ bug

* finish plot script

* rename plot file

* delete csv and small improvements

* fix in plot

* fix in plot

* correct usage of timeit

* remove redundant line

* remove redundant line

* fix bug

* add hf parser tests

* add versioning and platform info

* make style

* add gpu information

* ensure backward compatibility

* finish adding all tests

* Update src/transformers/benchmark/benchmark_args.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/benchmark/benchmark_args_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* delete csv files

* fix isort ordering

* add out of memory handling

* add better train memory handling

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-05-27 23:22:16 +02:00
Suraj Patil
ec4cdfdd05 LongformerForSequenceClassification (#4580)
* LongformerForSequenceClassification

* better naming x=>hidden_states, fix typo in doc

* Update src/transformers/modeling_longformer.py

* Update src/transformers/modeling_longformer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-27 22:30:00 +02:00
Lysandre Debut
6a17688021 per_device instead of per_gpu/error thrown when argument unknown (#4618)
* per_device instead of per_gpu/error thrown when argument unknown

* [docs] Restore examples.md symlink

* Correct absolute links so that symlink to the doc works correctly

* Update src/transformers/hf_argparser.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Warning + reorder

* Docs

* Style

* not for squad

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-27 11:36:55 -04:00
Patrick von Platen
003c477129 [GPT2, CTRL] Allow input of input_ids and past of variable length (#4581)
* revert convenience  method

* clean docs a bit
2020-05-26 19:43:58 +02:00
Bram Vanroy
8cc6807e89 Make transformers-cli cross-platform (#4131)
* make transformers-cli cross-platform

Using "scripts" is a useful option in setup.py particularly when you want to get access to non-python scripts. However, in this case we want to have an entry point into some of our own Python scripts. To do this in a concise, cross-platfom way, we can use entry_points.console_scripts. This change is necessary to provide the CLI on different platforms, which "scripts" does not ensure. Usage remains the same, but the "transformers-cli" script has to be moved (be part of the library) and renamed (underscore + extension)

* make style & quality
2020-05-26 10:00:51 -04:00
Patrick von Platen
c589eae2b8 [Longformer For Question Answering] Conversion script, doc, small fixes (#4593)
* add new longformer for question answering model

* add new config as well

* fix links

* fix links part 2
2020-05-26 14:58:47 +02:00
ZhuBaohe
a163c9ca5b [T5] Fix Cross Attention position bias (#4499)
* fix

* fix1
2020-05-26 08:57:24 -04:00
ZhuBaohe
1d69028989 fix (#4410) 2020-05-26 08:51:28 -04:00
Sam Shleifer
b86e42e0ac [ci] fix 3 remaining slow GPU failures (#4584) 2020-05-25 19:20:50 -04:00
Patrick von Platen
3e3e552125 [Reformer] fix reformer num buckets (#4564)
* fix reformer num buckets

* fix

* adapt docs

* set num buckets in config
2020-05-25 16:04:45 -04:00
Elman Mansimov
3dea40b858 fixing tokenization of extra_id symbols in T5Tokenizer. Related to issue 4021 (#4353) 2020-05-25 16:04:30 -04:00
Suraj Patil
5139733623 LongformerTokenizerFast (#4547) 2020-05-25 16:03:55 -04:00
Sho Arora
adab7f8332 Add nn.Module as superclass (#4533) 2020-05-25 15:29:33 -04:00
Suraj Patil
03d8527de0 Longformer for question answering (#4500)
* added LongformerForQuestionAnswering

* add LongformerForQuestionAnswering

* fix import for LongformerForMaskedLM

* add LongformerForQuestionAnswering

* hardcoded sep_token_id

* compute attention_mask if not provided

* combine global_attention_mask with attention_mask when provided

* update example in  docstring

* add assert error messages, better attention combine

* add test for longformerForQuestionAnswering

* typo

* cast gloabl_attention_mask to long

* make style

* Update src/transformers/configuration_longformer.py

* Update src/transformers/configuration_longformer.py

* fix the code quality

* Merge branch 'longformer-for-question-answering' of https://github.com/patil-suraj/transformers into longformer-for-question-answering

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-25 18:43:36 +02:00
Bharat Raghunathan
a34a9896ac DOC: Fix typos in modeling_auto (#4534) 2020-05-23 09:40:59 -04:00
Bijay Gurung
e19b978151 Add Type Hints to modeling_utils.py Closes #3911 (#3948)
* Add Type Hints to modeling_utils.py Closes #3911

Add Type Hints to methods in `modeling_utils.py`

Note: The coverage isn't 100%. Mostly skipped internal methods.

* Reformat according to `black` and `isort`

* Use typing.Iterable instead of Sequence

* Parameterize Iterable by its generic type

* Use typing.Optional when None is the default value

* Adhere to style guideline

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-22 19:10:22 -04:00
Funtowicz Morgan
996f393a86 Warn the user about max_len being on the path to be deprecated. (#4528)
* Warn the user about max_len being on the path to be deprecated.

* Ensure better backward compatibility when max_len is provided to a tokenizer.

* Make sure to override the parameter and not the actual instance value.

* Format & quality
2020-05-22 18:08:30 -04:00
Sam Shleifer
ab44630db2 [Summarization Pipeline]: Fix default tokenizer (#4506)
* Fix pipelines defaults bug

* one liner

* style
2020-05-22 17:49:45 -04:00
Julien Chaumond
2c1ebb8b50 Re-apply #4446 + add packaging dependency
As discussed w/ @lysandrejik

packaging is maintained by PyPA (the Python Packaging Authority), and should be lightweight and stable
2020-05-22 17:29:03 -04:00
Lysandre
e6aeb0d3e8 Style 2020-05-22 17:20:03 -04:00
Anthony MOI
35df911485 Fix convert_token_type_ids_from_sequences for fast tokenizers (#4503) 2020-05-22 12:45:10 -04:00
Lysandre
10d72390c0 Revert #4446 Since it introduces a new dependency
Some checks failed
GitHub-hosted runner / check_code_quality (push) Has been cancelled
2020-05-22 10:49:45 -04:00
Lysandre
e0db6bbd65 Release: v2.10.0 2020-05-22 10:37:44 -04:00
Frankie Liuzzi
bd6e301832 added functionality for electra classification head (#4257)
* added functionality for electra classification head

* unneeded dropout

* Test ELECTRA for sequence classification

* Style

Co-authored-by: Frankie <frankie@frase.io>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-05-22 09:48:21 -04:00
Lysandre
a086527727 Unused Union should not be imported 2020-05-21 09:42:47 -04:00
Lysandre Debut
9d2ce253de TPU hangs when saving optimizer/scheduler (#4467)
* TPU hangs when saving optimizer/scheduler

* Style

* ParallelLoader is not a DataLoader

* Style

* Addressing @julien-c's comments
2020-05-21 09:18:27 -04:00
Zhangyx
49296533ca Adds predict stage for glue tasks, and generate result files which can be submitted to gluebenchmark.com (#4463)
* Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website.

* Use Split enum + always output the label name

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-21 09:17:44 -04:00
Cola
eacea530c1 🚨 Remove warning of deprecation (#4477)
Remove warning of deprecated overload of addcdiv_

Fix #4451
2020-05-20 16:48:29 -04:00
Julien Plu
fa2fbed3e5 Better None gradients handling in TF Trainer (#4469)
* Better None gradients handling

* Apply Style

* Apply Style
2020-05-20 16:46:21 -04:00
Oliver Åstrand
e708bb75bf Correct TF formatting to exclude LayerNorms from weight decay (#4448)
* Exclude LayerNorms from weight decay

* Include both formats of layer norm
2020-05-20 16:45:59 -04:00