Commit Graph

521 Commits

Author SHA1 Message Date
Lysandre
e7cfc1a313 Release: v2.9.0
2020-05-07 14:15:20 -04:00
Julien Chaumond
0ae96ff8a7 BIG Reorganize examples (#4213)
* Created using Colaboratory

* [examples] reorganize files

* remove run_tpu_glue.py as superseded by TPU support in Trainer

* Bugfix: int, not tuple

* move files around
2020-05-07 13:48:44 -04:00
Julien Chaumond
cafa6a9e29 [Trainer] Ability to specify optimizer/scheduler at init
cc @patrickvonplaten @thomwolf
2020-05-07 11:25:26 -04:00
Bram Vanroy
e4fd5e3999 Use with_extension to change the extension (#4203)
As per https://github.com/huggingface/transformers/pull/3934#discussion_r421307659
2020-05-07 11:14:56 -04:00
Lysandre Debut
ebf80e2e70 Tpu trainer (#4146)
* wip

* wip

* a last wip

* Better logging when using TPUs

* Correct argument name

* Tests

* fix

* Metrics in evaluation

* Update src/transformers/training_args.py

* [tpu] Use launcher script instead

* [tpu] lots of tweaks

* Fix formatting

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-07 10:34:04 -04:00
Funtowicz Morgan
026097b9ee Ensure fast tokenizer can construct tensor without pad token if only one sample is provided. (#4201) 2020-05-07 10:02:53 -04:00
Funtowicz Morgan
0a6cbea0a5 Rewritten batch support in pipelines. (#4154)
* Rewritten batch support in pipelines.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fix imports sorting 🔧

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Set pad_to_max_length=True by default on Pipeline.

* Set pad_to_max_length=False for generation pipelines.

Most generation models don't have a padding token.

* Address @joeddav review comment: Uniformized *args.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Address @joeddav review comment: Uniformized *args (second).

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-05-07 09:52:40 -04:00
Patrick von Platen
99d1a69444 fix examples (#4192) 2020-05-07 10:54:48 +02:00
Patrick von Platen
74ffc9ea6b [Reformer] Fix example and error message (#4191)
* fix example reformer

* fix error message and example docstring

* improved error message
2020-05-07 10:50:11 +02:00
Patrick von Platen
96c78396ce fix docstring reformer (#4190) 2020-05-07 10:28:31 +02:00
Patrick von Platen
dca34695d0 Reformer (#3351)
* first copy & paste commit from Bert and Morgan's LSH code

* add easy way to compare to trax original code

* translate most of function

* make trax lsh self attention deterministic with numpy seed + copy paste code

* add same config

* add same config

* make layer init work

* implemented hash_vectors function for lsh attention

* continue reformer translation

* hf LSHSelfAttentionLayer gives same output as trax layer

* refactor code

* refactor code

* refactor code

* refactor

* refactor + add reformer config

* delete bogus file

* split reformer attention layer into two layers

* save intermediate step

* save intermediate step

* make test work

* add complete reformer block layer

* finish reformer layer

* implement causal and self mask

* clean reformer test and refactor code

* fix merge conflicts

* fix merge conflicts

* update init

* fix device for GPU

* fix chunk length init for tests

* include morgans optimization

* improve memory a bit

* improve comment

* factorize num_buckets

* better testing parameters

* make whole model work

* make lm model work

* add t5 copy paste tokenizer

* add chunking feed forward

* clean config

* add improved assert statements

* make tokenizer work

* improve test

* correct typo

* extend config

* add more complex test

* add new axial position embeddings

* add local block attention layer

* clean tests

* refactor

* better testing

* save intermediate progress

* clean test file

* make shorter input length work for model

* allow variable input length

* refactor

* make forward pass for pretrained model work

* add generation possibility

* finish dropout and init

* make style

* refactor

* add first version of RevNet Layers

* make forward pass work and add convert file

* make uploaded model forward pass work

* make uploaded model forward pass work

* refactor code

* add namedtuples and cache buckets

* correct head masks

* refactor

* made reformer more flexible

* make style

* remove set max length

* add attention masks

* fix up tests

* fix lsh attention mask

* make random seed optional for the moment

* improve memory in reformer

* add tests

* make style

* make sure masks work correctly

* detach gradients

* save intermediate

* correct backprop through gather

* make style

* change back num hashes

* rename to labels

* fix rotation shape

* fix detach

* update

* fix trainer

* fix backward dropout

* make reformer more flexible

* fix conflict

* fix

* fix

* add tests for fixed seed in reformer layer

* fix trainer typo

* fix typo in activations

* add fp16 tests

* add fp16 training

* support fp16

* correct gradient bug in reformer

* add fast gelu

* re-add dropout for embedding dropout

* better naming

* better naming

* renaming

* finalize test branch

* finalize tests

* add more tests

* finish tests

* fix

* fix type trainer

* fix fp16 tests

* fix tests

* fix tests

* fix tests

* fix issue with dropout

* fix dropout seeds

* correct random seed on gpu

* finalize random seed for dropout

* finalize random seed for dropout

* remove duplicate line

* correct half precision bug

* make style

* refactor

* refactor

* docstring

* remove sinusoidal position encodings for reformer

* move chunking to modeling_utils

* make style

* clean config

* make style

* fix tests

* fix auto tests

* pretrained models

* fix docstring

* update conversion file

* Update pretrained_models.rst

* fix rst

* fix rst

* update copyright

* fix test path

* fix test path

* fix small issue in test

* include reformer in generation tests

* add docs for axial position encoding

* finish docs

* Update convert_reformer_trax_checkpoint_to_pytorch.py

* remove isort

* include sams comments

* remove wrong comment in utils

* correct typos

* fix typo

* Update reformer.rst

* applied morgans optimization

* make style

* make gpu compatible

* remove bogus file

* big test refactor

* add example for chunking

* fix typo

* add to README
2020-05-07 10:17:01 +02:00
Julien Plu
aad50151f3 TF version of the trainer (#4017)
* First commit to add a TF version of the trainer.

* Make the TF trainer look closer to the PT trainer

* Refactoring common code between the PT and TF trainer into a util file.

* Some bugfix + better similarity with the PT trainer

* Add missing class in transformers init

* Bugfix over prediction + use classification report instead of simple metrics

* Fix name error

* Fix optimization tests + style

* Apply style

* Several bugfixes for multi-gpu training

* Apply style

* Apply style

* Add glue example for the TF trainer

* Several bugfixes + address the reviews

* Fix on the TF training args file

* Add a debug mode

* Bugfix in utils_ner.py when segment_ids is None

* Apply style

* Apply style

* Add TPU strategy

* Fix selection strategy
2020-05-06 12:56:52 -04:00
kumapo
9972562d33 Include ElectraPreTrainedModel into __init__ (#4173) 2020-05-06 12:00:23 -04:00
Patrick von Platen
a638e986f4 fix hard wired pad token id (#4138) 2020-05-06 00:42:34 +02:00
Julien Chaumond
fd2174664c [Trainer] W&B: Enable model watch
See https://github.com/huggingface/transformers/pull/3916
2020-05-05 10:59:23 -04:00
Boris Dayma
818463ee8e Trainer: add logging through Weights & Biases (#3916)
* feat: add logging through Weights & Biases

* feat(wandb): make logging compatible with all scripts

* style(trainer.py): fix formatting

* [Trainer] Tweak wandb integration

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-04 22:42:27 -04:00
jaymody
858b1d1e5a allow an already created tensorboard SummaryWriter to be passed to Trainer 2020-05-04 19:58:24 -04:00
Lorenzo Ampil
6af3306a1d Add decoder specific error message for T5Stack.forward (#4128) 2020-05-03 12:40:08 +02:00
Zhiyu Lin
1cdd2ad2af Fix #2941 (#4109)
* Fix of issue #2941

Reshaped score array to avoid `numpy` ValueError.

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-02 11:20:30 -04:00
Julien Chaumond
27d55125e6 Configs: saner num_labels in configs. (#3967) 2020-05-01 11:28:55 -04:00
Julien Chaumond
b8686174be Merge pull request #3934 from huggingface/examples_args_from_files
[qol] example scripts: parse args from .args file or JSON
2020-04-30 22:40:13 -04:00
Suraj Parmar
8b5e5ebcf9 Continue training args and tqdm in notebooks (#3939)
* Continue training args

* Continue training args

* added explanation

* added explanation

* added explanation

* Fixed tqdm auto

* Update src/transformers/training_args.py

Co-Authored-By: Julien Chaumond <chaumond@gmail.com>

* Update src/transformers/training_args.py

* Update src/transformers/training_args.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-04-30 22:14:08 -04:00
Julien Chaumond
ab90353f1a [cli] {login, upload, s3} display more helpful error messages 2020-04-30 12:51:06 -04:00
Jordan
7f9193ef09 Fixed Style Inconsistency (#3976) 2020-04-30 14:33:09 +02:00
Jared T Nielsen
64070cbb88 Fix TF input docstrings to refer to tf.Tensor rather than torch.FloatTensor. (#4051) 2020-04-30 14:28:56 +02:00
Lysandre Debut
e73595bd64 Remove jitted method so that our models are picklable. (#4050) 2020-04-29 09:53:19 -04:00
Julien Chaumond
6faca88ee0 Align MarianMT with #4030
cc @sshleifer
2020-04-28 20:35:20 -04:00
Julien Chaumond
455c639093 CDN urls (#4030)
* [file_utils] use_cdn + documentation

* Move to cdn. urls for weights

* [urls] Hotfix for bert-base-japanese
2020-04-28 20:27:14 -04:00
Thomas Wolf
8ba4c5885f Allow a more backward compatible behavior of max_len_single_sentence and max_len_sentences_pair (#3994)
* Allow a more backward compatible behavior of max_len_single_sentence and max_len_sentences_pair and

* The style and quality are now top-notch
2020-04-29 01:13:59 +02:00
Sam Shleifer
847e7f3379 MarianMTModel.from_pretrained('Helsinki-NLP/opus-marian-en-de') (#3908)
Co-Authored-By: Stefan Schweter <stefan@schweter.it>
2020-04-28 18:22:37 -04:00
jazzcook15
c7d06b79ae Fix #3954 - GPT2 is not traceable (#3955)
* Update sqrt computation so it can survive a torch.jit.trace

* Update modeling_gpt2.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-04-28 21:18:56 +02:00
Patrick von Platen
9a0a8c1c6f add examples to doc (#4045) 2020-04-28 16:33:23 +02:00
Patrick von Platen
fa49b9afea Clean Encoder-Decoder models with Bart/T5-like API and add generate possibility (#3383)
* change encoder decoder style to bart & t5 style

* make encoder decoder generation dummy work for bert

* make style

* clean init config in encoder decoder

* add tests for encoder decoder models

* refactor and add last tests

* refactor and add last tests

* fix attn masks for bert encoder decoder

* make style

* refactor prepare inputs for Bert

* refactor

* finish encoder decoder

* correct typo

* add docstring to config

* finish

* add tests

* better naming

* make style

* fix flake8

* clean docstring

* make style

* rename
2020-04-28 15:11:09 +02:00
Patrick von Platen
180585741c [Generation] Generation should allow to start with empty prompt (#3993)
* fix empty prompt

* fix length in generation pipeline
2020-04-28 14:33:15 +02:00
sshleifer
41750a6cff Fix typos 2020-04-27 13:25:53 -04:00
Julien Chaumond
97a375484c rm boto3 dependency 2020-04-27 11:17:14 -04:00
Julien Chaumond
c53cc018de [Trainer] Fix _rotate_checkpoints
Close #3920
2020-04-23 23:59:43 +00:00
mneilly-et
77b75d2c78 Fix for #3873 to change type of exponent parameter for torch.pow() call from int to float (#3924) 2020-04-23 14:25:31 -04:00
Jared T Nielsen
a79a9e1241 Fix TFAlbertForSequenceClassification classifier dropout probability. It was set to config.hidden_dropout_prob, but should be config.classifier_dropout_prob. (#3928) 2020-04-23 13:18:16 -04:00
peterandluc
8e093e5981 Remove 50k limits bug 2020-04-23 11:15:09 -04:00
Julien Chaumond
6af5a54c28 [Trainer] reuse constant 2020-04-23 11:02:05 -04:00
Julien Chaumond
7c2a32ff88 [housekeeping] super() 2020-04-23 10:43:22 -04:00
Julien Chaumond
a946b6b51b [housekeeping] Upgrade # type Python 2 syntax
cc @sshleifer
2020-04-23 10:39:24 -04:00
Lorenzo Ampil
f16540fcba Pipeline for Text Generation: GenerationPipeline (#3758)
* Add GenerationPipeline

* Fix parameter names

* Correct __call__ parameters

* Add model type attribute and correct function calls for prepare_input

* Take out trailing commas from init attributes

* Remove unnecessary tokenization line

* Implement support for multiple text inputs

* Apply generation support for multiple input text prompts

* Take out tensor coercion

* Take out batch index

* Add text prompt to return sequence

* Squeeze token tensor before decoding

* Return only a single list of sequences if only one prompt was used

* Correct results variable name

* Add GenerationPipeline to SUPPORTED_TASKS with the alias , initialized w GPT2

* Registered AutoModelWithLMHead for both pt and t

* Update docstring for GenerationPipeline

* Add kwargs parameter to model.generate

* Take out kwargs parameter after all

* Add generation pipeline example in pipeline docstring

* Fix max length by squeezing tokens tensor

* Apply ensure_tensor_on_device to pytorch tensor

* Include generation step in torch.no_grad

* Take out input from prepare_xlm_input and set 'en' as default xlm_language

* Apply framework specific encoding during prepare_input

* Format w make style

* Move GenerationPipeline import to follow proper import sorting

* Take out trailing comma from generation dict

* Apply requested changes

* Change name to TextGenerationPipeline

* Apply TextGenerationPipeline rename to __init__

* Changing alias to

* Set input mapping as input to ensure_tensor_on_device

* Fix assertion placement

* Add test_text_generation

* Add TextGenerationPipeline to PipelineCommonTests

* Take out whitespace

* Format __init__ w black

* Fix __init__ style

* Format __init__

* Add line to end of __init__

* Correct model tokenizer set for test_text_generation

* Ensure to return list of list, not list of string (to pass test)

* Limit test models to only 3 to limit runtime to address circleCI timeout error

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Remove argument docstring, __init__, add additional __call__ arguments, and reformat results to list of dict

* Fix blank result list

* Add TextGenerationPipeline to pipelines.rst

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix typos from adding PADDING_TEXT_TOKEN_LENGTH

* Fix incorrectly moved result list

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

* Update src/transformers/pipelines.py

Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* Add back generation line and make style

* Take out blank whitespace

* Apply new alias, text-generation, to test_pipelines

* Fix text generation alias in test

* Update src/transformers/pipelines.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-04-22 09:37:03 -04:00
Julien Chaumond
dd9d483d03 Trainer (#3800)
* doc

* [tests] Add sample files for a regression task

* [HUGE] Trainer

* Feedback from @sshleifer

* Feedback from @thomwolf + logging tweak

* [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes

* [glue] Use default max_seq_length of 128 like before

* [glue] move DataTrainingArguments around

* [ner] Change interface of InputExample, and align run_{tf,pl}

* Re-align the pl scripts a little bit

* ner

* [ner] Add integration test

* Fix language_modeling with API tweak

* [ci] Tweak loss target

* Don't break console output

* amp.initialize: model must be on right device before

* [multiple-choice] update for Trainer

* Re-align to 827d6d6ef0
2020-04-21 20:11:56 -04:00
Julien Chaumond
d32585a304 Fix Torch.hub + Integration test 2020-04-21 14:13:30 -04:00
Bharat Raghunathan
7d40901ce3 Fix Documentation issue in BertForMaskedLM forward (#3855) 2020-04-21 09:08:20 +02:00
Funtowicz Morgan
2c05b8a56c Remove tqdm logging when using pipelines. (#3833)
Introduce a tqdm_enabled parameter on squad_convert_examples_to_features(), defaulting to True, and set it to False in QA pipelines.
2020-04-20 22:58:52 +02:00
Jared T Nielsen
c79b550dd0 Add qas_id to SquadResult and SquadExample (#3745)
* Add qas_id

* Fix incorrect name in squad.py

* Make output files optional for squad eval
2020-04-20 16:08:57 -04:00
Patrick von Platen
c4158a6314 [Pipelines] Encode to max length of input not max length of tokenizer for batch input (#3857)
* remove max_length = tokenizer.max_length when encoding

* make style
2020-04-20 14:39:16 -04:00