Commit Graph

5808 Commits

Author SHA1 Message Date
Sam Shleifer
46509d1c19 [docs] remove sshleifer from issue-template :( (#8418) 2020-11-09 12:51:38 -05:00
Patrick von Platen
9c83b96e62 [Tests] Add Common Test for Training + Fix a couple of bugs (#8415)
* add training tests

* correct longformer

* fix docs

* fix some tests

* fix some more train tests

* remove ipdb

* fix multiple edge case model training

* fix funnel and prophetnet

* clean gpt models

* undo renaming of albert
2020-11-09 18:24:41 +01:00
Sylvain Gugger
52040517b8 Deprecate old data/metrics functions (#8420) 2020-11-09 12:10:09 -05:00
Stas Bekman
d4d1fbfc5a [fsmt convert script] fairseq broke chkpt data - fixing that (#8377)
* fairseq broke chkpt data - fixing that

* style

* support older bpecodes filenames - specifically "code" in iwslt14
2020-11-09 11:57:42 -05:00
Sylvain Gugger
5c766ecb50 Fix typo 2020-11-09 11:50:51 -05:00
Sylvain Gugger
908a28894c Add new token classification example (#8340)
* Add new token classification example

* Remove txt file

* Add test

* With actual testing done

* Less warmup is better

* Update examples/token-classification/run_ner_new.py

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Address review comments

* Fix test

* Make Lysandre happy

* Last touches and rename

* Rename in tests

* Address review comments

* More run_ner -> run_ner_old

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-11-09 11:39:55 -05:00
Sylvain Gugger
c7cb1aa26c Bump tokenizers (#8419) 2020-11-09 11:32:10 -05:00
Stas Bekman
78d706f3ae [fsmt tokenizer] support lowercase tokenizer (#8389)
* support lowercase tokenizer

* fix arg pos
2020-11-09 10:41:39 -05:00
Shashank Gupta
1e2acd0dcf Bug fix for permutation language modelling (#8409) 2020-11-09 10:23:26 -05:00
Philip May
bf8625e70b add evaluate doc - trainer.evaluate returns 'epoch' from training (#8273)
* add evaluate doc

* fix style with utils/style.doc

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-09 09:00:59 -05:00
Sam Shleifer
ebde57acac examples/docs: caveat that PL examples don't work on TPU (#8309) 2020-11-09 08:55:22 -05:00
Julien Plu
76e7a44dee Fix some tooling for windows (#8359)
* Fix some tooling for windows

* Fix conflict

* Trigger CI
2020-11-09 13:50:38 +01:00
dartrevan
507dfb40c3 Update README.md (#8406) 2020-11-09 16:44:43 +08:00
smanjil
7247d0b4ea updating tag for exbert viz (#8408) 2020-11-09 16:43:55 +08:00
Stas Bekman
4ab5617b0b comet_ml temporary fix(#8410) 2020-11-09 16:36:06 +08:00
Sam Shleifer
e6d9cdaafe [s2s/distill] remove run_distiller.sh, fix xsum script (#8412) 2020-11-08 16:57:43 -05:00
Stas Bekman
66582492d3 [s2s test_finetune_trainer] failing multigpu test (#8400) 2020-11-08 16:45:40 -05:00
Stas Bekman
f62755a600 [s2s examples test] fix data path (#8398) 2020-11-08 16:44:18 -05:00
Jonathan Chang
4a53e8e9e4 Fix DataCollatorForWholeWordMask again (#8397) 2020-11-08 09:53:01 -05:00
Manav Rathod
610730998f fixed default labels for QA model (#8399) 2020-11-08 09:08:14 -05:00
Chengxi Guo
0b02489b2c Add gpt2-medium-chinese model card (#8402)
* Create README.md

* Update model_cards/mymusise/gpt2-medium-chinese/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-08 05:00:19 -05:00
Stas Bekman
187554366f fix md table (#8395) 2020-11-08 04:25:14 -05:00
Jonathan Chang
77a257fc21 Fix DataCollatorForWholeWordMask (#8379)
* Fix DataCollatorForWholeWordMask

* Replace all tensorize_batch in data_collator.py
2020-11-07 12:51:56 -05:00
Stas Bekman
517eaf460b [make] rewrite modified_py_files in python to be cross-platform (#8371)
* rewrite modified_py_files in python to be cross-platform

* try a different way to test for variable not being ""

* improve comment
2020-11-07 18:45:16 +01:00
Patrick von Platen
07708793f2 fix encoder outputs (#8368) 2020-11-06 21:03:25 +01:00
Yossi Synett
bc0d26d1de [All Seq2Seq model + CLM models that can be used with EncoderDecoder] Add cross-attention weights to outputs (#8071)
* Output cross-attention with decoder attention output

* Update src/transformers/modeling_bert.py

* add cross-attention for t5 and bart as well

* fix tests

* correct typo in docs

* add sylvains and sams comments

* correct typo

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-06 19:34:48 +01:00
hassoudi
30f2507a07 Update README.md (#8360)
Fix websitr address
2020-11-06 11:45:46 -05:00
Jonathan Chang
5807ba3fa9 Fix typo (#8351) 2020-11-06 11:19:41 -05:00
hassoudi
82146496b6 Update README.md (#8338)
fixes
2020-11-06 06:20:58 -05:00
ktrapeznikov
9e5c4d39ab Create README.md (#8312)
* Create README.md

* Update model_cards/ktrapeznikov/gpt2-medium-topic-news/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 06:19:59 -05:00
hasantanvir79
06ebc37967 Create README.md (#8255)
* Create README.md

Initial commit

* Updated Read me

Updated

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:34:24 -05:00
Karthik Uppuluri
41cd031cf2 Create README.md (#8169) 2020-11-06 03:26:07 -05:00
Karthik Uppuluri
f932ddeff5 Create README.md (#8170) 2020-11-06 03:25:52 -05:00
Karthik Uppuluri
08b92f78fa Create README.md (#8168)
* Create README.md

* Update README.md
2020-11-06 03:25:33 -05:00
Karthik Uppuluri
77d62e78b0 Create README.md (#8167)
* Create README.md

Telugu BERTU Readme file

* Update model_cards/kuppuluri/telugu_bertu/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:24:31 -05:00
Yifan Peng
dd6bfcaefb Create README.md (#8327) 2020-11-06 03:22:52 -05:00
smanjil
ddeecf08e6 german medbert model details (#8266)
* model details

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:21:13 -05:00
Jiaxin Pei
96baaafd34 Create README.md (#8258) 2020-11-06 03:19:12 -05:00
Stefan Schweter
185259c261 [model_cards] Update Italian BERT models and introduce new Italian XXL ELECTRA model 🎉 (#8343) 2020-11-06 03:17:03 -05:00
Manuel Romero
34bbf60bf8 Model card: GPT-2 fine-tuned on CommonGen (#8248) 2020-11-06 03:15:11 -05:00
Manuel Romero
973218fd3b Model card: CodeBERT fine-tuned for Insecure Code Detection (#8247)
* Model card: CodeBERT fine-tuned for Insecure Code Detection

* Update model_cards/mrm8488/codebert-base-finetuned-detect-insecure-code/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:13:45 -05:00
Manuel Romero
f833ca418b Model card: T5-base fine-tuned on QuaRel (#8334) 2020-11-06 03:09:55 -05:00
Stas Bekman
9edafaebef [s2s] test_bash_script.py - actually learn something (#8318)
* use decorator

* remove hardcoded paths

* make the test use more data and do real quality tests

* shave off 10 secs

* add --eval_beams 2, reformat

* reduce train size, use smaller custom dataset
2020-11-05 23:15:14 -05:00
Leandro von Werra
17450397a7 Docs bart training ref (#8330)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 17:20:57 -05:00
Stas Bekman
d787935a14 [s2s] test_distributed_eval (#8315)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 16:01:15 -05:00
Sylvain Gugger
04e442d575 Make Trainer evaluation handle dynamic seq_length (#8336)
* Make Trainer evaluation handle dynamic seq_length

* Document behavior.

* Fix test

* Better fix

* Fixes for realsies this time

* Address review comments

* Without forgetting to save...
2020-11-05 15:13:51 -05:00
Guillaume Filion
27b402cab0 Output global_attentions in Longformer models (#7562)
* Output global_attentions in Longformer models

* make style

* small refactoring

* fix tests

* make fix-copies

* add for tf as well

* remove comments in test

* make fix-copies

* make style

* add docs

* make docstring pretty

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-11-05 21:10:43 +01:00
Sam Shleifer
7abc1d96d1 no warn (#8329) 2020-11-05 11:42:24 -05:00
Bobby Donchev
52f44dd6d2 change TokenClassificationTask class methods to static methods (#7902)
* change TokenClassificationTask class methods to static methods

Since we do not require self in the class methods of TokenClassificationTask we should probably switch to static methods. Also, since the class TokenClassificationTask does not contain a constructor it is currently unusable as is. By switching to static methods this fixes the issue of having to document the intent of the broken class.

Also, since the get_labels and read_examples_from_file methods are ought to be implemented. Static method definitions are unchanged even after inheritance, which means that it can be overridden, similar to other class methods.

* Trigger Build

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-11-05 09:38:30 -05:00
Guillem García Subies
77c8f6c627 Corrected typo in readme (#8320) 2020-11-05 07:48:36 -05:00