Commit Graph

5917 Commits

Author SHA1 Message Date
Amine Abdaoui
0a80959bdd Add cards for all Geotrend models (#8617)
* docs(bert-base-15lang-cased): add model card

* add cards for all Geotrend models

* [model cards] fix language tag for all Geotrend models
2020-11-19 04:47:24 -05:00
cronoik
dcc9c64299 Updated the Extractive Question Answering code snippets (#8636)
* Updated the Extractive Question Answering code snippets

The Extractive Question Answering code snippets no longer work because the models now return task-specific output objects. This commit fixes the PyTorch and TensorFlow examples by adding `.values()` to the model call.

* Update task_summary.rst
2020-11-18 18:56:47 -05:00
Tim Isbister
28d16e7ac5 Update README.md (#8635) 2020-11-18 18:35:23 -05:00
cronoik
b290195ac7 grammar (#8639) 2020-11-18 18:04:25 -05:00
Stas Bekman
d86d57faa3 [s2s] distillation apex breaks return_dict obj (#8631)
* apex breaks return_dict obj

* style
2020-11-18 12:51:29 -08:00
Perez Ogayo
bf3611b2ab Created ModelCard for Hel-ach-en MT model (#8496)
* Updated ModelCard

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-18 14:42:13 -05:00
Yifan Peng
c95b26a719 Create README.md (#8362) 2020-11-18 13:37:14 -05:00
Manuel Romero
fdbbb6c17a Model card: T5-base fine-tuned on QuaRTz (#8369)
* Model card: T5-base fine-tuned on QuaRTz

* Update model_cards/mrm8488/t5-base-finetuned-quartz/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-18 13:34:27 -05:00
Yifan Peng
6e6d24c5d8 Create README.md (#8363) 2020-11-18 13:33:04 -05:00
Divyanshu Kakwani
35fd3d64e3 Add model card for ai4bharat/indic-bert (#8464) 2020-11-18 13:28:49 -05:00
dartrevan
38f01dfe03 Update README.md (#8405)
* Update README.md

* Update README.md
2020-11-18 13:23:08 -05:00
Abhilash Majumder
2d8fbf012a Model Card for abhilash1910/financial_roberta (#8625)
* Model Card for abhilash1910/financial_roberta

* Update model_cards/abhilash1910/financial_roberta/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-18 13:22:28 -05:00
Vishal Singh
26dc6593f3 Update README.md (#8544)
Modified the Model in Action section. The class `AutoModelWithLMHead` is deprecated, so it was changed to `AutoModelForSeq2SeqLM` for encoder-decoder models. Removed the duplicate eos token.
2020-11-18 13:19:32 -05:00
smanjil
6c8fad4f0d replace performance table with markdown (#8565)
* replace performance table with markdown

* Update model_cards/smanjil/German-MedBERT/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-18 13:17:46 -05:00
hhou435
e7f77fc52a model_cards for Chinese Couplet and Poem GPT2 models (#8620) 2020-11-18 13:06:30 -05:00
Sylvain Gugger
a0c62d2493 Fix training from scratch in new scripts (#8623) 2020-11-18 12:15:26 -05:00
Sylvain Gugger
1e62e999e8 Fixes the training resuming with gradient accumulation (#8624) 2020-11-18 12:00:11 -05:00
Patrick von Platen
cdfa56afe0 [Tokenizer Doc] Improve tokenizer summary (#8622)
* improve summary

* small fixes

* cleaned line length

* correct "" formatting

* apply Sylvain's suggestions
2020-11-18 17:14:15 +01:00
Nicola De Cao
2f9d49b389 Adding PrefixConstrainedLogitsProcessor (#8529)
* Adding PrefixConstrainedLogitsProcessor

* fixing RAG and style_doc

* fixing black (v20 instead of v19)

* Improving doc in generation_logits_process.py

* Improving docs and typing in generation_utils.py

* docs improvement

* adding test and fixing doc typo

* fixing doc_len

* isort on test

* fixed test

* improve docstring a bit

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-18 17:06:25 +01:00
Julien Plu
3bc1540070 New TF loading weights (#8490)
* New TF loading weights

* apply style

* Better naming

* Largely comment the loading method

* Apply style

* Address Patrick's comments

* Remove useless line of code

* Update Docstring

* Address Sylvain's and Lysandre's comments

* Simplify the names computation

* Typos
2020-11-18 10:48:31 -05:00
Ratthachat (Jung)
0df91ee4f7 self.self.activation_dropout -> self.activation_dropout (#8611)
(one line typo)
2020-11-18 10:30:29 -05:00
Stas Bekman
cdf1b7ae82 fix to adjust for #8530 changes (#8612) 2020-11-18 10:25:00 -05:00
Stas Bekman
2819da02f7 [s2s] broken test (#8613) 2020-11-18 10:15:53 -05:00
Michał Pogoda
9fa3ed1a7f Fix missing space in multiline warning (#8593)
The multiline string informing about missing PyTorch/TensorFlow was missing a space.
2020-11-18 10:09:26 -05:00
Sylvain Gugger
8fcb6935a1 Fix DataCollatorForLanguageModeling (#8621) 2020-11-18 10:02:50 -05:00
Benjamin Minixhofer
f6fe41c96b Reset loss to zero on logging in Trainer to avoid bfloat16 issues (#8561)
* make tr_loss regular float

* Revert "make tr_loss regular float"

This reverts commit c9d7ccfaf0c4387187b0841694f01ec0ffd5f4ba.

* reset loss at each logging step

* keep track of total loss with _total_loss_scalar

* add remaining tr_loss at the end
2020-11-18 09:58:08 -05:00
cronoik
b592728eff Fixed link to the wrong paper. (#8607) 2020-11-17 19:00:44 -05:00
Sylvain Gugger
0512444ee5 Remove old doc 2020-11-17 17:34:25 -05:00
Caitlin Ostroff
5cf9c79665 Add Harry Potter Model Card (#8605)
* Add Harry Potter Model

* Update model_cards/ceostroff/harry-potter-gpt2-fanfiction/README.md

* Update model_cards/ceostroff/harry-potter-gpt2-fanfiction/README.md

* Update model_cards/ceostroff/harry-potter-gpt2-fanfiction/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-17 16:50:58 -05:00
Sylvain Gugger
dd52804f5f Remove deprecated (#8604)
* Remove old deprecated arguments

* Remove needless imports

* Fix tests

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-17 15:11:29 -05:00
Lysandre Debut
3095ee9dab Tokenizers should be framework agnostic (#8599)
* Tokenizers should be framework agnostic

* Run the slow tests

* Not testing

* Fix documentation

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-17 14:03:03 -05:00
Sylvain Gugger
7f3b41a306 Fix check repo utils (#8600) 2020-11-17 14:01:46 -05:00
Stas Bekman
f0435f5a61 these should run fine on multi-gpu (#8582) 2020-11-17 14:00:41 -05:00
Sylvain Gugger
36a19915ea Fix model templates (#8595)
* First fixes

* Fix imports and add init

* Fix typo

* Move init to final dest

* Fix tokenization import

* More fixes

* Styling
2020-11-17 10:35:38 -05:00
Julien Chaumond
042a6aa777 Tokenizers: ability to load from model subfolder (#8586)
* tiny typo

* Tokenizers: ability to load from model subfolder

* use subfolder for local files as well

* Uniformize model shortcut name => model id

* from s3 => from huggingface.co

Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
2020-11-17 08:58:45 -05:00
Sylvain Gugger
48395d6b8e Fix init for MT5 (#8591) 2020-11-17 08:52:13 -05:00
sgugger
a6cf9ca00b Add __init__ to the models folder 2020-11-17 07:39:37 -05:00
Patrick von Platen
5104223552 [MT5] More docs (#8589)
* add docs

* make style
2020-11-17 12:47:57 +01:00
Patrick von Platen
86822a358b T5 & mT5 (#8552)
* add mt5 and t5v1_1 model

* fix tests

* correct some imports

* add tf model

* finish tf t5

* improve examples

* fix copies

* clean doc
2020-11-17 12:23:09 +01:00
fajri91
9e01f988dd model_card for indolem/indobert-base-uncased (#8579) 2020-11-17 03:36:50 -05:00
Sylvain Gugger
c89bdfbe72 Reorganize repo (#8580)
* Put models in subfolders

* Styling

* Fix imports in tests

* More fixes in test imports

* Sneaky hidden imports

* Fix imports in doc files

* More sneaky imports

* Finish fixing tests

* Fix examples

* Fix path for copies

* More fixes for examples

* Fix dummy files

* More fixes for example

* More model import fixes

* Is this why you're unhappy GitHub?

* Fix imports in convert command
2020-11-16 21:43:42 -05:00
Julien Plu
901507335f Fix mixed precision issue for GPT2 (#8572)
* Fix mixed precision issue for GPT2

* Forgot one cast

* oops

* Forgotten casts
2020-11-16 14:44:19 -05:00
Sylvain Gugger
1073a2bde5 Switch return_dict to True by default. (#8530)
* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests
2020-11-16 11:43:00 -05:00
Sylvain Gugger
0d0a0785fd Update version to v4.0.0-dev (#8568) 2020-11-16 10:21:19 -05:00
LSinev
afb50c663a Fix GPT2DoubleHeadsModel to work with model.generate() (#6601)
* Fix passing token_type_ids during GPT2DoubleHeadsModel.generate() if used, and for GPT2LMHeadModel too

* Update tests to check token_type_ids usage in GPT2 models
2020-11-16 14:35:44 +01:00
Yusuke Mori
04d8136bde Adding the prepare_seq2seq_batch function to ProphetNet (#8515)
* Simply insert T5Tokenizer's prepare_seq2seq_batch

* Update/Add some 'import'

* fix RuntimeError caused by '.view'

* Moves .view related error avoidance from seq2seq_trainer to inside prophetnet

* Update test_tokenization_prophetnet.py

* Format the test code with black

* Re-format the test code

* Update test_tokenization_prophetnet.py

* Add importing require_torch in the test code

* Add importing BatchEncoding in the test code

* Re-format the test code on Colab
2020-11-16 14:18:25 +01:00
Stas Bekman
931b10978e [doc] typo fix (#8535)
* [doc] typo fix

@sgugger

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-16 08:05:30 -05:00
Branden Chan
6db21a06ae Clearer Model Versioning Example (#8562) 2020-11-16 06:59:10 -05:00
Mehrdad Farahani
daaa68451e Readme for Wiki Summary [Persian] bert2bert (#8558) 2020-11-16 05:04:46 -05:00
Mehrdad Farahani
06d468d3f0 Readme for News Headline Generation (bert2bert) (#8557) 2020-11-16 05:04:38 -05:00