HuggingFace_transformer

Author	SHA1	Message	Date
Suraj Patil	a442f87adc	add LongformerTokenizerFast in AutoTokenizer (#6463 )	2020-08-13 12:06:43 -04:00
Lysandre Debut	f7cbc13db7	Test model outputs equivalence (#6445 ) * Test model outputs equivalence * Fix failing tests * From dict to kwargs * DistilBERT * Addressing @sgugger and @patrickvonplaten's comments	2020-08-13 11:59:35 -04:00
Prajjwal Bhargava	54c687e97c	typo fix (#6462 )	2020-08-13 09:36:48 -04:00
Zhu Baohe	9d94aecd51	Fix docs and bad word tokens generation_utils.py (#6387 ) * fix * fix2 * fix3	2020-08-13 13:12:16 +02:00
Joe Davison	bc820476a5	add targets arg to fill-mask pipeline (#6239 ) * add targets arg to fill-mask pipeline * add tests and more error handling * quality * update docstring	2020-08-12 12:48:29 -04:00
Patrick von Platen	0735def8e1	[EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411 ) * add encoder-decoder for roberta * fix headmask * apply Sylvains suggestions * fix typo * Apply suggestions from code review	2020-08-12 18:23:30 +02:00
Sylvain Gugger	d2370e1bd8	Adding PaddingDataCollator (#6442 ) * Data collator with padding * Add type annotation * Support tensors as well * Add comment * Fix for labels wrong shape * Data collator with padding * Add type annotation * Support tensors as well * Add comment * Fix for labels wrong shape * Remove changes rendered unnecessary	2020-08-12 11:32:27 -04:00
Sylvain Gugger	96c3329f19	Fix #6428 (#6437 )	2020-08-12 08:47:30 -04:00
Sylvain Gugger	34fabe1697	Move prediction_loss_only to TrainingArguments (#6426 )	2020-08-12 08:03:45 -04:00
Sylvain Gugger	e9c3031463	Fixes to make life easier with the nlp library (#6423 ) * allow using tokenizer.pad as a collate_fn in pytorch * allow using tokenizer.pad as a collate_fn in pytorch * Add documentation and tests * Make attention mask the right shape * Better test Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-08-12 08:00:56 -04:00
Jared T Nielsen	ac5bcf236e	Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttent… (#4323 ) * Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttention into two separate dropout layers. * Same dropout fixes for PyTorch.	2020-08-12 07:52:42 -04:00
Stas Bekman	ece0903e11	lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361 ) * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * [model_cards] electra-base-turkish-cased-ner (#6350) * for electra-base-turkish-cased-ner * Add metadata Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Temporarily de-activate TPU CI * Update modeling_tf_utils.py (#6372) fix typo: ckeckpoint->checkpoint * the test now works again (#6371) * correct pl link in readme (#6364) * refactor almost identical tests (#6339) * refactor almost identical tests * important to add a clear assert error message * make the assert error even more descriptive than the original bt * Small docfile fixes (#6328) * Patch models (#6326) * TFAlbertFor{TokenClassification, MultipleChoice} * Patch models * BERT and TF BERT info s * Update check_repo * Ci GitHub caching (#6382) * Cache Github Actions CI * Remove useless file * Colab button (#6389) * Add colab button * Add colab link for tutorials * Fix links for open in colab (#6391) * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove dup (leftover from merge) * convert the test into the new refactored format * stick to using the current_step as is, without ++ Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Alexander Measure <ameasure@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-11 17:56:41 -04:00
Sam Shleifer	be1520d3a3	rename prepare_translation_batch -> prepare_seq2seq_batch (#6103 )	2020-08-11 15:57:07 -04:00
Sam Shleifer	66fa8ceaea	PegasusForConditionalGeneration (torch version) (#6340 ) Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>	2020-08-11 14:31:23 -04:00
guillaume-be	404782912a	[Performance improvement] "Bad tokens ids" optimization (#6064 ) * Optimized banned token masking * Avoid duplicate EOS masking if in bad_words_id * Updated mask generation to handle empty banned token list * Addition of unit tests for the updated bad_words_ids masking * Updated timeout handling in `test_postprocess_next_token_scores_large_bad_words_list` unit test * Updated timeout handling in `test_postprocess_next_token_scores_large_bad_words_list` unit test (timeout does not work on Windows) * Moving Marian import to the test context to allow TF only environments to run * Moving imports to torch_available test * Updated operations device and test * Updated operations device and test * Added docstring and comment for in-place scores modification * Moving test to own test_generation_utils, use of lighter models for testing * removed unneded imports in test_modeling_common * revert formatting change for ModelTesterMixin * Updated caching, simplified eos token id test, removed unnecessary @require_torch * formatting compliance	2020-08-11 05:56:40 -04:00
David LaPalomento	87e124c245	Warn if debug requested without TPU fixes (#6308 ) (#6390 ) * Warn if debug requested without TPU fixes (#6308) Check whether a PyTorch compatible TPU is available before attempting to print TPU metrics after training has completed. This way, users who apply `--debug` without reading the documentation aren't suprised by a stacktrace. * Style Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-08-11 05:31:26 -04:00
Junyuan Zheng	cdf1f7edb2	Fix tokenizer saving and loading error (#6026 ) * fix tokenizer saving and loading bugs when adding AddedToken to additional special tokens * Add tokenizer test * Style * Style 2 Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-08-11 04:49:16 -04:00
Stas Bekman	83984a61c6	testing utils: capturing std streams context manager (#6231 ) * testing utils: capturing std streams context manager * style * missing import * add the origin of this code	2020-08-11 03:56:47 -04:00
Pradhy729	b25cec13c5	Feed forward chunking (#6024 ) * Chunked feed forward for Bert This is an initial implementation to test applying feed forward chunking for BERT. Will need additional modifications based on output and benchmark results. * Black and cleanup * Feed forward chunking in BertLayer class. * Isort * add chunking for all models * fix docs * Fix typo Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2020-08-11 03:12:45 -04:00
Patrick von Platen	00bb0b25ed	TF Longformer (#5764 ) * improve names and tests longformer * more and better tests for longformer * add first tf test * finalize tf basic op functions * fix merge * tf shape test passes * narrow down discrepancies * make longformer local attn tf work * correct tf longformer * add first global attn function * add more global longformer func * advance tf longformer * finish global attn * upload big model * finish all tests * correct false any statement * fix common tests * make all tests pass except keras save load * fix some tests * fix torch test import * finish tests * fix test * fix torch tf tests * add docs * finish docs * Update src/transformers/modeling_longformer.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_longformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply Lysandres suggestions * reverse to assert statement because function will fail otherwise * applying sylvains recommendations * Update src/transformers/modeling_longformer.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/modeling_tf_longformer.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-10 23:25:06 +02:00
Patrick von Platen	3425936643	[EncoderDecoderModel] add a `add_cross_attention` boolean to config (#6377 ) * correct encoder decoder model * Apply suggestions from code review * apply sylvains suggestions	2020-08-10 19:46:48 +02:00
Lysandre Debut	b99098abc7	Patch models (#6326 ) * TFAlbertFor{TokenClassification, MultipleChoice} * Patch models * BERT and TF BERT info s * Update check_repo	2020-08-10 10:39:17 -04:00
Alexander Measure	3a556b0fb7	Update modeling_tf_utils.py (#6372 ) fix typo: ckeckpoint->checkpoint	2020-08-10 02:55:11 -04:00
Patrick von Platen	1aec991643	[GPT2] Correct typo in docs (#6352 )	2020-08-08 20:37:29 +02:00
Julien Plu	0e36e51515	Fix the tests for Electra (#6284 ) * Fix the tests for Electra * Apply style	2020-08-07 09:30:57 -04:00
Sylvain Gugger	6ba540b747	Add a script to check all models are tested and documented (#6298 ) * Add a script to check all models are tested and documented * Apply suggestions from code review Co-authored-by: Kevin Canwen Xu <canwenxu@126.com> * Address comments Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>	2020-08-07 09:18:37 -04:00
idoh	3be2d04884	fix consistency CrossEntropyLoss in modeling_bart (#6265 )	2020-08-07 17:44:28 +08:00
Lysandre Debut	0d9328f2ef	Patch GPU failures (#6281 ) * Pin to 1.5.0 * Patch XLM GPU test	2020-08-07 02:58:15 -04:00
Patrick von Platen	118ecfd427	fix for pytorch < 1.6 (#6300 )	2020-08-06 21:14:46 +02:00
Sam Shleifer	2804fff839	[s2s]Use prepare_translation_batch for Marian finetuning (#6293 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-06 14:58:38 -04:00
Teven	2f2aa0c89c	added `n_inner` argument to gpt2 config (#6296 )	2020-08-06 17:47:32 +02:00
Doug Blank	b923871bb7	Adds comet_ml to the list of auto-experiment loggers (#6176 ) * Support for Comet.ml * Need to import comet first * Log this model, not the one in the backprop step * Log args as hyperparameters; use framework to allow fine control * Log hyperparameters with context * Apply black formatting * isort fix integrations * isort fix __init__ * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer_tf.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Address review comments * Style + Quality, remove Tensorboard import test Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-08-06 11:31:30 -04:00
Philip May	d5bc32ce92	Add strip_accents to basic BertTokenizer. (#6280 ) * Add strip_accents to basic tokenizer * Add tests for strip_accents. * fix style with black * Fix strip_accents test * empty commit to trigger CI * Improved strip_accents check * Add code quality with is not False	2020-08-06 18:52:28 +08:00
Sylvain Gugger	c67d1a0259	Tf model outputs (#6247 ) * TF outputs and test on BERT * Albert to DistilBert * All remaining TF models except T5 * Documentation * One file forgotten * TF outputs and test on BERT * Albert to DistilBert * All remaining TF models except T5 * Documentation * One file forgotten * Add new models and fix issues * Quality improvements * Add T5 * A bit of cleanup * Fix for slow tests * Style	2020-08-05 11:34:39 -04:00
Teven	bd0eab351a	Trainer + wandb quality of life logging tweaks (#6241 ) * added `name` argument for wandb logging, also logging model config with trainer arguments * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * added tf, post-review changes Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-05 09:05:52 -04:00
Julien Plu	33966811bd	Add SequenceClassification and MultipleChoice TF models to Electra (#6227 ) * Add SequenceClassification and MultipleChoice TF models to Electra * Apply style * Add summary_proj_to_labels to Electra config * Finally mirroring the PT version of these models * Apply style * Fix Electra test	2020-08-05 09:04:27 -04:00
Zhu Baohe	d89acd07cc	fix (#6257 )	2020-08-05 07:37:57 -04:00
Ninnart Fuengfusin	24c5a6e351	Update optimization.py (#6261 )	2020-08-05 07:34:57 -04:00
Lilian Bordeau	ed6b8f3128	Update to match renamed attributes in fairseq master (#5972 ) * Update to match renamed attributes in fairseq master RobertaModel no longer have model.encoder and args.num_classes attributes as of 5/28/20. * Quality Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-08-05 07:23:55 -04:00
Joe Davison	972535ea74	fix zero shot pipeline docs (#6245 )	2020-08-04 16:37:49 -04:00
Patrick von Platen	6c9ba1d8fc	[Reformer] Make random seed generator available on random seed and not on model device (#6244 ) * improve if else statement random seeds * Apply suggestions from code review * Update src/transformers/modeling_reformer.py	2020-08-04 13:22:43 -04:00
Sam Shleifer	d5b0a0e235	mBART Conversion script (#6230 )	2020-08-04 09:53:51 -04:00
Stas Bekman	268bf34630	typo (#6225 )	2020-08-04 09:31:49 -04:00
Andrés Felipe Cruz	7ea9b2db37	Encoder decoder config docs (#6195 ) * Adding docs for how to load encoder_decoder pretrained model with individual config objects * Adding docs for loading encoder_decoder config from pretrained folder * Fixing W293 blank line contains whitespace * Update src/transformers/modeling_encoder_decoder.py * Update src/transformers/modeling_encoder_decoder.py * Update src/transformers/modeling_encoder_decoder.py * Apply suggestions from code review model file should only show examples for how to load save model * Update src/transformers/configuration_encoder_decoder.py * Update src/transformers/configuration_encoder_decoder.py * fix space Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-08-04 09:23:28 +02:00
Gong Linyuan	b390a5672a	Make the order of additional special tokens deterministic (#5704 ) * Make the order of additional special tokens deterministic regardless of hash seeds * Fix	2020-08-04 02:38:30 -04:00
Kevin Canwen Xu	3c289fb38c	Remove outdated BERT tips (#6217 ) * Remove out-dated BERT tips * Update modeling_outputs.py * Update bert.rst * Update bert.rst	2020-08-04 01:17:56 +08:00
Sylvain Gugger	e4920c92d6	Doc pipelines (#6175 ) * Init work on pipelines doc * Work in progress * Work in progress * Doc pipelines * Rm unwanted default * Apply suggestions from code review Lysandre comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-03 11:44:46 -04:00
Maurice Gonzenbach	06f1692b02	Fix _shift_right function in TFT5PreTrainedModel (#6214 )	2020-08-03 16:21:23 +02:00
Suraj Patil	0b41867357	fix labels (#6213 )	2020-08-03 10:19:35 -04:00
Jay Mody	cedc547e7e	Adds train_batch_size, eval_batch_size, and n_gpu to to_sanitized_dict output for logging. (#5331 ) * Adds train_batch_size, eval_batch_size, and n_gpu to to_sanitized_dict() output * Update wandb config logging to use to_sanitized_dict * removed n_gpu from sanitized dict * fix quality check errors	2020-08-03 09:00:39 -04:00

949 Commits