HuggingFace_transformer

Author	SHA1	Message	Date
Julien Plu	c8d3fa0dfd	Check TF ops for ONNX compliance (#10025 ) * Add check-ops script * Finish to implement check_tf_ops and start the test * Make the test mandatory only for BERT * Update tf_ops folder * Remove useless classes * Add the ONNX test for GPT2 and BART * Add a onnxruntime slow test + better opset flexibility * Fix test + apply style * fix tests * Switch min opset from 12 to 10 * Update src/transformers/file_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Fix GPT2 * Remove extra shape_list usage * Fix GPT2 * Address Morgan's comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-02-15 07:55:10 -05:00
Stas Bekman	8fae93ca19	[t5 tokenizer] add info logs (#9897 ) * save fast tokenizer + add info logs * fix tests * remove the saving of fast tokenizer	2021-02-13 09:10:22 -05:00
Mohamed Al Salti	1321356bdf	Fix typo in GPT2DoubleHeadsModel docs (#10148 ) * Fix typo * apply suggestion Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-02-12 22:48:39 +05:30
Patrick von Platen	495c157d6f	[Wav2Vec2] Improve Tokenizer & Model for batched inference (#10117 ) * save intermediate * finish batch the same as fairseq * add normalization * fix batched input * add better comment * Update src/transformers/models/wav2vec2/modeling_wav2vec2.py * add nice docstring * add tokenizer tests * make all slow tests pass * finish PR * correct import	2021-02-11 15:40:54 +03:00
Suraj Patil	c130e67dce	remove adjust_logits_during_generation method (#10087 ) * add forced logits processors * delete adjust_logits method * add forced_eos_token_id argument in config * add tests for forced logits processors * update gen utils tests * add forced option to tf generate * remove adjust_logits method from tf models * update adjust_logits for marian * delete _force_token_id_to_be_generated method * style * import warnings * pass max_length to _get_logits_processor * set forced_eos_token_id to None * set forced attributes in conf utils * typo * fix rag generate * add forced_eos_token_id in rag config * remove force_bos_token_to_be_generated from BartConfig * remove _force_token_ids_generation from FSMT * nit * fix negative constant * apply suggestions from code review	2021-02-10 22:39:09 +05:30
Julien Plu	22a32cf485	Fix TF LED/Longformer attentions computation (#10007 ) * Fix test * Remove commented test * Fix name * Apply style * Fix check copies * Remove prints * Restore boolean * Fix reshape	2021-02-10 10:58:37 -05:00
Suraj Patil	3e0c62b611	[RAG] fix generate (#10094 ) * fix rag generate and tests * put back adjust_logits_during_generation * tests are okay Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-02-09 21:57:38 +03:00
Julien Plu	b82fe7d258	Replace strided slice with tf.expand_dims (#10078 ) * Replace tf.newaxis -> tf.expand_dims * Fix tests * Fix tests * Use reshape when a tensors needs a double expand * Fix GPT2 * Fix GPT2	2021-02-09 11:48:28 -05:00
Daniel Stancl	e7381c4596	Add head_mask and decoder_head_mask to TF LED (#9988 ) * Add head masking to TF LED * Add head_mask to Longformer + one doc piece to LED * Fix integration tests	2021-02-09 11:45:18 -05:00
Julien Plu	c6d5e56595	Fix naming (#10095 )	2021-02-09 06:10:31 -05:00
abhishek thakur	4ed763779e	Fix example in Wav2Vec2 documentation (#10096 ) * Fix example in Wav2Vec2 documentation * fix style	2021-02-09 06:07:56 -05:00
Patrick von Platen	b972125ced	Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC (#10089 ) * add wav2vec2CTC and deprecate for maskedlm * remove from docs	2021-02-09 03:49:02 -05:00
demSd	84acf0c7bb	remove token_type_ids from TokenizerBertGeneration output (#10070 )	2021-02-08 13:05:32 -05:00
Julien Plu	31563e056d	Restore TF embeddings and attention layers to their previous version (#9890 ) * Refacto BERT * Restore all the concerned models * Remove print * Update template * Apply Sylvain's and Morgan's comments * Fix cast * Put the cast inside call * Remove cond in ebds * Fix funnel * Restore previous dot product (attention_scores) computation * Add ConvBERT and BART * Make all the S2S models ONNX compliant * Fix test * Fix check copies	2021-02-08 14:36:30 +03:00
Nicolas Patry	b1aa4982cd	Cleaning up `ConversationalPipeline` to support more than DialoGPT. (#10002 ) * Cleaning up `ConversationalPipeline` to support more than DialoGPT. Currently ConversationalPipeline was heavily biased towards DialoGPT ,which is the default model for this pipeline. This PR proposes changes to put back the modifications specific to DialoGPT into tokenizer-specific behavior wherever possible, by creating `_build_conversation_input_ids` function that takes conversation as input, and returns a list of ints corresponding to the tokens. It feels natural to put here because all models have probably different strategies to build input_ids from the full conversation and it's the tokenizer's job to transform strings into tokens (and vice-versa) If `_build_conversation_input_ids` is missing, previous behavior is used so we don't break anything so far (except for blenderbot where it's a fix). This PR also contains a fix for too long inputs. There used to be dead code for trying to limit the size of incoming input. The introduced fixed is that we limit within `_build_conversation_input_ids` to `tokenizer.model_max_length`. It corresponds to the intent of the removed dead code and is actually better because it corresponds to `model_max_length` which is different from `max_length` (which is a default parameter for `generate`). - Removed `history` logic from the Conversation as it's not relevant anymore because tokenization logic has been moved to tokenizer. And tokenizer cannot save any cache, and conversation cannot know what is relevant or not. Also it's not usable from `blenderbot` because the input_ids are not append only (EOS tokens is always at the end). - Added `iter_texts` method on `Conversation` because all the code was literred with some form of this iteration of past/generated_responses. * Removing torch mention in types. * Adding type checking to `_build_conversation_input_ids`. * Fixing import in strings.	2021-02-08 14:29:07 +03:00
Suraj Patil	1cd16512dc	[examples/seq2seq] support label smoothing (#9844 ) * add prepare_decoder_input_ids_from_labels in s2s models * support lbl smoothing and enc/emb freezing * fix freezing * use pad_token_id from config * remove embed freezing and add warning * prepare decoder_input_ids inside DataCollatorForSeq2Seq	2021-02-05 23:21:57 +05:30
Sylvain Gugger	21b3922e35	Authorize last version of tokenizer (#9799 ) * Authorize last version of tokenizer * Update version table * Fix conversion of spm tokenizers and fix some hub links * Bump tokenizers version to 0.10.1rc1 * Add script to check tokenizers conversion with XNLI * Add some more mask_token lstrip support * Must modify mask_token in slow tokenizers too * Keep using the old method for Pegasus * add missing import Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>	2021-02-04 14:18:33 -05:00
Daniel Stancl	714855bd8f	Remove "double" assignment in TF-BART like models (#9997 ) * Replace `attn_weights = attn_wegihts = tf.reshape(...)` with `attn_weights = tf.reshape(...)` and thus remove unintentionally used "double" assignment.	2021-02-04 10:24:47 -05:00
Nicolas Patry	aeb18b9224	Adding new `encoder_no_repeat_ngram_size` to `generate`. (#9984 ) Adding new `encoder_no_repeat_ngram_size` to `generate`. Blenderbot results seemed off compared to original ParlAI script: `https://parl.ai/projects/recipes/`. Notably the model seems to repeat a lot what was said during the conversation. The actual problem was that `no_repeat_ngram_size` actually applies to the `encoder_input_ids` but HF's `no_repeat_ngram_size` applies to the previously generated ids (within the decoder). The history conversation of blenderbot is within the `encoder` part so that explains why HF's implementation had the repetitions. This fix was focused on blenderbot not small and added tests for those because they are quite different in configuration. This change includes: - Adding a new EncoderNoRepeatLogitProcessor. - Adding 1 new arg to `generate` (`encoder_no_repeat_ngram_size`) - Adding 1 new config parameter `encoder_no_repeat_ngram_size`. - Adding 2 tests, one for the pipeline (high level, inputs exhibited repeat behavior, one low level for EncoderNoRepeatLogitProcessor) - Factored NoRepeatLogitProcessor so that logic could be reused. Further work: - Blenderbot conversational pipeline still does not behave correctly as they way input is prepared within the pipeline is still incorrect (follow up PR) - Blenderbot allows the bot to have personas, which is done by prepending "your personna: XXXX" to the input, this could be explored too in a follow up PR. @patrickvonplaten @LysandreJik * Update src/transformers/generation_logits_process.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/configuration_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Doc quality. * Fixing test. * Last fixes. * Fixing to account for batch_size. * Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/generation_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-02-04 15:00:18 +01:00
Lysandre Debut	e89c959af9	Fix model templates (#9999 )	2021-02-04 07:47:26 -05:00
demSd	00031785a8	BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128 ) * initiliaze bart4causalLM * create BartDecoderWrapper, setters/getters * delete spaces * forward and additional methods * update cache function, loss function, remove ngram* params in data class. * add bartcausallm, bartdecoder testing * correct bart for causal lm * remove at * add mbart as well * up * fix typo * up * correct * add pegasusforcausallm * add blenderbotforcausallm * add blenderbotsmallforcausallm * add marianforcausallm * add test for MarianForCausalLM * add Pegasus test * add BlenderbotSmall test * add blenderbot test * fix a fail * fix an import fail * a fix * fix * Update modeling_pegasus.py * fix models * fix inputs_embeds setting getter * adapt tests * correct repo utils check * finish test improvement * fix tf models as well * make style * make fix-copies * fix copies * run all tests * last changes * fix all tests Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-02-04 11:56:12 +03:00
Sylvain Gugger	7898fc03b1	Add `from_slow` in fast tokenizers build and fixes some bugs (#9987 )	2021-02-04 03:34:23 -05:00
Stefan Schweter	6244727e05	distilbert: fix creation of sinusoidal embeddings when using PyTorch 1.8+ (#9917 )	2021-02-03 11:42:16 -05:00
Julien Plu	3f77c26d74	Fix Longformer and LED (#9942 ) * Fix Longformer and LED * Add a test for graph execution with inputs_embeds * Apply style	2021-02-03 12:26:32 +01:00
abhishek thakur	a1a67a3ced	Fix GroupedLinearLayer in TF ConvBERT (#9972 )	2021-02-03 04:49:07 -05:00
Daniel Stancl	71bdc076dd	Add head_mask and decoder_head_mask to PyTorch LED (#9856 ) * Add {decoder_,}head_mask to LED * Fix create_custom_forward signatue in encoder * Add head_mask to longformer * Add head_mask to longformer to fix dependencies of LED on Longformer. * Not working yet * Add mising one input in longofrmer_modeling.py * make fix-copies	2021-02-02 11:06:52 -08:00
Patrick von Platen	d6217fb30c	Wav2Vec2 (#9659 ) * add raw scaffold * implement feat extract layers * make style * remove + * correctly convert weights * make feat extractor work * make feature extraction proj work * run forward pass * finish forward pass * Succesful decoding example * remove unused files * more changes * add wav2vec tokenizer * add new structure * fix run forward * add other layer norm architecture * finish 2nd structure * add model tests * finish tests for tok and model * clean-up * make style * finish docstring for model and config * make style * correct docstring * correct tests * change checkpoints to fairseq * fix examples * finish wav2vec2 * make style * apply sylvains suggestions * apply lysandres suggestions * change print to log.info * re-add assert statement * add input_values as required input name * finish wav2vec2 tokenizer * Update tests/test_tokenization_wav2vec2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * apply sylvains suggestions Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-02-02 15:52:10 +03:00
Stefan Schweter	aa438a4265	convbert: minor fixes for conversion script (#9937 )	2021-02-02 06:09:24 -05:00
Sylvain Gugger	de38a6e4d2	Fix 9918 (#9932 ) * Initial work * Fix doc styler and other models	2021-02-02 05:22:20 -05:00
Patrick von Platen	0f4dc5d864	fix typo in naming (#9944 )	2021-02-02 12:22:42 +03:00
Patrick von Platen	538b3b4607	[Tokenizer Utils Base] Make pad function more flexible (#9928 ) * change tokenizer requirement * split line * Correct typo from list to str * improve style * make other function pretty as well * add comment * correct typo * add new test * pass tests for tok without padding token * Apply suggestions from code review	2021-02-02 10:35:27 +03:00
Jan Jitse Venselaar	d1b14c9b54	Tensorflow doc changes on loss output size (#9922 ) * Change documentation to correctly specify loss tensor size * Change documentation to correct input format for labels * Corrected output size of loss tensor for sequence classifier, multiple choice model and question answering	2021-02-01 11:17:50 -05:00
Suraj Patil	343057e141	Fix bart conversion script (#9923 ) * fix conversion script * typo * import nn	2021-02-01 19:17:14 +03:00
Suraj Patil	0842c33edd	fix typos (#9924 )	2021-02-01 08:17:45 -05:00
Daniel Stancl	0c6c0afc0e	Add head_mask and decoder_head_mask to FSMT (#9819 ) * Add {decoder_,}head_mask to fsmt_modeling.py * Enable test_headmasking and some changes to docs * Remove test_head_masking flag from fsmt test file Remove test_head_masking flag from test_modeling_fsmt.py since test_head_masking is set to be True by default (thus it is redundant to store). * Merge master and remove test_head_masking = True * Rebase necessary due to an update of jaxlib * Remove test_head_masking=True in tests/test_modeling_fsmt.py as it is redundant.	2021-02-01 09:30:21 +03:00
Kiyoung Kim	74f16b8276	TFBart lables consider both pad token and -100 (#9847 ) * TFBart lables consider both pad token and -100 * make style * fix for all other models Co-authored-by: kykim <kykim> Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2021-02-01 01:31:29 +03:00
Funtowicz Morgan	2ee9f9b69e	Fix computation of attention_probs when head_mask is provided. (#9853 ) * Fix computation of attention_probs when head_mask is provided. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Apply changes to the template Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-01-28 06:11:52 -05:00
Stefan Schweter	5ed5a54684	ADD BORT (#9813 ) * tests: add integration tests for new Bort model * bort: add conversion script from Gluonnlp to Transformers 🚀 * bort: minor cleanup (BORT -> Bort) * add docs * make fix-copies * clean doc a bit * correct docs * Update docs/source/model_doc/bort.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/bort.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct dialogpt doc * correct link * Update docs/source/model_doc/bort.rst * Update docs/source/model_doc/dialogpt.rst Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-27 21:25:11 +03:00
Julien Plu	bd701ab1a0	Fix template (#9840 )	2021-01-27 07:40:30 -05:00
Julien Plu	4adbdce5ee	Clean TF Bert (#9788 ) * Start cleaning BERT * Clean BERT and all those depends of it * Fix attribute name * Apply style * Apply Sylvain's comments * Apply Lysandre's comments * remove unused import	2021-01-27 11:28:11 +01:00
Julien Plu	a1720694a5	Remove a TF usage warning and rework the documentation (#9756 ) * Rework documentation * Update the template * Trigger CI * Restore the warning but with the TF logger * Update convbert doc	2021-01-27 10:45:42 +01:00
Patrick von Platen	a46050d0f5	fix typo with mt5 init (#9830 )	2021-01-27 04:09:56 -05:00
abhishek thakur	f617490e71	ConvBERT Model (#9717 ) * finalize convbert * finalize convbert * fix * fix * fix * push * fix * tf image patches * fix torch model * tf tests * conversion * everything aligned * remove print * tf tests * fix tf * make tf tests pass * everything works * fix init * fix * special treatment for sepconv1d * style * 🙏🏽 * add doc and cleanup * add electra test again * fix doc * fix doc again * fix doc again * Update src/transformers/modeling_tf_pytorch_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/conv_bert/configuration_conv_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update docs/source/model_doc/conv_bert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/conv_bert/configuration_conv_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * conv_bert -> convbert * more fixes from review * add conversion script * dont use pretrained embed * unused config * suggestions from julien * some more fixes * p -> param * fix copyright * fix doc * Update src/transformers/models/convbert/configuration_convbert.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * comments from reviews * fix-copies * fix style * revert shape_list Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-01-27 03:20:09 -05:00
Patrick von Platen	e575e06287	fix led not defined (#9828 )	2021-01-27 10:43:14 +03:00
Derrick Blakely	8edc98bb70	Allow RAG to output decoder cross-attentions (#9789 ) * get cross attns * add cross-attns doc strings * fix typo * line length * Apply suggestions from code review Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>	2021-01-26 20:32:46 +03:00
Michael Glass	c37dcff764	Fixed parameter name for logits_processor (#9790 )	2021-01-26 18:44:02 +03:00
Daniel Stancl	1867d9a8d7	Add head_mask/decoder_head_mask for TF BART models (#9639 ) * Add head_mask/decoder_head_mask for TF BART models * Add head_mask and decoder_head_mask input arguments for TF BART-based models as a TF counterpart to the PR #9569 * Add test_headmasking functionality to tests/test_modeling_tf_common.py * TODO: Add a test to verify that we can get a gradient back for importance score computation * Remove redundant #TODO note Remove redundant #TODO note from tests/test_modeling_tf_common.py * Fix assertions * Make style * Fix ...Model input args and adjust one new test * Add back head_mask and decoder_head_mask to BART-based ...Model after the last commit * Remove head_mask ande decoder_head_mask from input_dict in TF test_train_pipeline_custom_model as these two have different shape than other input args (Necessary for passing this test) * Revert adding global_rng in test_modeling_tf_common.py	2021-01-26 03:50:00 -05:00
Lysandre Debut	0f443436fb	Actual fix (#9787 )	2021-01-25 11:12:07 -05:00
Stas Bekman	fac7cfb16a	[fsmt] onnx triu workaround (#9738 ) * onnx triu workaround * style * working this time * add test * more efficient version	2021-01-25 08:57:37 -05:00
Stas Bekman	b7b7e5d049	token_type_ids isn't used (#9736 )	2021-01-22 20:38:53 -08:00

1 2 3 4 5

201 Commits