HuggingFace_transformer

Author	SHA1	Message	Date
Rémi Louf	81ee29ee8d	remove the staticmethod used to load the config	2019-10-10 14:13:37 +02:00
Rémi Louf	d7092d592c	rename the attributes in the Bert Layer. Since the preloading of weights relies on the names of the class's attributes, changing the namespace breaks loading pretrained weights on Bert and all related models. I reverted `self_attention` to `attention` and use `crossattention` for the decoder instead.	2019-10-10 12:51:14 +02:00
Rémi Louf	51261167b4	prune both attention and self-attention heads	2019-10-10 12:17:22 +02:00
Rémi Louf	17177e7379	add is_decoder as an attribute to Config class	2019-10-10 12:03:58 +02:00
Rémi Louf	df85a0ff0b	replace double quotes with single quotes	2019-10-10 11:38:26 +02:00
Rémi Louf	9ca788b2e8	merge the two Bert layers classes	2019-10-10 11:33:28 +02:00
Rémi Louf	edfc8f8225	Remove and do the branching in	2019-10-10 10:17:27 +02:00
Rémi Louf	09cfd12235	remove and do the branching in	2019-10-10 10:15:27 +02:00
Rémi Louf	877ef2c6ca	override `from_pretrained` in Bert2Rnd. In the seq2seq model we need to both load pretrained weights in the encoder and initialize the decoder randomly. Because the `from_pretrained` method defined in the base class relies on module names to assign weights, it would also initialize the decoder with pretrained weights. To avoid this we override the method to only initialize the encoder with pretrained weights.	2019-10-10 10:02:18 +02:00
Rémi Louf	851ef592c5	add comment on recursive weights loading	2019-10-10 10:02:03 +02:00
Rémi Louf	770b15b58c	rename class in __init__	2019-10-08 17:32:28 +02:00
Rémi Louf	61ed889005	remove old seq2seq file	2019-10-08 16:30:58 +02:00
Rémi Louf	8abfee9ec3	rename Bert2Bert -> Bert2Rnd	2019-10-08 16:30:58 +02:00
Rémi Louf	82628b0fc9	add a placeholder test	2019-10-08 16:30:58 +02:00
Rémi Louf	0700983090	Add BertDecoderModel and Bert2Bert classes. I am not sure what happens when the class is initialized with the pretrained weights.	2019-10-08 16:30:58 +02:00
Rémi Louf	75feacf172	add general structure for Bert2Bert class	2019-10-08 16:30:58 +02:00
Rémi Louf	15a2fc88a6	add general attention classes. The modifications that I introduced in a previous commit broke Bert's internal API. I reverted these changes and added more general classes to handle the encoder-decoder attention case. There may be a more elegant way to deal with backward compatibility (I am not comfortable with the current state of the code), but I cannot see it right now.	2019-10-08 16:30:58 +02:00
Rémi Louf	cd6a59d5c1	add a decoder layer for Bert	2019-10-08 16:30:58 +02:00
Rémi Louf	a0dcefa382	generalize BertSelfAttention to take separate query, key, value. There is currently no way to specify the query, key and value separately in the Attention module. However, the decoder's "encoder-decoder attention" layers take the decoder's last output as the query and the encoder's states as key and value. We thus modify the existing code so query, key and value can be passed separately. This obviously raises some naming issues: `BertSelfAttention` is not a self-attention module anymore, the way the residual is forwarded is now awkward, etc. We will need to do some refactoring once the decoder is fully implemented.	2019-10-07 17:53:58 +02:00
Rémi Louf	31adbb247c	add class wireframes for Bert decoder	2019-10-07 16:43:21 +02:00
Rémi Louf	dda1adad6d	rename BertLayer to BertEncoderLayer	2019-10-07 16:31:46 +02:00
Rémi Louf	0053c0e052	do some (light) housekeeping. Several packages were imported but never used, and indentation and line spacing did not follow PEP8.	2019-10-07 16:29:15 +02:00
Rémi Louf	386e86e222	raise exception when class initialized with __init__	2019-10-07 13:00:06 +02:00
Rémi Louf	4446c02b8a	add wireframe for seq2seq model	2019-10-07 12:04:05 +02:00
LysandreJik	7bddb45a6f	Decode documentation	2019-10-04 14:27:38 -04:00
Thomas Wolf	1569610f2d	Merge pull request #1296 from danai-antoniou/add-duplicate-tokens-error Added ValueError for duplicates in list of added tokens	2019-10-03 17:06:17 -04:00
drc10723	e1b2949ae6	DistilBert Documentation Code Example fixes	2019-10-03 15:51:33 -04:00
VictorSanh	6be46a6e64	update links to new weights	2019-10-03 10:27:11 -04:00
VictorSanh	f1f23ad171	fix bug in convert_pt_chkpt_to_tf2	2019-10-03 10:27:11 -04:00
Santiago Castro	63ed224b7c	initialy -> initially	2019-10-02 15:04:18 +00:00
danai-antoniou	a95158518d	Moved duplicate token check	2019-10-02 07:44:15 +01:00
danai-antoniou	d73957899a	Merge branch 'master' of https://github.com/danai-antoniou/pytorch-transformers into add-duplicate-tokens-error	2019-10-02 07:38:50 +01:00
thomwolf	391db836ab	fix #1260 - remove special logic for decoding pairs of sequence	2019-10-01 19:09:13 -04:00
thomwolf	c50783e388	Merge branch 'pooler_end_logits_fp16_fix' of https://github.com/slayton58/pytorch-transformers into pr/1284	2019-10-01 18:17:48 -04:00
VictorSanh	2dc8cb8734	fix unknown imports (*ForMultipleChoice) in run_multiple_choice	2019-09-29 19:51:01 -04:00
Ikuya Yamada	a6a6d9e638	fix padding_idx of RoBERTa model	2019-09-27 19:03:55 -04:00
Julien Chaumond	d8b641c839	6 -> 8 models	2019-09-27 17:22:01 -04:00
Julien Chaumond	c6acbdd50a	Close #1304	2019-09-27 17:02:53 -04:00
Agrin Hilmkil	795b3e76ff	Add docstring for processor method	2019-09-27 17:32:28 +02:00
Agrin Hilmkil	e31a472801	Fix tensorflow_dataset glue support. `glue_convert_examples_to_features` assumed that tensorflow_dataset examples contain the features `'sentence1'` and `'sentence2'`. This commit encapsulates the choice of features in the glue processor and uses that to parse examples.	2019-09-27 17:16:02 +02:00
LysandreJik	ecfddc6034	Update RoBERTa and GPT-2 Tokenizer documentation (fix #1343 )	2019-09-26 16:49:03 -04:00
LysandreJik	36f592cc82	Updated doc for `InputExample` and `InputFeatures`	2019-09-26 07:45:40 -04:00
LysandreJik	ad4a393e2e	Changed processor documentation architecture. Added documentation for GLUE	2019-09-26 07:45:40 -04:00
thomwolf	80bf868a26	Merge branch 'master' into tf2	2019-09-26 12:04:47 +02:00
thomwolf	481d9c4fb5	Merge branch 'master' into tf2	2019-09-26 12:02:54 +02:00
thomwolf	31c23bd5ee	[BIG] pytorch-transformers => transformers	2019-09-26 10:15:53 +02:00

46 Commits