318 Commits

Author SHA1 Message Date
LysandreJik
0ea82b246f Updated tests 2019-09-24 07:10:09 -04:00
LysandreJik
9d44236f70 Updated DistilBERT 2019-09-24 07:03:24 -04:00
LysandreJik
ab984a8b72 Python 2 compatibility 2019-09-19 15:01:33 +02:00
LysandreJik
3df208c93a Tokenizer accepts token list as well as string 2019-09-19 14:47:52 +02:00
LysandreJik
66ea76b8a9 prepare_for_model and prepare_pair_for_model methods. Added an option to select which sequence will be truncated. 2019-09-19 13:50:51 +02:00
LysandreJik
baa74326ab Stride + tests + small fixes 2019-09-19 10:55:06 +02:00
LysandreJik
c10c7d59e7 Mask computing in standalone method. Tests. 2019-09-19 10:55:06 +02:00
LysandreJik
bf503158c5 Sentence -> Sequence. Removed output_mask from the special token addition methods. 2019-09-19 10:55:06 +02:00
LysandreJik
8cba057260 Doc + remove artefacts 2019-09-19 10:55:06 +02:00
LysandreJik
6393261e41 encode + encode_plus tests modified 2019-09-19 10:55:06 +02:00
LysandreJik
dcc9bb3252 Modified encode to return only lists. Added a more complete encode_plus method 2019-09-19 10:55:06 +02:00
LysandreJik
af23b626c8 Max encoding length + corresponding tests 2019-09-19 10:55:06 +02:00
LysandreJik
c4d4f3ec8c Updated DistilBERT test to reflect the sequence encoding 2019-09-19 10:55:06 +02:00
LysandreJik
d572d7027b Number of added tokens calculator 2019-09-19 10:55:06 +02:00
LysandreJik
2d8ec5a684 Changed warning to be more explicit
Co-authored by: julien_c <chaumond@gmail.com>
2019-09-19 10:55:06 +02:00
LysandreJik
92a9976e91 Distilbert sequence builder w/ mask 2019-09-19 10:55:06 +02:00
LysandreJik
bac332fec0 Updated the GLUE data processor. Corrections to RoBERTa and XLNet. 2019-09-19 10:55:06 +02:00
LysandreJik
c3df2136e1 Added binary masking tests 2019-09-19 10:55:06 +02:00
LysandreJik
e391d4735e Tokenizers' encode function can output binary masks 2019-09-19 10:55:06 +02:00
erenup
46ffc28329 Merge branch 'master' into run_multiple_choice_merge
2019-09-18 21:43:46 +08:00
erenup
3cd6289758 Merge remote-tracking branch 'huggingface/master' into run_multiple_choice_merge
# Conflicts:
#	examples/contrib/run_swag.py
2019-09-18 21:16:59 +08:00
Julien Chaumond
62760baf46 tiny fixes 2019-09-17 18:29:15 -04:00
thomwolf
45de034bf8 fix #1223 2019-09-17 10:25:06 +02:00
erenup
a9debaca3d fixed init_weight 2019-09-16 19:55:24 +08:00
erenup
982f181aa7 Merge remote-tracking branch 'origin/master' into run_multiple_choice_add_doc 2019-09-16 19:12:00 +08:00
erenup
84b9d1c423 Merge remote-tracking branch 'huggingface/master'
# Conflicts:
#	pytorch_transformers/__init__.py
2019-09-16 19:06:12 +08:00
erenup
4812a5a767 add doc string 2019-09-16 11:50:18 +08:00
Zili Wang
8bdee1cb73 fixed: hard-coded max and min values go out of range in fp16, which causes NaN. 2019-09-11 15:41:53 +08:00
mattolson93
f2cf6ce4a9 Fixing typo in gpt2 for doc site's class link 2019-09-10 09:12:01 -07:00
Thomas Wolf
2c177a87eb Merge pull request #1228 from huggingface/head-masking-test
Trying to fix the head masking test
2019-09-10 11:55:27 +02:00
Thomas Wolf
3f05de6dde Merge branch 'master' into reorder_arguments 2019-09-09 15:42:25 +03:00
thomwolf
3401980fc4 fix #1208 2019-09-09 10:22:12 +03:00
Thomas Wolf
5ac8b62265 Merge pull request #1205 from maru0kun/patch-2
Fix typo
2019-09-05 21:44:16 +02:00
thomwolf
5c6cac102b adding test for common properties and cleaning up a bit base class 2019-09-05 21:31:29 +02:00
maru0kun
d737947725 Fix typo 2019-09-05 19:24:57 +09:00
thomwolf
85df4f7cca also gathering file names in file_utils 2019-09-05 02:34:09 +02:00
thomwolf
121f88cae3 update conversion scripts 2019-09-05 02:17:50 +02:00
thomwolf
d77abd4d08 clean ups 2019-09-05 00:41:24 +02:00
thomwolf
2a667b1eb9 split configuration and modeling files 2019-09-05 00:27:11 +02:00
thomwolf
0be6a2a624 be sure we have uint8 2019-09-04 22:47:38 +02:00
thomwolf
7fba47b7d9 WIP reordering 2019-09-04 22:39:23 +02:00
thomwolf
e25cba78cf WIP reordering arguments for torchscript and TF 2019-09-04 22:39:23 +02:00
thomwolf
38b79b5a63 Fixing this TransformerXL bool issue 2019-09-04 22:36:30 +02:00
thomwolf
89fd3450a6 Release: 1.2.0 2019-09-04 13:32:18 +02:00
Shijie Wu
a15562e170 Fix reference of import when called for the second time 2019-09-03 18:27:29 -07:00
Thomas Wolf
0287d264e9 Merge pull request #1162 from huggingface/xlnet-bias
XLNet bias fix on resize embeddings (cf #1124)
2019-09-02 23:14:04 +02:00
LysandreJik
31d3373bc9 Appends space before special token 2019-09-01 21:07:00 -04:00
thomwolf
fede4ef45d fixing #1133 2019-09-02 02:27:39 +02:00
Thomas Wolf
ff7368eb6b Merge pull request #1077 from huggingface/pruning-save-and-load
Pruning changes so that deleted heads are kept on save/load
2019-09-01 09:42:15 +02:00
LysandreJik
6ae0bb5291 XLM 100 different URLs 2019-08-31 14:46:31 -04:00