Commit Graph

58 Commits

Author SHA1 Message Date
LysandreJik
fc1fbae45d XLM can be pruned 2019-08-31 00:33:50 -04:00
Lysandre
42e00cf9e1 Pruning saved to configuration first try 2019-08-31 00:33:50 -04:00
LysandreJik
d7a4c3252e Fixed filename 2019-08-31 00:08:56 -04:00
LysandreJik
7f006cdd87 Set seed for head_masking test 2019-08-30 23:58:49 -04:00
Thomas Wolf
d483cd8e46 Merge pull request #1074 from huggingface/improved_testing: Shortcut to special tokens' ids - fix GPT2 & RoBERTa tokenizers - improved testing for GPT/GPT-2 2019-08-30 23:18:58 +02:00
Thomas Wolf
d2f21f08f5 Merge pull request #1092 from shijie-wu/xlm-tokenization: Added cleaned configuration properties for tokenizer with serialization - improve tokenization of XLM 2019-08-30 23:15:40 +02:00
LysandreJik
25e8389439 Tests for added AutoModels 2019-08-30 12:48:55 -04:00
thomwolf
7044ed6b05 fix tokenizers serialization 2019-08-30 17:36:11 +02:00
Thomas Wolf
cd65c41a83 Merge branch 'master' into xlm-tokenization 2019-08-30 17:15:16 +02:00
thomwolf
69da972ace added test and debug tokenizer configuration serialization 2019-08-30 17:09:36 +02:00
Thomas Wolf
50e615f43d Merge branch 'master' into improved_testing 2019-08-30 13:40:35 +02:00
thomwolf
ce5ef4b35d python2 doesn't spark joy 2019-08-30 13:22:43 +02:00
thomwolf
5dd7b677ad clean up all byte-level bpe tests 2019-08-30 12:43:08 +02:00
thomwolf
ca1a00a302 fix for python2 2019-08-30 12:29:31 +02:00
thomwolf
abe734ca1f fix GPT-2 and RoBERTa tests to be clean now 2019-08-30 12:20:18 +02:00
thomwolf
d51f72d5de adding shortcut to the ids of all the special tokens 2019-08-30 11:41:11 +02:00
thomwolf
912a377e90 dilbert -> distilbert 2019-08-28 13:59:42 +02:00
thomwolf
c9bce1811c fixing model to add torchscript, embedding resizing, head pruning and masking + tests 2019-08-28 13:22:45 +02:00
thomwolf
62df4ba59a add dilbert tokenizer and tests 2019-08-28 12:22:56 +02:00
LysandreJik
c513415b19 Dilbert tests from CommonTests 2019-08-27 23:59:00 -04:00
Lysandre
55f69a11b6 OpenAI GPT tests now extend CommonTests 2019-08-21 18:09:25 -04:00
Lysandre
47267ba556 OpenAI GPT-2 now depends on CommonTests. 2019-08-21 17:50:16 -04:00
Thomas Wolf
e753f249e1 Merge pull request #806 from wschin/fix-a-path: Fix a path so that a test can run on Windows 2019-08-21 01:14:40 +02:00
LysandreJik
3d87991f60 Fixed error with encoding 2019-08-13 12:00:24 -04:00
LysandreJik
634a3172d8 Added integration tests for sequence builders. 2019-08-12 15:14:15 -04:00
LysandreJik
75d5f98fd2 Roberta tokenization + fixed tests (py3 + py2). 2019-08-09 15:02:13 -04:00
LysandreJik
14e970c271 Tokenization encode/decode class-based sequence handling 2019-08-09 15:01:38 -04:00
LysandreJik
fbd746bd06 Updated test architecture 2019-08-08 18:21:34 -04:00
Julien Chaumond
9d0603148b [RoBERTa] RobertaForSequenceClassification + conversion 2019-08-08 11:24:54 -04:00
LysandreJik
d2cc6b101e Merge branch 'master' into RoBERTa 2019-08-08 09:42:05 -04:00
LysandreJik
770043eea2 Sentence-pair tasks handling. Using common tests on RoBERTa. Forced push to fix indentation. 2019-08-07 12:53:19 -04:00
thomwolf
0b524b0848 remove derived classes for now 2019-08-05 19:08:19 +02:00
thomwolf
13936a9621 update doc and tests 2019-08-05 18:48:16 +02:00
thomwolf
ed4e542260 adding tests 2019-08-05 18:14:07 +02:00
thomwolf
328afb7097 cleaning up tokenizer tests structure (at last) - last remaining ppb refs 2019-08-05 14:08:56 +02:00
Julien Chaumond
cb9db101c7 Python 2 must DIE 2019-08-04 22:04:15 -04:00
Julien Chaumond
05c083520a [RoBERTa] model conversion, inference, tests 🔥 2019-08-04 21:39:21 -04:00
thomwolf
0740e63e49 updating schedules for state_dict saving 2019-07-23 15:57:18 +02:00
Wei-Sheng Chin
c4e9615691 Fix a path so that test can run on Windows 2019-07-17 09:08:40 -07:00
thomwolf
1849aa7d39 update readme and pretrained model weight files 2019-07-16 15:11:29 +02:00
thomwolf
e691fc0963 update QA models tests + run_generation 2019-07-15 17:45:24 +02:00
thomwolf
15d8b1266c update tokenizer - update squad example for xlnet 2019-07-15 17:30:42 +02:00
thomwolf
7d4b200e40 good quality generation example for GPT, GPT-2, Transfo-XL, XLNet 2019-07-13 15:25:03 +02:00
thomwolf
2918b7d2a0 updating tests 2019-07-12 10:57:58 +02:00
LysandreJik
3fbceed8d2 Fix layer reference loss + previous attempted fix 2019-07-11 22:29:55 -04:00
LysandreJik
6c2ee16c04 Test suite testing the tie_weights function as well as the resize_token_embeddings function. Patched an issue relating to the tied weights I had introduced with the TorchScript addition. Byte order mark management in TSV glue reading. 2019-07-11 22:09:16 -04:00
thomwolf
bd404735a7 embeddings resizing + tie_weights 2019-07-12 00:02:49 +02:00
thomwolf
c6bf1a400d fix test examples et model pretrained 2019-07-11 22:29:08 +02:00
thomwolf
ccb6947dc1 optimization tests 2019-07-11 17:39:47 +02:00
thomwolf
ec07cf5a66 rewamp optimization 2019-07-11 14:48:22 +02:00