LysandreJik
6393261e41
encode + encode_plus tests modified
2019-09-19 10:55:06 +02:00
LysandreJik
af23b626c8
Max encoding length + corresponding tests
2019-09-19 10:55:06 +02:00
LysandreJik
c4d4f3ec8c
Updated DistilBERT test to reflect the sequence encoding
2019-09-19 10:55:06 +02:00
LysandreJik
d572d7027b
Number of added tokens calculator
2019-09-19 10:55:06 +02:00
LysandreJik
c3df2136e1
Added binary masking tests
2019-09-19 10:55:06 +02:00
Thomas Wolf
2c177a87eb
Merge pull request #1228 from huggingface/head-masking-test
...
Trying to fix the head masking test
2019-09-10 11:55:27 +02:00
Thomas Wolf
3f05de6dde
Merge branch 'master' into reorder_arguments
2019-09-09 15:42:25 +03:00
Thomas Wolf
5ac8b62265
Merge pull request #1205 from maru0kun/patch-2
...
Fix typo
2019-09-05 21:44:16 +02:00
thomwolf
5c6cac102b
adding test for common properties and cleaning up the base class a bit
2019-09-05 21:31:29 +02:00
thomwolf
d77abd4d08
clean ups
2019-09-05 00:41:24 +02:00
thomwolf
2a667b1eb9
split configuration and modeling files
2019-09-05 00:27:11 +02:00
thomwolf
7fba47b7d9
WIP reordering
2019-09-04 22:39:23 +02:00
thomwolf
e25cba78cf
WIP reordering arguments for torchscript and TF
2019-09-04 22:39:23 +02:00
thomwolf
fede4ef45d
fixing #1133
2019-09-02 02:27:39 +02:00
LysandreJik
58b59a0c31
Random seed is accessible anywhere within the common tests
2019-08-31 13:17:08 -04:00
LysandreJik
87747518e9
Blocks deletion of already deleted heads. Necessary integration test.
...
Now raises a warning when a head to be deleted has already been deleted. An integration test verifying the full pipeline (from config -> save model -> load model -> additional head pruning) has been added.
2019-08-31 00:33:50 -04:00
LysandreJik
719cb3738d
Pruning for GPT and GPT-2
2019-08-31 00:33:50 -04:00
LysandreJik
fc1fbae45d
XLM can be pruned
2019-08-31 00:33:50 -04:00
Lysandre
42e00cf9e1
Pruning saved to configuration first try
2019-08-31 00:33:50 -04:00
LysandreJik
d7a4c3252e
Fixed filename
2019-08-31 00:08:56 -04:00
LysandreJik
7f006cdd87
Set seed for head_masking test
2019-08-30 23:58:49 -04:00
Thomas Wolf
d483cd8e46
Merge pull request #1074 from huggingface/improved_testing
...
Shortcut to special tokens' ids - fix GPT2 & RoBERTa tokenizers - improved testing for GPT/GPT-2
2019-08-30 23:18:58 +02:00
Thomas Wolf
d2f21f08f5
Merge pull request #1092 from shijie-wu/xlm-tokenization
...
Added cleaned configuration properties for tokenizer with serialization - improve tokenization of XLM
2019-08-30 23:15:40 +02:00
LysandreJik
25e8389439
Tests for added AutoModels
2019-08-30 12:48:55 -04:00
thomwolf
7044ed6b05
fix tokenizers serialization
2019-08-30 17:36:11 +02:00
Thomas Wolf
cd65c41a83
Merge branch 'master' into xlm-tokenization
2019-08-30 17:15:16 +02:00
thomwolf
69da972ace
added test and debug tokenizer configuration serialization
2019-08-30 17:09:36 +02:00
Thomas Wolf
50e615f43d
Merge branch 'master' into improved_testing
2019-08-30 13:40:35 +02:00
thomwolf
ce5ef4b35d
python2 doesn't spark joy
2019-08-30 13:22:43 +02:00
thomwolf
5dd7b677ad
clean up all byte-level bpe tests
2019-08-30 12:43:08 +02:00
thomwolf
ca1a00a302
fix for python2
2019-08-30 12:29:31 +02:00
thomwolf
abe734ca1f
fix GPT-2 and RoBERTa tests to be clean now
2019-08-30 12:20:18 +02:00
thomwolf
d51f72d5de
adding shortcut to the ids of all the special tokens
2019-08-30 11:41:11 +02:00
thomwolf
912a377e90
dilbert -> distilbert
2019-08-28 13:59:42 +02:00
thomwolf
c9bce1811c
fixing model to add torchscript, embedding resizing, head pruning and masking + tests
2019-08-28 13:22:45 +02:00
thomwolf
62df4ba59a
add dilbert tokenizer and tests
2019-08-28 12:22:56 +02:00
LysandreJik
c513415b19
Dilbert tests from CommonTests
2019-08-27 23:59:00 -04:00
Lysandre
55f69a11b6
OpenAI GPT tests now extend CommonTests
2019-08-21 18:09:25 -04:00
Lysandre
47267ba556
OpenAI GPT-2 now depends on CommonTests.
2019-08-21 17:50:16 -04:00
Thomas Wolf
e753f249e1
Merge pull request #806 from wschin/fix-a-path
...
Fix a path so that a test can run on Windows
2019-08-21 01:14:40 +02:00
LysandreJik
3d87991f60
Fixed error with encoding
2019-08-13 12:00:24 -04:00
LysandreJik
634a3172d8
Added integration tests for sequence builders.
2019-08-12 15:14:15 -04:00
LysandreJik
75d5f98fd2
Roberta tokenization + fixed tests (py3 + py2).
2019-08-09 15:02:13 -04:00
LysandreJik
14e970c271
Tokenization encode/decode class-based sequence handling
2019-08-09 15:01:38 -04:00
LysandreJik
fbd746bd06
Updated test architecture
2019-08-08 18:21:34 -04:00
Julien Chaumond
9d0603148b
[RoBERTa] RobertaForSequenceClassification + conversion
2019-08-08 11:24:54 -04:00
LysandreJik
d2cc6b101e
Merge branch 'master' into RoBERTa
2019-08-08 09:42:05 -04:00
LysandreJik
770043eea2
Sentence-pair tasks handling. Using common tests on RoBERTa. Forced push to fix indentation.
2019-08-07 12:53:19 -04:00
thomwolf
0b524b0848
remove derived classes for now
2019-08-05 19:08:19 +02:00
thomwolf
13936a9621
update doc and tests
2019-08-05 18:48:16 +02:00