sshleifer
cbcb83f21d
minor cleanup of test_attention_outputs
2020-02-04 16:38:52 -05:00
Lysandre
1e82cd8457
Flaubert auto tokenizer + tests
...
cc @julien-c
2020-01-31 14:16:52 -05:00
Julien Chaumond
9fa836a73f
fill_mask helper (#2576)
...
* fill_mask helper
* [poc] FillMaskPipeline
* Revert "[poc] FillMaskPipeline"
This reverts commit 67eeea55b0f97b46c2b828de0f4ee97d87338335.
* Revert "fill_mask helper"
This reverts commit cacc17b884e14bb6b07989110ffe884ad9e36eaa.
* README: clarify that Pipelines can also do text-classification
cf. question at the AI&ML meetup last week, @mfuntowicz
* Fix test: test feature-extraction pipeline
* Test tweaks
* Slight refactor of existing pipeline (in preparation of new FillMaskPipeline)
* Extraneous doc
* More robust way of doing this
@mfuntowicz as we don't rely on the model name anymore (see AutoConfig)
* Also add RobertaConfig as a quickfix for wrong token_type_ids
* cs
* [BIG] FillMaskPipeline
2020-01-30 18:15:42 -05:00
Lysandre
df27648bd9
Rename test_examples to test_doc_samples
2020-01-30 10:07:22 -05:00
Lysandre
e63a81dd25
Style
2020-01-29 16:29:20 -05:00
Lysandre
217349016a
Copy object instead of passing the reference
2020-01-29 16:15:39 -05:00
Lysandre
ea2600bd5f
Absolute definitive HeisenDistilBug solve
...
cc @julien-c @thomwolf
2020-01-27 21:58:36 -05:00
thomwolf
0e31e06a75
Add AutoModelForPreTraining
2020-01-27 14:27:07 -05:00
Lysandre
875c4ae48f
Definitive HeisenDistilBug fix
...
cc @julien-c @thomwolf
2020-01-27 12:09:58 -05:00
Lysandre
24d5ad1dcc
Run the examples in slow
2020-01-23 09:38:45 -05:00
Lysandre
f81b6c95f2
Flake8 violation
2020-01-23 09:38:45 -05:00
Lysandre
632675ea88
Can test examples spread over multiple blocks
2020-01-23 09:38:45 -05:00
Lysandre
eaa6b9afc6
Require Torch when testing examples
2020-01-23 09:38:45 -05:00
Lysandre
64abd3e0aa
Multi-line examples can be tested + ALBERT patch for CircleCI
...
All tests should now work fine.
2020-01-23 09:38:45 -05:00
Lysandre
837577256b
Automatic testing of examples
...
The CircleCI test should fail.
2020-01-23 09:38:45 -05:00
Mark Neumann
65a89a8976
Fix BasicTokenizer to respect never_split parameters (#2557)
...
* add failing test
* fix call to _run_split_on_punc
* format with black
2020-01-17 14:57:56 -05:00
Julien Chaumond
23a2cea8cb
Tokenizer.from_pretrained: fetch all possible files remotely
2020-01-16 16:47:19 -05:00
Julien Chaumond
9d8fd2d40e
tokenizer.save_pretrained: only save file if non-empty
2020-01-16 16:47:19 -05:00
Thomas Wolf
dc17f2a111
Merge pull request #2538 from huggingface/py3_super
...
💄 super
2020-01-16 13:17:15 +01:00
Julien Chaumond
d9fa1bad72
Fix failing torchscript test for xlnet
...
model.parameters() order is apparently not stable (only for xlnet, for some reason)
2020-01-15 20:22:21 -05:00
Julien Chaumond
83a41d39b3
💄 super
2020-01-15 18:33:50 -05:00
Julien Chaumond
eb59e9f705
Graduate sst-2 to a canonical one
2020-01-15 16:28:50 +00:00
Julien Chaumond
e184ad13cf
Close #2392
2020-01-15 15:43:44 +00:00
Julien Chaumond
715fa638a7
Merge branch 'master' into from_scratch_training
2020-01-14 18:58:21 +00:00
Lysandre
100e3b6f21
Bias should be resized with the weights
...
Created a link between the linear layer bias and the model attribute bias. This does not change anything for the user nor for the conversion scripts, but allows the `resize_token_embeddings` method to resize the bias as well as the weights of the decoder.
Added a test.
2020-01-14 13:43:45 -05:00
Julien Chaumond
764f836d52
Update test_tokenization_auto.py
2020-01-13 22:50:34 -05:00
Julien Chaumond
d5831acb07
Update test_tokenization_auto.py
2020-01-13 22:47:33 -05:00
Julien Chaumond
ed6cd597cc
Update test_tokenization_auto.py
2020-01-13 22:46:35 -05:00
Julien Chaumond
5cb463a714
Update test_tokenization_auto.py
2020-01-13 22:38:29 -05:00
Julien Chaumond
0304628590
Map configs to models and tokenizers
2020-01-13 23:11:44 +00:00
Julien Chaumond
1fc855e456
[tests] Safety checks on CONFIG_MAPPING
2020-01-13 21:52:55 +00:00
Julien Chaumond
cf8a70bf68
More AutoConfig tests
2020-01-11 03:43:57 +00:00
Julien Chaumond
c6f682c1eb
flake
2020-01-11 03:18:31 +00:00
Julien Chaumond
4d1c98c012
AutoConfig + other Auto classes honor model_type
2020-01-11 02:46:17 +00:00
Julien Chaumond
2f32dfd33b
Convention: name mixins mixins
2020-01-11 01:24:29 +00:00
Julien Chaumond
055e80cfad
rm old ConfigTester
2020-01-10 21:36:18 +00:00
Julien Chaumond
84c0aa1868
num_parameters helper
2020-01-10 17:40:02 +00:00
alberduris
81d6841b4b
GPU text generation: Moved the encoded_prompt to correct device
2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b
Moved the encoded_prompts to correct device
2020-01-06 15:11:12 +01:00
Aymeric Augustin
0ffc8eaf53
Enforce target version for black.
...
This should stabilize formatting.
2020-01-05 12:52:14 -05:00
Julien Chaumond
594ca6dead
[debug] Debug Heisenbug, the old school way.
2019-12-29 10:07:21 -05:00
Julien Chaumond
f78ebc22ad
[cli] Add ability to delete remote object
2019-12-27 22:53:49 -05:00
Thomas Wolf
9f5f646442
Merge pull request #2211 from huggingface/fast-tokenizers
...
Fast tokenizers
2019-12-27 10:24:29 +01:00
Anthony MOI
2818e50569
Add tests for fast tokenizers
2019-12-24 13:29:01 -05:00
Aymeric Augustin
e6c0019c80
Remove unused variables in tests.
2019-12-23 22:38:18 +01:00
Aymeric Augustin
1c62e87b34
Use built-in open().
...
On Python 3, `open is io.open`.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
798b3b3899
Remove sys.version_info[0] == 2 or 3.
2019-12-22 18:38:42 +01:00
Aymeric Augustin
8af25b1664
Remove six.
2019-12-22 17:56:09 +01:00
Aymeric Augustin
c824d15aa1
Remove __future__ imports.
2019-12-22 17:47:54 +01:00
Aymeric Augustin
00204f2b4c
Replace CommonTestCases for tokenizers with a mixin.
...
This is the same change as for (TF)CommonTestCases for modeling.
2019-12-22 15:35:25 +01:00