Sam Shleifer
|
f5c2a122e3
|
Upgrade examples to pl=0.8.1 (#5146)
|
2020-06-22 20:40:10 -04:00 |
|
Sam Shleifer
|
2db1e2f415
|
[cleanup] remove redundant code in SummarizationDataset (#5119)
|
2020-06-18 20:34:48 -04:00 |
|
Sam Shleifer
|
043f9f51f9
|
[examples] SummarizationModule improvements (#4951)
|
2020-06-17 13:51:34 -04:00 |
|
Anthony MOI
|
36434220fc
|
[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510)
* Use tokenizers pre-tokenized pipeline
* failing pretokenized test
* Fix is_pretokenized in python
* add pretokenized tests
* style and quality
* better tests for batched pretokenized inputs
* tokenizers clean up - new padding_strategy - split the files
* [HUGE] refactoring tokenizers - padding - truncation - tests
* style and quality
* bump up required tokenizers version to 0.8.0-rc1
* switched padding/truncation API - simpler better backward compat
* updating tests for custom tokenizers
* style and quality - tests on pad
* fix QA pipeline
* fix backward compatibility for max_length only
* style and quality
* Various cleanups - add verbose
* fix tests
* update docstrings
* Fix tests
* Docs reformatted
* __call__ method documented
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
|
2020-06-15 17:12:51 -04:00 |
|
Amil Khare
|
02e5f79662
|
[examples] consolidate summarization examples (#4837)
|
2020-06-09 11:14:12 -04:00 |
|