HuggingFace_transformer/tests
Nicolas Patry d4be498441 Optimizing away the fill-mask pipeline. (#12113)
* Optimizing away the `fill-mask` pipeline.

- Don't send anything to the tokenizer unless needed. The vocab check is
much faster.
- Keep BC by sending data to the tokenizer when needed. Users who address
the warning messages will see the performance benefits again.
- Make `targets` and `top_k` work together better: `top_k` cannot be
higher than `len(targets)`, but it can still be smaller.
- Actually deduplicate the `target_ids` (duplicates can happen
because we're parsing raw strings).
- Removed useless code that failed on empty strings. It worked only if the
empty string was in first position; empty strings are now ignored instead.
- Changed the related tests, as only those tests would fail correctly
(having an incorrect value in first position).
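
A rough sketch of the target handling described above, not the actual
pipeline code. `resolve_target_ids` and its `tokenize` callback are
hypothetical names; the order-preserving deduplication and the error message
are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional, Tuple


def resolve_target_ids(
    targets: List[str],
    vocab: Dict[str, int],
    tokenize: Callable[[str], List[int]],
    top_k: Optional[int] = None,
) -> Tuple[List[int], int]:
    """Hypothetical helper: map raw target strings to unique token ids."""
    target_ids: List[int] = []
    for target in targets:
        if not target:
            # Empty strings are ignored instead of raising.
            continue
        token_id = vocab.get(target)
        if token_id is None:
            # Slow path kept for backward compatibility: only call the
            # tokenizer when the plain vocab lookup misses.
            ids = tokenize(target)
            if not ids:
                continue
            token_id = ids[0]  # assumption: keep the first produced token
        target_ids.append(token_id)

    # Deduplicate while preserving order (raw strings may map to the same id).
    unique_ids = list(dict.fromkeys(target_ids))
    if not unique_ids:
        raise ValueError("At least one usable target must be provided.")

    # `top_k` cannot exceed the number of targets, but it may be smaller.
    if top_k is None or top_k > len(unique_ids):
        top_k = len(unique_ids)
    return unique_ids, top_k
```

With a toy vocab `{"hello": 1, "world": 2}`, the targets
`["hello", "", "hello", "world"]` resolve to two ids, and a requested
`top_k=10` is reduced to 2.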

* Make tests compatible with 2 different vocabs... (at the price of a
warning).

Co-authored-by: @EtaoinWu

* ValueError working globally

* Update src/transformers/pipelines/fill_mask.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* `tokenizer.vocab` -> `tokenizer.get_vocab()` for more compatibility +
fallback.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
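
A minimal sketch of what the `get_vocab()` fallback could look like,
assuming the fallback is an empty vocab, so that every target takes the
slower tokenizer path. `safe_get_vocab` is a hypothetical name, not a
function from the library.

```python
from typing import Any, Dict


def safe_get_vocab(tokenizer: Any) -> Dict[str, int]:
    """Hypothetical helper: return the tokenizer vocab, or {} if unavailable."""
    try:
        # `get_vocab()` is the accessor named in this change.
        return dict(tokenizer.get_vocab())
    except Exception:
        # Assumption: with no vocab available, fall back to an empty mapping
        # so callers resolve every target through the tokenizer instead.
        return {}
```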
2021-06-23 10:38:04 +02:00