HuggingFace_transformer

Files

Nicolas Patry d8fc26e919 NerPipeline (TokenClassification) now outputs offsets of words (#8781 )

* NerPipeline (TokenClassification) now outputs offsets of words

- The offsets are currently missing from the output, which forces the user
to pattern match the "word" against the input, and that is not always feasible.
For instance, if a sentence contains the same word twice, there
is no way to know which is which.
- This PR fixes that by outputting 2 new keys in this
pipeline's outputs, "start" and "end", which correspond to the string
offsets of the word. That means that we should always have the
invariant:

```python
input[entity["start"]: entity["end"]] == entity["entity_group"]
                                    # or entity["entity"] if not grouped
```

* Fixing doc style

2020-11-30 14:05:08 -05:00
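
The ambiguity that the commit message above describes (the same word appearing twice) can be illustrated with a minimal sketch. It does not call the pipeline: the entity dicts, the label values, and the helper names `span_text` and `find_by_word` are hypothetical, and treating a `"word"` key as the matched text is an assumption made for illustration only.

```python
# Hypothetical entities shaped like the output described in the commit message.
# Assumption: a "word" key holds the matched text; labels are made up.
text = "Paris loves Paris Hilton"

entities = [
    {"entity": "I-LOC", "word": "Paris", "start": 0, "end": 5},
    {"entity": "I-PER", "word": "Paris", "start": 12, "end": 17},
]


def span_text(text, entity):
    """Return the substring of `text` that the entity's offsets point at."""
    return text[entity["start"]:entity["end"]]


def find_by_word(text, entity):
    """Recover an offset by searching for the word; always finds the first match."""
    return text.find(entity["word"])


# With offsets, each entity maps back to exactly one span of the input.
for entity in entities:
    assert span_text(text, entity) == entity["word"]

# Without offsets, pattern matching cannot tell the two occurrences apart.
search_offsets = [find_by_word(text, e) for e in entities]
true_offsets = [e["start"] for e in entities]
```

Here `search_offsets` resolves both entities to the first occurrence, while `true_offsets` keeps them distinct, which is the case the new "start" and "end" keys are meant to cover.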

fixtures

Add new token classification example (#8340 )

2020-11-09 11:39:55 -05:00

__init__.py

GPU text generation: Moved the encoded_prompt to correct device

2020-01-06 15:11:12 +01:00

conftest.py

[CIs] Better reports everywhere (#8275 )

2020-11-03 16:57:12 -05:00

test_activations_tf.py

Replace swish with silu (#8166 )

2020-10-30 15:09:10 -04:00

test_activations.py

Replace swish with silu (#8166 )

2020-10-30 15:09:10 -04:00

test_benchmark_tf.py

[Benchmarks] Change all args from no_... to their positive form (#7075 )

2020-09-23 13:25:24 -04:00

test_benchmark.py

[Benchmarks] Change all args from no_... to their positive form (#7075 )

2020-09-23 13:25:24 -04:00

test_cli.py

[transformers-cli] fix logger getter (#6777 )

2020-08-27 20:01:17 -04:00

test_configuration_auto.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_configuration_common.py

[PretrainedConfig] Fix save pretrained config for edge case (#7943 )

2020-10-22 15:39:01 +02:00

test_data_collator.py

Clean up data collators and datasets (#8308 )

2020-11-04 17:24:49 -05:00

test_doc_samples.py

Fix ignore list behavior in doctests (#8213 )

2020-11-02 08:47:37 -05:00

test_file_utils.py

Model versioning (#8324 )

2020-11-10 07:11:02 -05:00

test_flax_auto.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_generation_beam_search.py

Refactoring the generate() function (#6949 )

2020-11-03 16:04:22 +01:00

test_generation_logits_process.py

Adding PrefixConstrainedLogitsProcessor (#8529 )

2020-11-18 17:06:25 +01:00

test_generation_utils.py

fix flaky ci (#8694 )

2020-11-20 22:07:21 +01:00

test_hf_api.py

Skip test until investigation

2020-11-11 12:59:40 -05:00

test_hf_argparser.py

Smarter prediction loop and no- -> no_ in console args (#8151 )

2020-10-29 10:56:25 -04:00

test_logging.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_model_card.py

GPU text generation: Moved the encoded_prompt to correct device

2020-01-06 15:11:12 +01:00

test_model_output.py

Add tests and fix various bugs in ModelOutput (#7073 )

2020-09-11 12:01:33 -04:00

test_modeling_albert.py

Support various BERT relative position embeddings (2nd) (#8276 )

2020-11-24 14:40:53 +01:00

test_modeling_auto.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_bart.py

Fix slow tests v2 (#8746 )

2020-11-24 09:35:12 -05:00

test_modeling_bert_generation.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_bert.py

Support various BERT relative position embeddings (2nd) (#8276 )

2020-11-24 14:40:53 +01:00

test_modeling_blenderbot.py

Refactoring the generate() function (#6949 )

2020-11-03 16:04:22 +01:00

test_modeling_camembert.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_common.py

Model parallel tests should return, not pass in non model parallel settings. (#8825 )

2020-11-27 16:41:29 -05:00

test_modeling_ctrl.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_deberta.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_distilbert.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_dpr.py

Fix dpr<>bart config for RAG (#8808 )

2020-11-27 16:26:45 +01:00

test_modeling_electra.py

Support various BERT relative position embeddings (2nd) (#8276 )

2020-11-24 14:40:53 +01:00

test_modeling_encoder_decoder.py

Fix bug in x-attentions output for roberta and harden test to catch it (#8660 )

2020-11-23 13:28:29 +01:00

test_modeling_flaubert.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_flax_bert.py

Attempt to fix Flax CI error(s) (#8829 )

2020-11-30 13:43:17 -05:00

test_modeling_flax_roberta.py

Attempt to fix Flax CI error(s) (#8829 )

2020-11-30 13:43:17 -05:00

test_modeling_fsmt.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_funnel.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_gpt2.py

gpt2 and t5 parallel modeling (#8696 )

2020-11-23 14:41:23 -05:00

test_modeling_layoutlm.py

Support various BERT relative position embeddings (2nd) (#8276 )

2020-11-24 14:40:53 +01:00

test_modeling_longformer.py

Return correct Bart hidden state tensors (#8747 )

2020-11-25 22:06:04 +01:00

test_modeling_lxmert.py

Return correct Bart hidden state tensors (#8747 )

2020-11-25 22:06:04 +01:00

test_modeling_marian.py

consistent ignore keys + make private (#8737 )

2020-11-23 12:33:13 -08:00

test_modeling_mbart.py

Fix slow tests v2 (#8746 )

2020-11-24 09:35:12 -05:00

test_modeling_mobilebert.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_mt5.py

T5 & mT5 (#8552 )

2020-11-17 12:23:09 +01:00

test_modeling_openai.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_pegasus.py

consistent ignore keys + make private (#8737 )

2020-11-23 12:33:13 -08:00

test_modeling_prophetnet.py

Return correct Bart hidden state tensors (#8747 )

2020-11-25 22:06:04 +01:00

test_modeling_rag.py

Add T5 Encoder for Feature Extraction (#8717 )

2020-11-30 08:34:40 +01:00

test_modeling_reformer.py

Return correct Bart hidden state tensors (#8747 )

2020-11-25 22:06:04 +01:00

test_modeling_roberta.py

Support various BERT relative position embeddings (2nd) (#8276 )

2020-11-24 14:40:53 +01:00

test_modeling_squeezebert.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_t5.py

Add T5 Encoder for Feature Extraction (#8717 )

2020-11-30 08:34:40 +01:00

test_modeling_tf_albert.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_auto.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_bart.py

New TF model inputs (#8602 )

2020-11-24 13:55:00 -05:00

test_modeling_tf_bert.py

TF BERT test update

2020-11-23 18:20:19 -05:00

test_modeling_tf_blenderbot.py

New TF model inputs (#8602 )

2020-11-24 13:55:00 -05:00

test_modeling_tf_camembert.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_tf_common.py

New TF model inputs (#8602 )

2020-11-24 13:55:00 -05:00

test_modeling_tf_ctrl.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_distilbert.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_dpr.py

Fix a bunch of slow tests (#8634 )

2020-11-19 10:41:41 -05:00

test_modeling_tf_electra.py

Fix a bunch of slow tests (#8634 )

2020-11-19 10:41:41 -05:00

test_modeling_tf_flaubert.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_tf_funnel.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_gpt2.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_longformer.py

Tf longformer for sequence classification (#8231 )

2020-11-19 10:37:27 -05:00

test_modeling_tf_lxmert.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_marian.py

TFMarian, TFMbart, TFPegasus, TFBlenderbot (#7987 )

2020-10-30 11:23:16 -04:00

test_modeling_tf_mbart.py

TFMarian, TFMbart, TFPegasus, TFBlenderbot (#7987 )

2020-10-30 11:23:16 -04:00

test_modeling_tf_mobilebert.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_mt5.py

T5 & mT5 (#8552 )

2020-11-17 12:23:09 +01:00

test_modeling_tf_openai.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_pegasus.py

TFMarian, TFMbart, TFPegasus, TFBlenderbot (#7987 )

2020-10-30 11:23:16 -04:00

test_modeling_tf_pytorch.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_roberta.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_tf_t5.py

Add T5 Encoder for Feature Extraction (#8717 )

2020-11-30 08:34:40 +01:00

test_modeling_tf_transfo_xl.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_tf_xlm_roberta.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_tf_xlm.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_tf_xlnet.py

[XLNet] Fix mems behavior (#8567 )

2020-11-25 16:54:59 -05:00

test_modeling_transfo_xl.py

Return correct Bart hidden state tensors (#8747 )

2020-11-25 22:06:04 +01:00

test_modeling_xlm_prophetnet.py

Ci test tf super slow (#8007 )

2020-10-30 10:25:48 -04:00

test_modeling_xlm_roberta.py

Switch return_dict to True by default. (#8530 )

2020-11-16 11:43:00 -05:00

test_modeling_xlm.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_modeling_xlnet.py

[XLNet] Fix mems behavior (#8567 )

2020-11-25 16:54:59 -05:00

test_onnx.py

Ci test tf super slow (#8007 )

2020-10-30 10:25:48 -04:00

test_optimization_tf.py

Update repo to isort v5 (#6686 )

2020-08-24 11:03:01 -04:00

test_optimization.py

Format

2020-08-27 18:31:51 +02:00

test_pipelines_common.py

[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073 )

2020-11-15 22:50:59 +01:00

test_pipelines_conversational.py

Updated ConversationalPipeline to work with encoder-decoder models (#8207 )

2020-11-03 10:33:01 -05:00

test_pipelines_feature_extraction.py

[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970 )

2020-10-23 15:58:19 +02:00

test_pipelines_fill_mask.py

Remove deprecated (#8604 )

2020-11-17 15:11:29 -05:00

test_pipelines_ner.py

NerPipeline (TokenClassification) now outputs offsets of words (#8781 )

2020-11-30 14:05:08 -05:00

test_pipelines_question_answering.py

Fix QA argument handler (#8765 )

2020-11-25 14:02:15 -05:00

test_pipelines_sentiment_analysis.py

[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970 )

2020-10-23 15:58:19 +02:00

test_pipelines_summarization.py

[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970 )

2020-10-23 15:58:19 +02:00

test_pipelines_text2text_generation.py

[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970 )

2020-10-23 15:58:19 +02:00

test_pipelines_text_generation.py

[FIX] TextGenerationPipeline is currently broken. (#8256 )

2020-11-03 10:10:22 -05:00

test_pipelines_translation.py

[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970 )

2020-10-23 15:58:19 +02:00

test_pipelines_zero_shot.py

[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073 )

2020-11-15 22:50:59 +01:00

test_retrieval_rag.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_skip_decorators.py

[testing] rename skip targets + docs (#7863 )

2020-10-20 04:39:13 -04:00

test_tokenization_albert.py

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 )

2020-10-18 20:51:24 +02:00

test_tokenization_auto.py

MT5 should have an autotokenizer (#8743 )

2020-11-24 09:50:25 -05:00

test_tokenization_bart.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_barthez.py

Add barthez model (#8393 )

2020-11-27 12:31:42 -05:00

test_tokenization_bert_generation.py

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 )

2020-10-18 20:51:24 +02:00

test_tokenization_bert_japanese.py

Improve bert-japanese tokenizer handling (#8659 )

2020-11-23 11:15:02 -05:00

test_tokenization_bert.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_bertweet.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_blenderbot.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_camembert.py

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 )

2020-10-18 20:51:24 +02:00

test_tokenization_common.py

Tokenizers should be framework agnostic (#8599 )

2020-11-17 14:03:03 -05:00

test_tokenization_ctrl.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_deberta.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_distilbert.py

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 )

2020-10-18 20:51:24 +02:00

test_tokenization_dpr.py

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 )

2020-10-18 20:51:24 +02:00

test_tokenization_fsmt.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_funnel.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_gpt2.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_herbert.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_layoutlm.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_lxmert.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_marian.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_mbart.py

Add sentencepiece to the CI and fix tests (#8672 )

2020-11-19 16:44:20 -05:00

test_tokenization_openai.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_pegasus.py

[Pegasus] Refactor Tokenizer (#8731 )

2020-11-29 16:57:43 +01:00

test_tokenization_phobert.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_prophetnet.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_rag.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_reformer.py

[Reformer] remove reformer pad_token_id (#7991 )

2020-10-23 10:29:15 -04:00

test_tokenization_roberta.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_squeezebert.py

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 )

2020-10-18 20:51:24 +02:00

test_tokenization_t5.py

fix t5 token type ids (#8437 )

2020-11-10 14:21:54 -05:00

test_tokenization_transfo_xl.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_utils.py

[tokenizers] convert_to_tensors: don't reconvert when the type is already right (#8283 )

2020-11-19 12:06:01 -08:00

test_tokenization_xlm_prophetnet.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_xlm_roberta.py

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 )

2020-10-18 20:51:24 +02:00

test_tokenization_xlm.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_tokenization_xlnet.py

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 )

2020-10-18 20:51:24 +02:00

test_trainer_callback.py

Fix a bug for CallbackHandler.callback_list (#8052 )

2020-10-27 10:37:04 -04:00

test_trainer_distributed.py

using multi_gpu consistently (#8446 )

2020-11-10 13:23:58 -05:00

test_trainer_tpu.py

Add predict step accumulation (#7767 )

2020-10-14 11:41:45 -04:00

test_trainer_utils.py

Add predict step accumulation (#7767 )

2020-10-14 11:41:45 -04:00

test_trainer.py

Add early stopping callback to pytorch trainer (#8581 )

2020-11-23 17:25:35 -05:00

test_utils_check_copies.py

Reorganize repo (#8580 )

2020-11-16 21:43:42 -05:00

test_versions_utils.py

[core] implement support for run-time dependency version checking (#8645 )

2020-11-24 13:22:25 -05:00