Fixing issue where generic model types wouldn't load properly with the pipeline (#18392)

* Adding a better error message when the model is improperly configured
  within transformers.

* Update src/transformers/pipelines/__init__.py

* Black version.

* Overriding task aliases so that tokenizer+feature_extractor
  values are correct.

* Fixing task aliases by overriding their names early

* X.

* Fixing feature-extraction.

* black again.

* Normalizing `translation` too.

* Fixing last few corner cases.

`translation` needs to use its non-normalized name (translation_XX_to_YY)
so that the task_specific_params are correctly overloaded.
This can be removed and cleaned up in a later PR.

`speech-encode-decoder` actually REQUIRES passing a `tokenizer` manually,
so the error needs to be discarded when the `tokenizer` is already
there.

* doc-builder fix.

* Fixing the real issue.

* Removing dead code.

* Do not import the actual config classes.
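
The alias handling described above (aliases overridden early, while `translation_XX_to_YY` keeps its original name for the task_specific_params lookup) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `resolve_task`, the `params_key` return value, and the contents of `TASK_ALIASES` here are all hypothetical.

```python
from typing import Tuple

# Assumed alias table for illustration only; the library defines its own.
TASK_ALIASES = {
    "sentiment-analysis": "text-classification",
    "ner": "token-classification",
}


def resolve_task(task: str) -> Tuple[str, str]:
    """Return (normalized_task, params_key).

    normalized_task is the name used to pick the pipeline implementation;
    params_key is the name used to look up task-specific parameters.
    """
    # Override aliases early so later lookups see one canonical name.
    normalized = TASK_ALIASES.get(task, task)
    params_key = normalized

    # translation_XX_to_YY maps to the generic "translation" task, but the
    # original name is kept as the key for task-specific parameters.
    parts = task.split("_")
    if len(parts) == 4 and parts[0] == "translation" and parts[2] == "to":
        normalized = "translation"
        params_key = task

    return normalized, params_key
```

With this sketch, `resolve_task("translation_en_to_fr")` yields the generic task name for dispatch together with the language-specific key, while aliases such as `ner` collapse to a single canonical name for both.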
Nicolas Patry
2022-08-05 08:45:07 +02:00
committed by GitHub
parent 14928921e2
commit 586dcf6b21
3 changed files with 38 additions and 16 deletions

@@ -141,15 +141,8 @@ class AutomaticSpeechRecognitionPipelineTests(unittest.TestCase, metaclass=Pipel
     @require_torch
     def test_small_model_pt_seq2seq(self):
-        model_id = "hf-internal-testing/tiny-random-speech-encoder-decoder"
-        tokenizer = AutoTokenizer.from_pretrained(model_id)
-        feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
         speech_recognizer = pipeline(
-            task="automatic-speech-recognition",
-            model=model_id,
-            tokenizer=tokenizer,
-            feature_extractor=feature_extractor,
+            model="hf-internal-testing/tiny-random-speech-encoder-decoder",
             framework="pt",
         )