Fixing issue where generic model types wouldn't load properly with the pipeline (#18392)

* Adding a better error message when the model is improperly configured within transformers. * Update src/transformers/pipelines/__init__.py * Black version. * Overriding task aliases so that tokenizer+feature_extractor values are correct. * Fixing task aliases by overriding their names early * X. * Fixing feature-extraction. * black again. * Normalizing `translation` too. * Fixing last few corner cases. translation need to use its non normalized name (translation_XX_to_YY, so that the task_specific_params are correctly overloaded). This can be removed and cleaned up in a later PR. `speech-encode-decoder` actually REQUIRES to pass a `tokenizer` manually so the error needs to be discarded when the `tokenizer` is already there. * doc-builder fix. * Fixing the real issue. * Removing dead code. * Do not import the actual config classes.
2022-08-05 08:45:07 +02:00
parent 14928921e2
commit 586dcf6b21
3 changed files with 38 additions and 16 deletions
--- a/tests/pipelines/test_pipelines_automatic_speech_recognition.py
+++ b/tests/pipelines/test_pipelines_automatic_speech_recognition.py
@@ -141,15 +141,8 @@ class AutomaticSpeechRecognitionPipelineTests(unittest.TestCase, metaclass=Pipel

    @require_torch
    def test_small_model_pt_seq2seq(self):
-        model_id = "hf-internal-testing/tiny-random-speech-encoder-decoder"
-        tokenizer = AutoTokenizer.from_pretrained(model_id)
-        feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
-
        speech_recognizer = pipeline(
-            task="automatic-speech-recognition",
-            model=model_id,
-            tokenizer=tokenizer,
-            feature_extractor=feature_extractor,
+            model="hf-internal-testing/tiny-random-speech-encoder-decoder",
            framework="pt",
        )