Allow FP16 or other precision inference for Pipelines (#31342)
* cast image features to model.dtype where needed to support FP16 or other precision in pipelines * Update src/transformers/pipelines/image_feature_extraction.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use .to instead * Add FP16 pipeline support for zeroshot audio classification * Remove unused torch imports * Add docs on FP16 pipeline * Remove unused import * Add FP16 tests to pipeline mixin * Add fp16 placeholder for mask_generation pipeline test * Add FP16 tests for all pipelines * Fix formatting * Remove torch_dtype arg from is_pipeline_test_to_skip* * Fix format * trigger ci --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
This commit is contained in:
@@ -113,7 +113,9 @@ This will work regardless of whether you are using PyTorch or Tensorflow.
|
||||
transcriber = pipeline(model="openai/whisper-large-v2", device=0)
|
||||
```
|
||||
|
||||
If the model is too large for a single GPU and you are using PyTorch, you can set `device_map="auto"` to automatically
|
||||
If the model is too large for a single GPU and you are using PyTorch, you can set `torch_dtype='float16'` to enable FP16 precision inference. Usually this would not cause significant performance drops but make sure you evaluate it on your models!
|
||||
|
||||
Alternatively, you can set `device_map="auto"` to automatically
|
||||
determine how to load and store the model weights. Using the `device_map` argument requires the 🤗 [Accelerate](https://huggingface.co/docs/accelerate)
|
||||
package:
|
||||
|
||||
@@ -342,4 +344,3 @@ gr.Interface.from_pipeline(pipe).launch()
|
||||
|
||||
By default, the web demo runs on a local server. If you'd like to share it with others, you can generate a temporary public
|
||||
link by setting `share=True` in `launch()`. You can also host your demo on [Hugging Face Spaces](https://huggingface.co/spaces) for a permanent link.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user