Add DocumentQuestionAnswering pipeline (#18414)
* [WIP] Skeleton of VisualQuestionAnweringPipeline extended to support LayoutLM-like models * Fixup * Use the full encoding * Basic refactoring to DocumentQuestionAnsweringPipeline * Cleanup * Improve args, docs, and implement preprocessing * Integrate OCR * Refactor question_answering pipeline * Use refactored QA code in the document qa pipeline * Fix tests * Some small cleanups * Use a string type annotation for Image.Image * Update encoding with image features * Wire through the basic docs * Handle invalid response * Handle empty word_boxes properly * Docstring fix * Integrate Donut model * Fixup * Incorporate comments * Address comments * Initial incorporation of tests * Address Comments * Change assert to ValueError * Comments * Wrap `score` in float to make it JSON serializable * Incorporate AutoModeLForDocumentQuestionAnswering changes * Fixup * Rename postprocess function * Fix auto import * Applying comments * Improve docs * Remove extra assets and add copyright * Address comments Co-authored-by: Ankur Goyal <ankur@impira.com>
This commit is contained in:
@@ -25,6 +25,7 @@ There are two categories of pipeline abstractions to be aware about:
|
||||
- [`AudioClassificationPipeline`]
|
||||
- [`AutomaticSpeechRecognitionPipeline`]
|
||||
- [`ConversationalPipeline`]
|
||||
- [`DocumentQuestionAnsweringPipeline`]
|
||||
- [`FeatureExtractionPipeline`]
|
||||
- [`FillMaskPipeline`]
|
||||
- [`ImageClassificationPipeline`]
|
||||
@@ -342,6 +343,12 @@ That should enable you to do all the custom code you want.
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### DocumentQuestionAnsweringPipeline
|
||||
|
||||
[[autodoc]] DocumentQuestionAnsweringPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### FeatureExtractionPipeline
|
||||
|
||||
[[autodoc]] FeatureExtractionPipeline
|
||||
|
||||
Reference in New Issue
Block a user