support the trocr small models (#14893)

* support the trocr small models

* resolve conflict

* Update docs/source/model_doc/trocr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/model_doc/trocr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/model_doc/trocr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/trocr/processing_trocr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/trocr/processing_trocr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/trocr/processing_trocr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/trocr/processing_trocr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix unexpected indent in processing_trocr.py

* Update src/transformers/models/trocr/processing_trocr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* update the docstring of processing_trocr

* remove extra space

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
This commit is contained in:
Minghao Li
2022-01-10 22:28:03 +08:00
committed by GitHub
parent 42d57549b8
commit b2c477fc6d
2 changed files with 14 additions and 12 deletions

View File

@@ -55,9 +55,9 @@ Tips:
TrOCR's [`VisionEncoderDecoder`] model accepts images as input and makes use of
[`~generation_utils.GenerationMixin.generate`] to autoregressively generate text given the input image.
The [`ViTFeatureExtractor`] class is responsible for preprocessing the input image and
[`RobertaTokenizer`] decodes the generated target tokens to the target string. The
[`TrOCRProcessor`] wraps [`ViTFeatureExtractor`] and [`RobertaTokenizer`]
The [`ViTFeatureExtractor`/`DeiTFeatureExtractor`] class is responsible for preprocessing the input image and
[`RobertaTokenizer`/`XLMRobertaTokenizer`] decodes the generated target tokens to the target string. The
[`TrOCRProcessor`] wraps [`ViTFeatureExtractor`/`DeiTFeatureExtractor`] and [`RobertaTokenizer`/`XLMRobertaTokenizer`]
into a single instance to both extract the input features and decode the predicted token ids.
- Step-by-step Optical Character Recognition (OCR)