[LayoutLMv3] Add TensorFlow implementation (#18678)
Co-authored-by: Esben Toke Christensen <esben.christensen@visma.com> Co-authored-by: Lasse Reedtz <lasse.reedtz@visma.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
This commit is contained in:
committed by
GitHub
parent
7320d95d98
commit
de8548ebf3
@@ -26,18 +26,18 @@ Tips:
|
||||
|
||||
- In terms of data processing, LayoutLMv3 is identical to its predecessor [LayoutLMv2](layoutlmv2), except that:
|
||||
- images need to be resized and normalized with channels in regular RGB format. LayoutLMv2 on the other hand normalizes the images internally and expects the channels in BGR format.
|
||||
- text is tokenized using byte-pair encoding (BPE), as opposed to WordPiece.
|
||||
- text is tokenized using byte-pair encoding (BPE), as opposed to WordPiece.
|
||||
Due to these differences in data preprocessing, one can use [`LayoutLMv3Processor`] which internally combines a [`LayoutLMv3FeatureExtractor`] (for the image modality) and a [`LayoutLMv3Tokenizer`]/[`LayoutLMv3TokenizerFast`] (for the text modality) to prepare all data for the model.
|
||||
- Regarding usage of [`LayoutLMv3Processor`], we refer to the [usage guide](layoutlmv2#usage-layoutlmv2processor) of its predecessor.
|
||||
- Regarding usage of [`LayoutLMv3Processor`], we refer to the [usage guide](layoutlmv2#usage-layoutlmv2processor) of its predecessor.
|
||||
- Demo notebooks for LayoutLMv3 can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/LayoutLMv3).
|
||||
- Demo scripts can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/layoutlmv3).
|
||||
|
||||
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/layoutlmv3_architecture.png"
|
||||
alt="drawing" width="600"/>
|
||||
alt="drawing" width="600"/>
|
||||
|
||||
<small> LayoutLMv3 architecture. Taken from the <a href="https://arxiv.org/abs/2204.08387">original paper</a>. </small>
|
||||
|
||||
This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code can be found [here](https://github.com/microsoft/unilm/tree/master/layoutlmv3).
|
||||
This model was contributed by [nielsr](https://huggingface.co/nielsr). The TensorFlow version of this model was added by [chriskoo](https://huggingface.co/chriskoo), [tokec](https://huggingface.co/tokec), and [lre](https://huggingface.co/lre). The original code can be found [here](https://github.com/microsoft/unilm/tree/master/layoutlmv3).
|
||||
|
||||
|
||||
## LayoutLMv3Config
|
||||
@@ -84,3 +84,23 @@ This model was contributed by [nielsr](https://huggingface.co/nielsr). The origi
|
||||
|
||||
[[autodoc]] LayoutLMv3ForQuestionAnswering
|
||||
- forward
|
||||
|
||||
## TFLayoutLMv3Model
|
||||
|
||||
[[autodoc]] TFLayoutLMv3Model
|
||||
- call
|
||||
|
||||
## TFLayoutLMv3ForSequenceClassification
|
||||
|
||||
[[autodoc]] TFLayoutLMv3ForSequenceClassification
|
||||
- call
|
||||
|
||||
## TFLayoutLMv3ForTokenClassification
|
||||
|
||||
[[autodoc]] TFLayoutLMv3ForTokenClassification
|
||||
- call
|
||||
|
||||
## TFLayoutLMv3ForQuestionAnswering
|
||||
|
||||
[[autodoc]] TFLayoutLMv3ForQuestionAnswering
|
||||
- call
|
||||
|
||||
Reference in New Issue
Block a user