Add LayoutLM Model (#7064)

* first version * finish test docs readme model/config/tokenization class * apply make style and make quality * fix layoutlm GitHub link * fix conflict in index.rst and add layoutlm to pretrained_models.rst * fix bug in test_parents_and_children_in_mappings * reformat modeling_auto.py and tokenization_auto.py * fix bug in test_modeling_layoutlm.py * Update docs/source/model_doc/layoutlm.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/layoutlm.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove inh, add tokenizer fast, and update some doc * copy and rename necessary class from modeling_bert to modeling_layoutlm * Update src/transformers/configuration_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/configuration_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/configuration_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/configuration_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * add mish to activations.py, import ACT2FN and import logging from utils Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-22 21:28:02 +08:00
parent 244e1b5ba3
commit cd9a0585ea
14 changed files with 1510 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -183,8 +183,9 @@ Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
 24. **[MBart](https://github.com/pytorch/fairseq/tree/master/examples/mbart)** (from Facebook) released with the paper  [Multilingual Denoising Pre-training for Neural Machine Translation](https://arxiv.org/abs/2001.08210) by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.  
 25. **[LXMERT](https://github.com/airsplay/lxmert)** (from UNC Chapel Hill) released with the paper [LXMERT: Learning Cross-Modality Encoder Representations from Transformers for Open-Domain Question Answering](https://arxiv.org/abs/1908.07490) by Hao Tan and Mohit Bansal.
 26. **[Funnel Transformer](https://github.com/laiguokun/Funnel-Transformer)** (from CMU/Google Brain) released with the paper [Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236) by Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
-27. **[Other community models](https://huggingface.co/models)**, contributed by the [community](https://huggingface.co/users).
-28. Want to contribute a new model? We have added a **detailed guide and templates** to guide you in the process of adding a new model. You can find them in the [`templates`](./templates) folder of the repository. Be sure to check the [contributing guidelines](./CONTRIBUTING.md) and contact the maintainers or open an issue to collect feedbacks before starting your PR.
+27. **[LayoutLM](https://github.com/microsoft/unilm/tree/master/layoutlm)** (from Microsoft Research Asia) released with the paper [LayoutLM: Pre-training of Text and Layout for Document Image Understanding](https://arxiv.org/abs/1912.13318) by Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou.
+28. **[Other community models](https://huggingface.co/models)**, contributed by the [community](https://huggingface.co/users).
+29. Want to contribute a new model? We have added a **detailed guide and templates** to guide you in the process of adding a new model. You can find them in the [`templates`](./templates) folder of the repository. Be sure to check the [contributing guidelines](./CONTRIBUTING.md) and contact the maintainers or open an issue to collect feedbacks before starting your PR.

 These implementations have been tested on several datasets (see the example scripts) and should match the performances of the original implementations. You can find more details on the performances in the Examples section of the [documentation](https://huggingface.co/transformers/examples.html).