Add GOT-OCR 2.0 to Transformers (#34721)
* init modular got_ocr2
* Get correct got_ocr architecture
* add processing
* run modular with processing
* add working inference
* apply modular
* Refactor and fix style
* Refactor, cleanup, fix style
* fix init order
* Fix docs
* add base modeling tests
* fix style and consistency
* rename doc file
* fix repo consistency
* fix inference with box
* add image processing and support for crop_to_multi_page
* Fix batch inference
* add tests
* fixup
* fix slow test
* fix docstrings
* Add model doc
* update to new init
* fix input autocast pixel_values dtype
* update doc
* move doc to multimodal
* Reformat crop_image_to_patches and add docstrings
* Fix example in forward docstring
* Address Pablo review
* [run slow] got_ocr2
* remove defaults defined twice
* apply modular
* add torch_device to integration tests
* update modular
* follow-up Pavel review
* add device variable in doc
* fix doc multi-page
* Force eager attention for vision encoder to avoid attn implementation conflict
* revert qwen2vl doc changes
* use Qwen2ForCausalLM instead of Qwen2Model
* make fixup
* refactor gotocr2 to llava style
* uniformize function names and reduce checks
* final nits
* fix pixel_values dtype error
* change checkpoint names
* fix modular
@@ -161,6 +161,7 @@ Flax), PyTorch, and/or TensorFlow.
 | [GIT](model_doc/git) | ✅ | ❌ | ❌ |
 | [GLM](model_doc/glm) | ✅ | ❌ | ❌ |
 | [GLPN](model_doc/glpn) | ✅ | ❌ | ❌ |
+| [GOT-OCR2](model_doc/got_ocr2) | ✅ | ❌ | ❌ |
 | [GPT Neo](model_doc/gpt_neo) | ✅ | ❌ | ✅ |
 | [GPT NeoX](model_doc/gpt_neox) | ✅ | ❌ | ❌ |
 | [GPT NeoX Japanese](model_doc/gpt_neox_japanese) | ✅ | ❌ | ❌ |