[Data2Vec] Add data2vec vision (#16760)
* save intermediate
* add vision
* add vision
* save
* finish models
* finish models
* continue
* finish
* up
* up
* up
* tests all pass
* clean up
* up
* up
* fix bugs in beit
* correct docs
* finish
* finish docs
* make style
* up
* more fixes
* fix type hint
* make style
* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/data2vec/test_modeling_data2vec_vision.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix test

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
committed by GitHub
parent 33cd4be576
commit 8d3f952adb
@@ -190,6 +190,7 @@ Flax), PyTorch, and/or TensorFlow.
| CTRL | ✅ | ❌ | ✅ | ✅ | ❌ |
| Data2VecAudio | ❌ | ❌ | ✅ | ❌ | ❌ |
| Data2VecText | ❌ | ❌ | ✅ | ❌ | ❌ |
| Data2VecVision | ❌ | ❌ | ✅ | ❌ | ❌ |
| DeBERTa | ✅ | ✅ | ✅ | ✅ | ❌ |
| DeBERTa-v2 | ✅ | ❌ | ✅ | ✅ | ❌ |
| Decision Transformer | ❌ | ❌ | ✅ | ❌ | ❌ |
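The support rows in this hunk can be read mechanically. The following is a minimal Python sketch that turns one row into per-column flags. The column names are an assumption (the hunk shows only rows, not the table header), and `parse_support_row` is a hypothetical helper written for illustration, not part of the library.

```python
from __future__ import annotations

# Assumed column order; the hunk itself does not show the table header.
COLUMNS = [
    "Tokenizer slow",
    "Tokenizer fast",
    "PyTorch support",
    "TensorFlow support",
    "Flax support",
]


def parse_support_row(row: str) -> tuple[str, dict[str, bool]]:
    """Split a Markdown table row into (model name, {column: supported})."""
    # Drop the outer pipes, then split the remaining text into cells.
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    name, marks = cells[0], cells[1:]
    if len(marks) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} support cells, got {len(marks)}")
    # A check mark means supported; anything else is treated as unsupported.
    return name, {col: mark == "✅" for col, mark in zip(COLUMNS, marks)}


name, support = parse_support_row("| Data2VecVision | ❌ | ❌ | ✅ | ❌ | ❌ |")
supported = [col for col, ok in support.items() if ok]
print(name, supported)
```

Under the assumed column order, the new Data2VecVision row marks a single supported column, the third one.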
@@ -33,10 +33,13 @@ Models and code are available at www.github.com/pytorch/fairseq/tree/master/exam

Tips:

- Both Data2VecAudio and Data2VecText have been trained using the same self-supervised learning method.
In the case of Data2VecAudio, preprocessing is identical to [`RobertaModel`], including tokenization.
- Data2VecAudio, Data2VecText, and Data2VecVision have all been trained using the same self-supervised learning method.
- For Data2VecAudio, preprocessing is identical to [`Wav2Vec2Model`], including feature extraction
- For Data2VecText, preprocessing is identical to [`RobertaModel`], including tokenization.
- For Data2VecVision, preprocessing is identical to [`BeitModel`], including feature extraction.

This model was contributed by [edugp](https://huggingface.co/edugp) and [patrickvonplaten](https://huggingface.co/patrickvonplaten)

This model was contributed by [edugp](https://huggingface.co/edugp).
The original code can be found [here](https://github.com/pytorch/fairseq/tree/main/examples/data2vec).

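The updated tips in this hunk pair each Data2Vec modality with an existing model whose preprocessing it reuses. A minimal sketch of that pairing as a lookup table follows. The reference models come from the tips; the preprocessor class names (`Wav2Vec2FeatureExtractor`, `RobertaTokenizer`, `BeitFeatureExtractor`) are assumptions inferred from those reference models, and both helper functions are hypothetical.

```python
# Reference models are taken from the tips; preprocessor class names are
# assumptions inferred from them, not something the diff states.
PREPROCESSING = {
    "Data2VecAudio": ("Wav2Vec2Model", "Wav2Vec2FeatureExtractor"),
    "Data2VecText": ("RobertaModel", "RobertaTokenizer"),
    "Data2VecVision": ("BeitModel", "BeitFeatureExtractor"),
}


def preprocessor_name(model_class: str) -> str:
    """Return the assumed preprocessor class name for a Data2Vec model class."""
    for prefix, (_reference_model, preprocessor) in PREPROCESSING.items():
        if model_class.startswith(prefix):
            return preprocessor
    raise KeyError(f"no preprocessing entry for {model_class!r}")


def load_preprocessor(model_class: str, checkpoint: str):
    """Instantiate the preprocessor; needs `transformers` and a real checkpoint."""
    import transformers  # imported lazily so the lookup above works without it

    cls = getattr(transformers, preprocessor_name(model_class))
    return cls.from_pretrained(checkpoint)


print(preprocessor_name("Data2VecVisionForImageClassification"))
```

`load_preprocessor` is not called here because it depends on an installed `transformers` version that still exposes these classes and on a checkpoint name supplied by the caller.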
@@ -48,12 +51,16 @@ The original code can be found [here](https://github.com/pytorch/fairseq/tree/ma

[[autodoc]] Data2VecAudioConfig

## Data2VecVisionConfig

[[autodoc]] Data2VecVisionConfig


## Data2VecAudioModel

[[autodoc]] Data2VecAudioModel
- forward


## Data2VecAudioForAudioFrameClassification

[[autodoc]] Data2VecAudioForAudioFrameClassification
@@ -108,3 +115,18 @@ The original code can be found [here](https://github.com/pytorch/fairseq/tree/ma

[[autodoc]] Data2VecTextForQuestionAnswering
- forward

## Data2VecVisionModel

[[autodoc]] Data2VecVisionModel
- forward

## Data2VecVisionForImageClassification

[[autodoc]] Data2VecVisionForImageClassification
- forward

## Data2VecVisionForSemanticSegmentation

[[autodoc]] Data2VecVisionForSemanticSegmentation
- forward
@@ -54,6 +54,7 @@ Ready-made configurations include the following architectures:
- BlenderbotSmall
- CamemBERT
- Data2VecText
- Data2VecVision
- DistilBERT
- ELECTRA
- FlauBERT