Deberta tf (#12972)

* TFDeberta

moved weights to build and fixed name scope

added missing ,

bug fixes to enable graph mode execution

updated setup.py

fixing typo

fix imports

embedding mask fix

added layer names avoid autmatic incremental names

+XSoftmax

cleanup

added names to layer

disable keras_serializable
Distangled attention output shape hidden_size==None
using symbolic inputs

test for Deberta tf

make style

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

removed tensorflow-probability

removed blank line

* removed tf experimental api
+torch_gather tf implementation from @Rocketknight1

* layername DeBERTa --> deberta

* copyright fix

* added docs for TFDeberta & make style

* layer_name change to fix load from pt model

* layer_name change as pt model

* SequenceClassification layername change,
to same as pt model

* switched to keras built-in LayerNormalization

* added `TFDeberta` prefix most layer classes

* updated to tf.Tensor in the docstring
This commit is contained in:
Kamal Raj
2021-08-12 14:31:26 +05:30
committed by GitHub
parent c4e1586db8
commit d329b63369
8 changed files with 1988 additions and 3 deletions

View File

@@ -343,7 +343,7 @@ Flax), PyTorch, and/or TensorFlow.
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
| DPR | ✅ | ✅ | ✅ | ✅ | ❌ |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
| DeBERTa | ✅ | ✅ | ✅ | | ❌ |
| DeBERTa | ✅ | ✅ | ✅ | | ❌ |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
| DeBERTa-v2 | ✅ | ❌ | ✅ | ❌ | ❌ |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+