[Port] TensorFlow implementation of Mistral (#29708)
* chore: initial commit
* chore: adding imports and inits
* chore: adding the causal and classification code
* chore: adding names to the layers
* chore: using single self attn layer
* chore: built the model and layers
* chore: start with testing
* chore: docstring change, transpose fix
* fix: rotary embedding
* chore: adding cache implementation
* remove unused torch
* chore: fixing the indexing issue
* make fix-copies
* Use modeling_tf_utils.keras
* make fixup
* chore: fixing tests
* chore: adding past key value logic
* chore: adding multi label classification test
* fix: switching on the built parameters in the layers
* fixing repo consistency
* ruff formats
* style changes
* fix: tf and pt equivalence
* removing returns from docstrings
* fix docstrings
* fix docstrings
* removing todos
* fix copies
* fix docstring
* fix docstring
* chore: using easier rotate_half
* adding integration tests
* chore: addressing review related to rotary embedding layer
* review changes
* [run-slow] mistral
* skip: test save load after resize token embedding
* style

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
Committed by: GitHub
Parent: 2a89673fe5
Commit: 965e98dc54
@@ -200,7 +200,7 @@ Flax), PyTorch, and/or TensorFlow.
 | [Megatron-BERT](model_doc/megatron-bert) | ✅ | ❌ | ❌ |
 | [Megatron-GPT2](model_doc/megatron_gpt2) | ✅ | ✅ | ✅ |
 | [MGP-STR](model_doc/mgp-str) | ✅ | ❌ | ❌ |
-| [Mistral](model_doc/mistral) | ✅ | ❌ | ✅ |
+| [Mistral](model_doc/mistral) | ✅ | ✅ | ✅ |
 | [Mixtral](model_doc/mixtral) | ✅ | ❌ | ❌ |
 | [mLUKE](model_doc/mluke) | ✅ | ❌ | ❌ |
 | [MMS](model_doc/mms) | ✅ | ✅ | ✅ |
@@ -216,4 +216,19 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
 ## FlaxMistralForCausalLM
 
 [[autodoc]] FlaxMistralForCausalLM
-    - __call__
+    - __call__
+
+## TFMistralModel
+
+[[autodoc]] TFMistralModel
+    - call
+
+## TFMistralForCausalLM
+
+[[autodoc]] TFMistralForCausalLM
+    - call
+
+## TFMistralForSequenceClassification
+
+[[autodoc]] TFMistralForSequenceClassification
+    - call