Add TokenClassification for Mistral, Mixtral and Qwen2 (#29878)
* Add MistralForTokenClassification
* Add tests and docs
* Add token classification for Mixtral and Qwen2
* Save llama for token classification draft
* Add token classification support for Llama, Gemma, Persimmon, StableLm and StarCoder2
* Formatting
* Add token classification support for Qwen2Moe model
* Add dropout layer to each ForTokenClassification model
* Add copied from in tests
* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Propagate suggested changes
* Style

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
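
For orientation, the head that a `*ForTokenClassification` class adds can be pictured as dropout followed by a per-token projection from the hidden size to the number of labels. The following is a minimal plain-Python sketch of that idea, not the library code: the function names, the toy weights, and the single linear projection are assumptions made for illustration, and only the dropout step is named in the commit message.

```python
import random


def dropout(vector, rate, training, rng):
    """Inverted dropout: zero each element with probability `rate` while training."""
    if not training or rate == 0.0:
        return list(vector)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in vector]


def token_classification_head(hidden_states, weight, bias,
                              rate=0.1, training=False, seed=0):
    """Map each token's hidden vector to one score per label.

    hidden_states: list of token vectors, each of length hidden_size
    weight:        num_labels rows, each of length hidden_size
    bias:          list of num_labels floats
    """
    rng = random.Random(seed)
    logits = []
    for token_vector in hidden_states:
        dropped = dropout(token_vector, rate, training, rng)
        logits.append([
            sum(w * x for w, x in zip(row, dropped)) + b
            for row, b in zip(weight, bias)
        ])
    return logits


def predict_labels(logits):
    """Pick the highest-scoring label index for every token."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


# Toy input: 3 tokens, hidden_size = 2, num_labels = 3 (all values hypothetical).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, -0.5]

logits = token_classification_head(hidden_states, weight, bias)
predictions = predict_labels(logits)
print(predictions)  # [0, 1, 2]
```

The point of the sketch is the output shape: one score vector per input token, rather than one per sequence as in the `*ForSequenceClassification` classes listed alongside it in the diff below.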
@@ -60,6 +60,11 @@ This model was contributed by [Arthur Zucker](https://huggingface.co/ArthurZ), [
[[autodoc]] GemmaForSequenceClassification
    - forward

## GemmaForTokenClassification

[[autodoc]] GemmaForTokenClassification
    - forward

## FlaxGemmaModel

[[autodoc]] FlaxGemmaModel

@@ -121,6 +121,11 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
[[autodoc]] LlamaForQuestionAnswering
    - forward

## LlamaForTokenClassification

[[autodoc]] LlamaForTokenClassification
    - forward

## FlaxLlamaModel

[[autodoc]] FlaxLlamaModel

@@ -203,6 +203,11 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
[[autodoc]] MistralForSequenceClassification
    - forward

## MistralForTokenClassification

[[autodoc]] MistralForTokenClassification
    - forward

## FlaxMistralModel

[[autodoc]] FlaxMistralModel

@@ -204,3 +204,8 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
[[autodoc]] MixtralForSequenceClassification
    - forward

## MixtralForTokenClassification

[[autodoc]] MixtralForTokenClassification
    - forward

@@ -96,3 +96,8 @@ The `LlamaTokenizer` is used as it is a standard wrapper around sentencepiece. T
[[autodoc]] PersimmonForSequenceClassification
    - forward

## PersimmonForTokenClassification

[[autodoc]] PersimmonForTokenClassification
    - forward

@@ -80,3 +80,8 @@ In the following, we demonstrate how to use `Qwen2-7B-Chat-beta` for the inferen
[[autodoc]] Qwen2ForSequenceClassification
    - forward

## Qwen2ForTokenClassification

[[autodoc]] Qwen2ForTokenClassification
    - forward

@@ -75,3 +75,8 @@ In the following, we demonstrate how to use `Qwen1.5-MoE-A2.7B-Chat` for the inf
[[autodoc]] Qwen2MoeForSequenceClassification
    - forward

## Qwen2MoeForTokenClassification

[[autodoc]] Qwen2MoeForTokenClassification
    - forward

@@ -104,3 +104,8 @@ Now, to run the model with Flash Attention 2, refer to the snippet below:
[[autodoc]] StableLmForSequenceClassification
    - forward

## StableLmForTokenClassification

[[autodoc]] StableLmForTokenClassification
    - forward

@@ -66,3 +66,8 @@ These ready-to-use checkpoints can be downloaded and used via the HuggingFace Hu
[[autodoc]] Starcoder2ForSequenceClassification
    - forward

## Starcoder2ForTokenClassification

[[autodoc]] Starcoder2ForTokenClassification
    - forward