Add TokenClassification for Mistral, Mixtral and Qwen2 (#29878)

* Add MistralForTokenClassification

* Add tests and docs

* Add token classification for Mixtral and Qwen2

* Save llama for token classification draft

* Add token classification support for Llama, Gemma, Persimmon, StableLm and StarCoder2

* Formatting

* Add token classification support for Qwen2Moe model

* Add dropout layer to each ForTokenClassification model

* Add copied from in tests

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Propagate suggested changes

* Style

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Joseph Enguehard
2024-05-20 09:06:57 +01:00
committed by GitHub
parent 481a957814
commit 07bf2dff78
39 changed files with 1174 additions and 19 deletions

@@ -60,6 +60,11 @@ This model was contributed by [Arthur Zucker](https://huggingface.co/ArthurZ), [
[[autodoc]] GemmaForSequenceClassification
    - forward

## GemmaForTokenClassification

[[autodoc]] GemmaForTokenClassification
    - forward

## FlaxGemmaModel

[[autodoc]] FlaxGemmaModel

@@ -121,6 +121,11 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
[[autodoc]] LlamaForQuestionAnswering
    - forward

## LlamaForTokenClassification

[[autodoc]] LlamaForTokenClassification
    - forward

## FlaxLlamaModel

[[autodoc]] FlaxLlamaModel

@@ -203,6 +203,11 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
[[autodoc]] MistralForSequenceClassification
    - forward

## MistralForTokenClassification

[[autodoc]] MistralForTokenClassification
    - forward

## FlaxMistralModel

[[autodoc]] FlaxMistralModel

@@ -204,3 +204,8 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
[[autodoc]] MixtralForSequenceClassification
    - forward

## MixtralForTokenClassification

[[autodoc]] MixtralForTokenClassification
    - forward

@@ -96,3 +96,8 @@ The `LlamaTokenizer` is used as it is a standard wrapper around sentencepiece. T
[[autodoc]] PersimmonForSequenceClassification
    - forward

## PersimmonForTokenClassification

[[autodoc]] PersimmonForTokenClassification
    - forward

@@ -80,3 +80,8 @@ In the following, we demonstrate how to use `Qwen2-7B-Chat-beta` for the inferen
[[autodoc]] Qwen2ForSequenceClassification
    - forward

## Qwen2ForTokenClassification

[[autodoc]] Qwen2ForTokenClassification
    - forward

@@ -75,3 +75,8 @@ In the following, we demonstrate how to use `Qwen1.5-MoE-A2.7B-Chat` for the inf
[[autodoc]] Qwen2MoeForSequenceClassification
    - forward

## Qwen2MoeForTokenClassification

[[autodoc]] Qwen2MoeForTokenClassification
    - forward

@@ -104,3 +104,8 @@ Now, to run the model with Flash Attention 2, refer to the snippet below:
[[autodoc]] StableLmForSequenceClassification
    - forward

## StableLmForTokenClassification

[[autodoc]] StableLmForTokenClassification
    - forward

@@ -66,3 +66,8 @@ These ready-to-use checkpoints can be downloaded and used via the HuggingFace Hu
[[autodoc]] Starcoder2ForSequenceClassification
    - forward

## Starcoder2ForTokenClassification

[[autodoc]] Starcoder2ForTokenClassification
    - forward
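
The commit message mentions adding a dropout layer to each `ForTokenClassification` model. For readers unfamiliar with what such a head computes, here is a framework-free sketch of the general idea: apply dropout to each token's hidden vector, then project it to one logit per label. This is not the code from this commit. Every name in it (`dropout`, `token_classification_head`, `predict_labels`, the toy weights, the label map, and the default rate of `0.1`) is a hypothetical choice made for illustration only.

```python
import random
from typing import Dict, List, Optional

Vector = List[float]


def dropout(vec: Vector, p: float, training: bool, rng: random.Random) -> Vector:
    """Inverted dropout: zero each element with probability p, rescale the rest."""
    if not training or p == 0.0:
        return list(vec)  # dropout is a no-op at inference time
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in vec]


def token_classification_head(
    hidden_states: List[Vector],  # one vector per token: (seq_len, hidden_size)
    weight: List[Vector],         # (num_labels, hidden_size)
    bias: Vector,                 # (num_labels,)
    p: float = 0.1,               # illustrative dropout rate (assumption)
    training: bool = False,
    rng: Optional[random.Random] = None,
) -> List[Vector]:
    """Return per-token logits of shape (seq_len, num_labels)."""
    rng = rng or random.Random(0)
    logits = []
    for h in hidden_states:
        h = dropout(h, p, training, rng)
        # Linear projection: one dot product per label, plus that label's bias.
        logits.append(
            [sum(w_i * h_i for w_i, h_i in zip(row, h)) + b
             for row, b in zip(weight, bias)]
        )
    return logits


def predict_labels(logits: List[Vector], id2label: Dict[int, str]) -> List[str]:
    """Pick the highest-scoring label independently for every token."""
    return [id2label[max(range(len(row)), key=row.__getitem__)] for row in logits]


# Toy inputs (hypothetical): 3 tokens, hidden_size=2, num_labels=2.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, -1.0], [-1.0, 1.0]]
bias = [0.0, 0.5]

logits = token_classification_head(hidden_states, weight, bias)  # eval mode
labels = predict_labels(logits, {0: "O", 1: "B-ENT"})
print(labels)  # ['O', 'B-ENT', 'B-ENT']
```

Unlike a sequence classification head, which produces one prediction per input, this sketch keeps one row of logits per token, which is what tasks such as named-entity recognition need.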