Trainer - deprecate tokenizer for processing_class (#32385)

* Trainer - deprecate tokenizer for processing_class
* Extend change across Seq2Seq trainer and docs
* Add tests
* Update to FutureWarning and add deprecation version
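
For readers unfamiliar with this kind of change, here is a minimal sketch of how a keyword argument can be deprecated in favor of a new one while emitting a `FutureWarning`. The class name `SketchTrainer`, the warning text, and the conflict check are illustrative assumptions; this is not the code in `transformers`, and the actual deprecation version is not reproduced here.

```python
import warnings


class SketchTrainer:
    """Hypothetical stand-in for a trainer; NOT the transformers implementation."""

    def __init__(self, model=None, processing_class=None, tokenizer=None):
        if tokenizer is not None:
            # Assumption: passing both the old and new argument is treated as an error.
            if processing_class is not None:
                raise ValueError(
                    "Pass `processing_class` only; `tokenizer` is deprecated."
                )
            # FutureWarning is shown to end users by default,
            # unlike DeprecationWarning, which is often filtered out.
            warnings.warn(
                "`tokenizer` is deprecated; use `processing_class` instead.",
                FutureWarning,
                stacklevel=2,
            )
            processing_class = tokenizer
        self.model = model
        self.processing_class = processing_class
```

With this pattern, existing callers that pass `tokenizer=...` keep working and see a warning, while new code passes `processing_class=...` and sees none.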
2024-10-02 14:08:46 +01:00
parent e7c8af7f33
commit b7474f211d
99 changed files with 569 additions and 442 deletions
--- a/docs/source/en/model_doc/mamba.md
+++ b/docs/source/en/model_doc/mamba.md
@@ -39,8 +39,8 @@ The original code can be found [here](https://github.com/state-spaces/mamba).

 # Usage

-### A simple generation example: 
-```python 
+### A simple generation example:
+```python
 from transformers import MambaConfig, MambaForCausalLM, AutoTokenizer
 import torch

@@ -55,7 +55,7 @@ print(tokenizer.batch_decode(out))
 ### Peft finetuning
 The slow version is not very stable for training, and the fast one needs `float32`!

-```python 
+```python
 from datasets import load_dataset
 from trl import SFTTrainer
 from peft import LoraConfig
@@ -80,7 +80,7 @@ lora_config =  LoraConfig(
 )
 trainer = SFTTrainer(
    model=model,
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
    args=training_args,
    peft_config=lora_config,
    train_dataset=dataset,