Docs / Quantization: Replace all occurences of load_in_8bit with bnb config (#31136)

Replace all occurences of `load_in_8bit` with bnb config
This commit is contained in:
Younes Belkada
2024-05-30 16:47:35 +02:00
committed by GitHub
parent cda9c82a63
commit f5590deaa8
11 changed files with 37 additions and 37 deletions

View File

@@ -360,12 +360,12 @@ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", load_i
```python
# pip install transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
model_id = "bigscience/bloom-1b7"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=BitsAndBytesConfig(load_in_8bit=True))
```
然后,像通常使用 `PreTrainedModel` 一样使用您的模型。
@@ -441,9 +441,9 @@ model_double_quant = AutoModelForCausalLM.from_pretrained(model_id, quantization
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m", device_map="auto", load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m", quantization_config=BitsAndBytesConfig(load_in_8bit=True))
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model.push_to_hub("bloom-560m-8bit")

View File

@@ -86,10 +86,10 @@ model.load_adapter(peft_model_id)
`bitsandbytes`集成支持8bit和4bit精度数据类型这对于加载大模型非常有用因为它可以节省内存请参阅`bitsandbytes`[指南](./quantization#bitsandbytes-integration)以了解更多信息)。要有效地将模型分配到您的硬件,请在[`~PreTrainedModel.from_pretrained`]中添加`load_in_8bit``load_in_4bit`参数,并将`device_map="auto"`设置为:
```py
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
peft_model_id = "ybelkada/opt-350m-lora"
model = AutoModelForCausalLM.from_pretrained(peft_model_id, device_map="auto", load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(peft_model_id, quantization_config=BitsAndBytesConfig(load_in_8bit=True))
```
## 添加新的adapter