Transformers cli clean command (#37657)

* transformers-cli -> transformers

* Chat command works with positional argument

* update doc references to transformers-cli

* doc headers

* deepspeed

---------

Co-authored-by: Joao Gante <joao@huggingface.co>
This commit is contained in:
Lysandre Debut
2025-04-30 13:15:43 +02:00
committed by GitHub
parent 63cd4c76f3
commit d538293f62
49 changed files with 399 additions and 386 deletions

View File

@@ -35,7 +35,7 @@ The example below demonstrates how to generate code with [`Pipeline`], or the [`
<hfoptions id="usage">
<hfoption id="Pipeline">
```py
import torch
from transformers import pipeline
@@ -76,7 +76,7 @@ prompt = "# Function to calculate the factorial of a number\ndef factorial(n):"
input_ids = tokenizer(prompt, return_tensors="pt").to("cuda")
output = model.generate(
**input_ids,
**input_ids,
max_new_tokens=256,
cache_implementation="static"
)
@@ -92,10 +92,10 @@ print(filled_text)
```
</hfoption>
<hfoption id="transformers-cli">
<hfoption id="transformers CLI">
```bash
echo -e "# Function to calculate the factorial of a number\ndef factorial(n):" | transformers-cli run --task text-generation --model meta-llama/CodeLlama-7b-hf --device 0
echo -e "# Function to calculate the factorial of a number\ndef factorial(n):" | transformers run --task text-generation --model meta-llama/CodeLlama-7b-hf --device 0
```
</hfoption>
@@ -146,7 +146,7 @@ visualizer("""def func(a, b):
- Use the `<FILL_ME>` token where you want your input to be filled. The tokenizer splits this token to create a formatted input string that follows the [original training pattern](https://github.com/facebookresearch/codellama/blob/cb51c14ec761370ba2e2bc351374a79265d0465e/llama/generation.py#L402). This is more robust than preparing the pattern yourself.
```py
from transformers import LlamaForCausalLM, CodeLlamaTokenizer
tokenizer = CodeLlamaTokenizer.from_pretrained("meta-llama/CodeLlama-7b-hf")
model = LlamaForCausalLM.from_pretrained("meta-llama/CodeLlama-7b-hf")
PROMPT = '''def remove_non_ascii(s: str) -> str:
@@ -155,7 +155,7 @@ visualizer("""def func(a, b):
'''
input_ids = tokenizer(PROMPT, return_tensors="pt")["input_ids"]
generated_ids = model.generate(input_ids, max_new_tokens=128)
filling = tokenizer.batch_decode(generated_ids[:, input_ids.shape[1]:], skip_special_tokens = True)[0]
print(PROMPT.replace("<FILL_ME>", filling))
```