More fixes for doctest (#30265)
* fix * update * update * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
This commit is contained in:
@@ -97,6 +97,7 @@ If you only want the infilled part:
|
||||
|
||||
>>> generator = pipeline("text-generation",model="codellama/CodeLlama-7b-hf",torch_dtype=torch.float16, device_map="auto")
|
||||
>>> generator('def remove_non_ascii(s: str) -> str:\n """ <FILL_ME>\n return result', max_new_tokens = 128)
|
||||
[{'generated_text': 'def remove_non_ascii(s: str) -> str:\n """ <FILL_ME>\n return resultRemove non-ASCII characters from a string. """\n result = ""\n for c in s:\n if ord(c) < 128:\n result += c'}]
|
||||
```
|
||||
|
||||
Under the hood, the tokenizer [automatically splits by `<FILL_ME>`](https://huggingface.co/docs/transformers/main/model_doc/code_llama#transformers.CodeLlamaTokenizer.fill_token) to create a formatted input string that follows [the original training pattern](https://github.com/facebookresearch/codellama/blob/cb51c14ec761370ba2e2bc351374a79265d0465e/llama/generation.py#L402). This is more robust than preparing the pattern yourself: it avoids pitfalls, such as token glueing, that are very hard to debug. To see how much CPU and GPU memory you need for this model or others, try [this calculator](https://huggingface.co/spaces/hf-accelerate/model-memory-usage) which can help determine that value.
|
||||
|
||||
@@ -270,11 +270,13 @@ For example, if you use this [invoice image](https://huggingface.co/spaces/impir
|
||||
>>> from transformers import pipeline
|
||||
|
||||
>>> vqa = pipeline(model="impira/layoutlm-document-qa")
|
||||
>>> vqa(
|
||||
>>> output = vqa(
|
||||
... image="https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png",
|
||||
... question="What is the invoice number?",
|
||||
... )
|
||||
[{'score': 0.42515, 'answer': 'us-001', 'start': 16, 'end': 16}]
|
||||
>>> output[0]["score"] = round(output[0]["score"], 3)
|
||||
>>> output
|
||||
[{'score': 0.425, 'answer': 'us-001', 'start': 16, 'end': 16}]
|
||||
```
|
||||
|
||||
<Tip>
|
||||
|
||||
@@ -326,7 +326,7 @@ Document question answering is a task that answers natural language questions fr
|
||||
>>> from PIL import Image
|
||||
>>> import requests
|
||||
|
||||
>>> url = "https://datasets-server.huggingface.co/assets/hf-internal-testing/example-documents/--/hf-internal-testing--example-documents/test/2/image/image.jpg"
|
||||
>>> url = "https://huggingface.co/datasets/hf-internal-testing/example-documents/resolve/main/jpeg_images/2.jpg"
|
||||
>>> image = Image.open(requests.get(url, stream=True).raw)
|
||||
|
||||
>>> doc_question_answerer = pipeline("document-question-answering", model="magorshunov/layoutlm-invoices")
|
||||
|
||||
Reference in New Issue
Block a user