Add Watermarking LogitsProcessor and WatermarkDetector (#29676)
* add watermarking processor * remove the other hashing (context width=1 always) * make style * Update src/transformers/generation/logits_process.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/logits_process.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/logits_process.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/configuration_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * update watermarking process * add detector * update tests to use detector * fix failing tests * rename `input_seq` * make style * doc for processor * minor fixes * docs * make quality * Update src/transformers/generation/configuration_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/logits_process.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/watermarking.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/watermarking.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/watermarking.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * add PR suggestions * let's use lru_cache's default max size (128) * import processor if torch available * maybe like this * lets move the config to torch independet file * add docs * tiny docs fix to make the test happy * Update src/transformers/generation/configuration_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/watermarking.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * PR suggestions * add docs * fix test * fix docs * address pr comments * style * Revert "style" This reverts commit 7f33cc34ff08b414f8e7f90060889877606b43b2. * correct style * make doctest green --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
This commit is contained in:
committed by
GitHub
parent
65ea1904ff
commit
5ad960f1f4
@@ -173,6 +173,55 @@ your screen, one word at a time:
|
||||
An increasing sequence: one, two, three, four, five, six, seven, eight, nine, ten, eleven,
|
||||
```
|
||||
|
||||
|
||||
## Watermarking
|
||||
|
||||
The `generate()` supports watermarking the generated text by randomly marking a portion of tokens as "green".
|
||||
When generating the "green" will have a small 'bias' value added to their logits, thus having a higher chance to be generated.
|
||||
The watermarked text can be detected by calculating the proportion of "green" tokens in the text and estimating how likely it is
|
||||
statistically to obtain that amount of "green" tokens for human-generated text. This watermarking strategy was proposed in the paper
|
||||
["On the Reliability of Watermarks for Large Language Models"](https://arxiv.org/abs/2306.04634). For more information on
|
||||
the inner functioning of watermarking, it is recommended to refer to the paper.
|
||||
|
||||
The watermarking can be used with any generative model in `tranformers` and does not require an extra classification model
|
||||
to detect watermarked text. To trigger watermarking, pass in a [`WatermarkingConfig`] with needed arguments directly to the
|
||||
`.generate()` method or add it to the [`GenerationConfig`]. Watermarked text can be later detected with a [`WatermarkDetector`].
|
||||
|
||||
|
||||
<Tip warning={true}>
|
||||
|
||||
The WatermarkDetector internally relies on the proportion of "green" tokens, and whether generated text follows the coloring pattern.
|
||||
That is why it is recommended to strip off the prompt text, if it is much longer than the generated text.
|
||||
This also can have an effect when one sequence in the batch is a lot longer causing other rows to be padded.
|
||||
Additionally, the detector **must** be initiated with identical watermark configuration arguments used when generating.
|
||||
|
||||
</Tip>
|
||||
|
||||
Let's generate some text with watermarking. In the below code snippet, we set the bias to 2.5 which is a value that
|
||||
will be added to "green" tokens' logits. After generating watermarked text, we can pass it directly to the `WatermarkDetector`
|
||||
to check if the text is machine-generated (outputs `True` for machine-generated and `False` otherwise).
|
||||
|
||||
```python
|
||||
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, WatermarkDetector, WatermarkingConfig
|
||||
|
||||
>>> model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
|
||||
>>> tok = AutoTokenizer.from_pretrained("openai-community/gpt2")
|
||||
>>> tok.pad_token_id = tok.eos_token_id
|
||||
>>> tok.padding_side = "left"
|
||||
|
||||
>>> inputs = tok(["This is the beginning of a long story", "Alice and Bob are"], padding=True, return_tensors="pt")
|
||||
>>> input_len = inputs["input_ids"].shape[-1]
|
||||
|
||||
>>> watermarking_config = WatermarkingConfig(bias=2.5, seeding_scheme="selfhash")
|
||||
>>> out = model.generate(**inputs, watermarking_config=watermarking_config, do_sample=False, max_length=20)
|
||||
|
||||
>>> detector = WatermarkDetector(model_config=model.config, device="cpu", watermarking_config=watermarking_config)
|
||||
>>> detection_out = detector(out, return_dict=True)
|
||||
>>> detection_out.prediction
|
||||
array([True, True])
|
||||
```
|
||||
|
||||
|
||||
## Decoding strategies
|
||||
|
||||
Certain combinations of the `generate()` parameters, and ultimately `generation_config`, can be used to enable specific
|
||||
|
||||
@@ -209,6 +209,10 @@ generation.
|
||||
[[autodoc]] WhisperTimeStampLogitsProcessor
|
||||
- __call__
|
||||
|
||||
[[autodoc]] WatermarkLogitsProcessor
|
||||
- __call__
|
||||
|
||||
|
||||
### TensorFlow
|
||||
|
||||
[[autodoc]] TFForcedBOSTokenLogitsProcessor
|
||||
@@ -372,3 +376,10 @@ A [`Constraint`] can be used to force the generation to include specific tokens
|
||||
- update
|
||||
- get_seq_length
|
||||
- reorder_cache
|
||||
|
||||
|
||||
## Watermark Utils
|
||||
|
||||
[[autodoc]] WatermarkDetector
|
||||
- __call__
|
||||
|
||||
|
||||
@@ -41,6 +41,8 @@ like token streaming.
|
||||
- validate
|
||||
- get_generation_mode
|
||||
|
||||
[[autodoc]] generation.WatermarkingConfig
|
||||
|
||||
## GenerationMixin
|
||||
|
||||
[[autodoc]] generation.GenerationMixin
|
||||
|
||||
Reference in New Issue
Block a user