* In assisted decoding, pass model_kwargs to model's forward call
Previously, assisted decoding would ignore any additional kwargs
that it doesn't explicitly handle. This was inconsistent with other
generation methods, which pass the model_kwargs through
prepare_inputs_for_generation and forward the returned dict to the
model's forward call.
The prepare_inputs_for_generation method needs to be amended in all
models, as previously it only kept the last input ID when a past_key_values
was passed.
* Improve variable names in _extend_attention_mask
* Refactor extending token_type_ids into a function
* Replace deepcopy with copy to optimize performance
* Update new persimmon model with llama changes for assisted generation
* Update new mistral model for assisted generation with prepare_inputs_for_generation
* Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
* Fix issues in test_exponential_decay_length_penalty
Fix tests which were broken and add validation of negative scores.
Current test didn't take into account that ExponentialDecayLengthPenalty updates the score inplace, resulting in updates to base tested Tensor.
In addition, the gt assert had empty Tensors due to indexing along the batch dimension.
Test is currently expected to fail to show ExponentialDecayLengthPenalty issues with negative scores
* Fix ExponentialDecayLengthPenalty negative logits issue
In cases where the scores are negative, ExponentialDecayLengthPenalty decreases the score of eos_token_id instead of increasing it.
To fix this issue we compute the penalty of the absolute value and add it to the original score.
* Add examples for ExponentialDecayLengthPenalty
* Fix styling issue in ExponentialDecayLengthPenalty doc
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Style and quality fix
* Fix example outputs
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fix TypeError: Object of type int64 is not JSON serializable
* Convert numpy.float64 and numpy.int64 to float and int for json serialization
* Black reformatted examples/pytorch/token-classification/run_ner_no_trainer.py
* * make style
* Replace python random with torch.rand to enable dynamo.export
* revert changes to flax model code
* Remove unused random import
* Fix torch template
* Move torch.manual_seed(0) to right location
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Don't forget the imports
* Add the imports to tests too
* make fixup
* Refactor tests that depended on get_type_hints
* Better test refactor
* Fix an old hidden bug in the test_keras_fit input creation code
* Fix for the Deit tests
* time to say goodbye, torch 1.7 and 1.8
* clean up torch_int_div
* clean up is_torch_less_than_1_8-9
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* rounding_mode = "floor" instead of // to prevent behavioral change
* add other TODO
* use `torch_int_div` from pytrch_utils
* same for tests
* fix copies
* style
* use relative imports when needed
* Co-authored-by: sgugger <sylvain.gugger@gmail.com>