[VLMs] use only xxx_token_id for multimodal tokens (#37573)
* use only `xxx_token_id` for multimodal tokens
* update modeling files as well
* fixup
* why fixup doesn't fix modular docstring first?
* janus, need to update configs in the hub still
* last fixup
committed by GitHub
parent 4afd3f4820
commit 2ba6b92a6f
@@ -224,12 +224,9 @@ class GenerationTesterMixin:
         # to crash. On pretrained models this isn't a risk, as they are trained to not generate these tokens.
         if config is not None:
             for key in [
-                "image_token_index",
                 "image_token_id",
-                "video_token_index",
                 "video_token_id",
                 "vision_start_token_id",
-                "audio_token_index",
                 "audio_start_token_id",
                 "audio_end_token_id",
                 "vision_end_token_id",
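For readers unfamiliar with this test helper, the hunk can be read as follows: the mixin looks up each multimodal token key on the config and, per the surrounding comment, keeps the model from generating those tokens. Below is a minimal sketch of that idea, not the actual body of `GenerationTesterMixin`. The function name `collect_multimodal_bad_words`, the `vocab_size` filter, and the `SimpleNamespace` stand-in for a model config are all assumptions made for illustration; only the key names come from the hunk.

```python
from types import SimpleNamespace

# Key names taken from the hunk, keeping only the `*_token_id` spellings.
MULTIMODAL_TOKEN_KEYS = [
    "image_token_id",
    "video_token_id",
    "vision_start_token_id",
    "audio_start_token_id",
    "audio_end_token_id",
    "vision_end_token_id",
]


def collect_multimodal_bad_words(config, vocab_size):
    """Return one single-token "bad word" entry per multimodal token id set on `config`.

    Hypothetical helper: the name and the `vocab_size` filter are assumptions.
    """
    bad_words_ids = []
    if config is None:
        return bad_words_ids
    for key in MULTIMODAL_TOKEN_KEYS:
        token_id = getattr(config, key, None)
        # Skip keys the config does not define and ids outside the text vocabulary.
        if token_id is not None and token_id < vocab_size:
            bad_words_ids.append([token_id])
    return bad_words_ids


# Stand-in config: the legacy `image_token_index` attribute is no longer consulted.
config = SimpleNamespace(image_token_id=32000, video_token_id=32001, image_token_index=7)
print(collect_multimodal_bad_words(config, vocab_size=32100))  # prints [[32000], [32001]]
```

With a single naming scheme, a config that only defines the old `*_token_index` attribute yields no entries, which is presumably why the commit message notes that hub configs still need updating.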