[`PixtralLarge`] Update Pixtral conversion script to support large format! (#34801)

* update conversion script

* update for bias again

* remove pdv

* use my dir

* Update how we initialize the tokenizer

* Convert in bfloat16

* Undo that one again

* fix config dump

* .to() was broken for BatchMixFeature

* quick debug breakpoint

* put the breakpoint in the right place

* Add a config flag for the multimodal projector bias

* Add a config flag for the multimodal projector bias

* Conversion script can load chat templates

* Indent config for comparison

* Stop clobbering the config

* Re-enable the config clobber

* Get rid of the config manual save - it has no effect!

* Handle adapter bias correctly

* Default vision transformer activation to silu

* Remove legacy processing path

* One commit with all the debug breakpoints before I delete them all, in case I need to revert

* Update conversion

* Remove vLLM debugging instrumentation

* Drop xformers

* Remove debug enumerates

* make fixup

* make fixup

* Break copied from in pixtral

* Propagate multimodal_projector_bias change

* Propagate multimodal_projector_bias change

* Remove debug device .to()

* Restore attention weights output

* Fix Pixtral test

* Drop image_seq_length

* Drop image_seq_length

* Put the legacy processing code back

* Add the bias option to the llava_next_video config

* Add the bias option to the llava_next_video config

* Make certain args required in converter

* Make certain args required in converter

* typo

* make fixup

* Reverting some dtype changes since it seems to work without them

---------

Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-166-244.ec2.internal>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

Arthur, 2025-01-08 17:39:47 +01:00, committed by GitHub

parent 4c2c12b3de

commit 3f483beab9

16 changed files with 199 additions and 114 deletions

									
tests/models/pixtral/test_processor_pixtral.py
@@ -253,7 +253,7 @@ class PixtralProcessorTest(ProcessorTesterMixin, unittest.TestCase):
             "USER: [IMG]\nWhat's the content of the image? ASSISTANT:",
         ] * 5
         processor.tokenizer.pad_token = "</s>"
-        image_inputs = [self.image_0] * 5
+        image_inputs = [[self.image_0]] * 5
         # Make small for checking image token expansion
         processor.image_processor.size = {"longest_edge": 30}
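
The hunk above changes the batched test input from a flat list of images to one list of images per prompt. A minimal sketch of that shape difference follows. It assumes one image per prompt; `nest_images_per_prompt` is a hypothetical helper that is not part of transformers, and placeholder strings stand in for real image objects.

```python
from typing import Any, List, Sequence


def nest_images_per_prompt(images: Sequence[Any], prompts: Sequence[str]) -> List[List[Any]]:
    """Hypothetical helper: wrap a flat list of images so each prompt gets its own list."""
    if len(images) != len(prompts):
        raise ValueError("expected exactly one image per prompt")
    return [[image] for image in images]


# Same prompt repeated five times, as in the test above.
prompts = ["USER: [IMG]\nWhat's the content of the image? ASSISTANT:"] * 5

# Placeholder strings stand in for real image objects (assumption for illustration).
flat_images = ["image_0"] * 5                                  # old shape: [img, img, ...]
nested_images = nest_images_per_prompt(flat_images, prompts)   # new shape: [[img], [img], ...]
```

The nested form makes the image-to-prompt grouping explicit, which matters once a prompt can carry more than one image.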
