fix multi-image case for llava-onevision (#38084)
* _get_padding_size module
* do not patchify images when processing multi image
* modify llava onevision image processor fast
* tensor to list of tensors
* backward compat
* reuse pad_to_square in llava & some clarification
* add to doc
* fix: consider no image cases (text only or video)
* add integration test
* style & repo_consistency
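
The `_get_padding_size` and `pad_to_square` items above concern padding a non-square image up to a square canvas. As a rough illustration only, the sketch below computes per-side padding for a given height and width. `square_padding` is a hypothetical helper, not the function touched by this commit, and the centering rule (odd remainder goes to the bottom/right) is an assumption.

```python
from __future__ import annotations


def square_padding(height: int, width: int) -> tuple[tuple[int, int], tuple[int, int]]:
    """Return ((top, bottom), (left, right)) padding that makes the image square.

    Hypothetical helper for illustration; the centering rule is an assumption.
    """
    side = max(height, width)
    # divmod splits the missing pixels in two; an odd leftover pixel
    # is added to the bottom (or right) edge.
    pad_y, rem_y = divmod(side - height, 2)
    pad_x, rem_x = divmod(side - width, 2)
    return (pad_y, pad_y + rem_y), (pad_x, pad_x + rem_x)
```

For example, `square_padding(300, 200)` pads only the width, 50 pixels on each side, while an already square input needs no padding at all.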
@@ -233,7 +233,7 @@ class ImageProcessingTestMixin:
 avg_time = sum(sorted(all_times[:3])) / 3.0
 return avg_time

-dummy_images = torch.randint(0, 255, (4, 3, 224, 224), dtype=torch.uint8)
+dummy_images = [torch.randint(0, 255, (3, 224, 224), dtype=torch.uint8) for _ in range(4)]
 image_processor_slow = self.image_processing_class(**self.image_processor_dict)
 image_processor_fast = self.fast_image_processing_class(**self.image_processor_dict)
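
The hunk swaps a single batched tensor of shape `(4, 3, 224, 224)` for a list of four `(3, 224, 224)` tensors. The commit message gives no rationale beyond "tensor to list of tensors", so the sketch below is an illustration of the practical difference between the two input forms, not a statement of the author's motivation. The variable names `batched`, `as_list`, `split`, and `mixed` are made up for this example.

```python
import torch

# Batched form (old test input): one 4D tensor, so every image shares a shape.
batched = torch.randint(0, 255, (4, 3, 224, 224), dtype=torch.uint8)

# List form (new test input): four independent 3D tensors.
as_list = [torch.randint(0, 255, (3, 224, 224), dtype=torch.uint8) for _ in range(4)]

# A batched tensor can always be split into a list along dim 0.
split = list(torch.unbind(batched, dim=0))

# A list, however, may hold images of different sizes, which cannot be stacked.
mixed = [
    torch.zeros(3, 224, 224, dtype=torch.uint8),
    torch.zeros(3, 300, 200, dtype=torch.uint8),
]
try:
    torch.stack(mixed)
    stackable = True
except RuntimeError:
    # torch.stack requires all tensors to have the same shape.
    stackable = False
```

When all images do share a shape, `torch.stack(as_list)` recovers the batched layout, so the list form is the more general of the two.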