No more Tuple, List, Dict (#38797)

* No more Tuple, List, Dict

* make fixup

* More style fixes

* Docstring fixes with regex replacement

* Trigger tests

* Redo fixes after rebase

* Fix copies

* [test all]

* update

* [test all]

* update

* [test all]

* make style after rebase

* Patch the hf_argparser test

* Patch the hf_argparser test

* style fixes

* style fixes

* style fixes

* Fix docstrings in Cohere test

* [test all]

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
This commit is contained in:
Matt
2025-06-17 19:37:18 +01:00
committed by GitHub
parent a396f4324b
commit 508a704055
1291 changed files with 14906 additions and 14941 deletions

View File

@@ -170,7 +170,7 @@ Unlike other data collators, this specific data collator needs to apply a differ
... processor: AutoProcessor
... padding: Union[bool, str] = "longest"
... def __call__(self, features: List[Dict[str, Union[List[int], torch.Tensor]]]) -> Dict[str, torch.Tensor]:
... def __call__(self, features: list[dict[str, Union[list[int], torch.Tensor]]]) -> dict[str, torch.Tensor]:
... # split inputs and labels since they have to be of different lengths and need
... # different padding methods
... input_features = [{"input_values": feature["input_values"][0]} for feature in features]

View File

@@ -243,7 +243,7 @@ and it uses the exact same dataset as an example. Apply some geometric and color
... )
```
The `image_processor` expects the annotations to be in the following format: `{'image_id': int, 'annotations': List[Dict]}`,
The `image_processor` expects the annotations to be in the following format: `{'image_id': int, 'annotations': list[Dict]}`,
where each dictionary is a COCO object annotation. Let's add a function to reformat annotations for a single example:
```py
@@ -252,9 +252,9 @@ The `image_processor` expects the annotations to be in the following format: `{'
... Args:
... image_id (str): image id. e.g. "0001"
... categories (List[int]): list of categories/class labels corresponding to provided bounding boxes
... areas (List[float]): list of corresponding areas to provided bounding boxes
... bboxes (List[Tuple[float]]): list of bounding boxes provided in COCO format
... categories (list[int]): list of categories/class labels corresponding to provided bounding boxes
... areas (list[float]): list of corresponding areas to provided bounding boxes
... bboxes (list[tuple[float]]): list of bounding boxes provided in COCO format
... ([center_x, center_y, width, height] in absolute coordinates)
... Returns:
@@ -397,7 +397,7 @@ Intermediate format of boxes used for training is `YOLO` (normalized) but we wil
... Args:
... boxes (torch.Tensor): Bounding boxes in YOLO format
... image_size (Tuple[int, int]): Image size in format (height, width)
... image_size (tuple[int, int]): Image size in format (height, width)
... Returns:
... torch.Tensor: Bounding boxes in Pascal VOC format (x_min, y_min, x_max, y_max)

View File

@@ -408,7 +408,7 @@ instructs the model to ignore that part of the spectrogram when calculating the
... class TTSDataCollatorWithPadding:
... processor: Any
... def __call__(self, features: List[Dict[str, Union[List[int], torch.Tensor]]]) -> Dict[str, torch.Tensor]:
... def __call__(self, features: list[dict[str, Union[list[int], torch.Tensor]]]) -> dict[str, torch.Tensor]:
... input_ids = [{"input_ids": feature["input_ids"]} for feature in features]
... label_features = [{"input_values": feature["labels"]} for feature in features]
... speaker_features = [feature["speaker_embeddings"] for feature in features]