diff --git a/docs/source/en/tasks/object_detection.md b/docs/source/en/tasks/object_detection.md
index 6f53655e7c..2513591f54 100644
--- a/docs/source/en/tasks/object_detection.md
+++ b/docs/source/en/tasks/object_detection.md
@@ -139,6 +139,17 @@ To get an even better understanding of the data, visualize an example in the dat
 ...     box = annotations["bbox"][i]
 ...     class_idx = annotations["category"][i]
 ...     x, y, w, h = tuple(box)
+...     # Check if coordinates are normalized or not
+...     if max(box) > 1.0:
+...         # Coordinates are un-normalized, no need to re-scale them
+...         x1, y1 = int(x), int(y)
+...         x2, y2 = int(x + w), int(y + h)
+...     else:
+...         # Coordinates are normalized, re-scale them
+...         x1 = int(x * width)
+...         y1 = int(y * height)
+...         x2 = int((x + w) * width)
+...         y2 = int((y + h) * height)
-...     draw.rectangle((x, y, x + w, y + h), outline="red", width=1)
-...     draw.text((x, y), id2label[class_idx], fill="white")
+...     draw.rectangle((x1, y1, x2, y2), outline="red", width=1)
+...     draw.text((x1, y1), id2label[class_idx], fill="white")
 
@@ -153,7 +164,7 @@ To visualize the bounding boxes with associated labels, you can get the labels f
 the `category` field.
 You'll also want to create dictionaries that map a label id to a label class (`id2label`) and the other way around (`label2id`).
 You can use them later when setting up the model. Including these maps will make your model reusable by others if you share
-it on the Hugging Face Hub.
+it on the Hugging Face Hub. Note that the drawing code above assumes the bounding boxes are in `XYWH` format (the x and y coordinates of the top-left corner, followed by the width and height of the box). It might not work for other formats such as `(x1, y1, x2, y2)`.
 
 As a final step of getting familiar with the data, explore it for potential issues. One common problem with datasets for
 object detection is bounding boxes that "stretch" beyond the edge of the image. Such "runaway" bounding boxes can raise
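
Review note: the coordinate handling this patch adds can be checked outside the docs snippet. The sketch below is not part of the patch. `xywh_to_corners` and `xyxy_to_xywh` are hypothetical helper names, and `width` and `height` are assumed to be the image size in pixels. It uses the same heuristic as the patch (any value above 1.0 means pixel units), which can misclassify a very small pixel-space box whose values are all at most 1.

```python
def xywh_to_corners(box, width, height):
    """Convert an (x, y, w, h) box to integer pixel corners (x1, y1, x2, y2).

    Hypothetical helper mirroring the patch's heuristic: if any value
    exceeds 1.0 the box is treated as pixel units, otherwise as
    normalized to the [0, 1] range and scaled by the image size.
    """
    x, y, w, h = box
    if max(box) > 1.0:
        # Pixel units: only convert width/height into a second corner.
        return int(x), int(y), int(x + w), int(y + h)
    # Normalized units: scale by the image dimensions first.
    return (
        int(x * width),
        int(y * height),
        int((x + w) * width),
        int((y + h) * height),
    )


def xyxy_to_xywh(box):
    """Convert an (x1, y1, x2, y2) box to (x, y, w, h) so it can be drawn
    with code that expects the XYWH layout."""
    x1, y1, x2, y2 = box
    return x1, y1, x2 - x1, y2 - y1


pixel_corners = xywh_to_corners((10, 20, 30, 40), width=100, height=200)
normalized_corners = xywh_to_corners((0.25, 0.5, 0.5, 0.25), width=100, height=200)
```

A dataset that stores corner-format boxes could pass them through `xyxy_to_xywh` before running the drawing loop, which keeps the loop itself unchanged.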