Building a Dataset: Object Detection


Overview

After determining that an Object Detector is the right solution for your use case, you'll need to collect and prepare your training data. This guide walks you through the process of building an effective dataset for object detection, using the example of detecting workers without proper protective equipment.

Step 1: Define Your Detection Classes

First, clearly identify what objects your model needs to detect. These classes should be specific and relevant to your use case.

Example: For detecting PPE compliance, you might define classes such as:

  • Person wearing high-visibility vest
  • Person without high-visibility vest

Detection accuracy depends on both your training data and image quality. Larger objects (like vests) are generally easier to detect than smaller items (like safety glasses), so take object size and image resolution into account when defining your classes.
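
As a rough illustration, the two example classes can be written down once and reused everywhere so the names stay consistent. The identifiers below (CLASS_NAMES, build_categories) and the exact label strings are hypothetical; the "id"/"name" layout follows the COCO category convention.

```python
# Hypothetical label strings based on the PPE example above.
# Keep the order stable: the IDs are derived from list position.
CLASS_NAMES = [
    "person_with_hi_vis_vest",
    "person_without_hi_vis_vest",
]


def build_categories(names):
    """Return COCO-style category entries with 1-based IDs."""
    return [{"id": i, "name": name} for i, name in enumerate(names, start=1)]


categories = build_categories(CLASS_NAMES)
```

Defining the list in one place makes it harder for spelling variants of the same class to slip into the annotations later.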

Step 2: Collect Representative Data

Your dataset must accurately represent the real-world scenarios in which your model will operate.

Data Collection Considerations:

  • Lighting conditions: Include images with various lighting (bright, dim, natural, artificial)
  • Image quality: Gather images at the resolution your deployed model will use
  • Camera angles: Include multiple perspectives and distances
  • Backgrounds: Capture diverse environments where detection will occur
  • Negative examples: Include images without the target objects
  • Mixed scenarios: Include images containing both compliant and non-compliant examples

For effective PPE compliance detection, collect images showing people both with and without high-visibility vests in environments where your system will operate.
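
One way to check coverage against the considerations above is to keep a small record per image and count how many images fall under each condition. This is only a sketch: the record fields ("lighting", "target") and the helper names are assumptions for illustration, not part of any tool described here.

```python
from collections import Counter


def coverage_report(records, fields):
    """Count how many images fall under each value of each field."""
    return {
        field: Counter(record.get(field, "unknown") for record in records)
        for field in fields
    }


def find_gaps(report, min_count):
    """List (field, value, count) entries with fewer than min_count images."""
    return [
        (field, value, count)
        for field, counts in report.items()
        for value, count in counts.items()
        if count < min_count
    ]


# Hypothetical per-image notes recorded during collection.
records = [
    {"file": "a.jpg", "lighting": "bright", "target": "present"},
    {"file": "b.jpg", "lighting": "bright", "target": "present"},
    {"file": "c.jpg", "lighting": "dim", "target": "absent"},
]

report = coverage_report(records, ["lighting", "target"])
gaps = find_gaps(report, min_count=2)
for field, value, count in gaps:
    print(f"Only {count} image(s) with {field} = {value}")
```

Conditions that show up in the gaps list are candidates for another round of collection.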

Step 3: Organize Your Dataset

Structure your data into two main folders:

  • Training images (80%): Used to teach the model

  • Testing images (20%): Used to evaluate model performance

Unlike image classification, you don't need to manually create a separate validation folder. Navigator automatically creates validation data using the "Validation Split" setting in the Object Detection Trainer Element. The recommended starting value is 20%.
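
The 80/20 split can be done by hand, but a short script makes it repeatable. The sketch below assumes all collected images sit in one source folder; the output folder names ("train", "test") and the function names are illustrative choices, not names required by Navigator.

```python
import random
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def split_files(files, train_fraction=0.8, seed=42):
    """Shuffle deterministically and return (train, test) lists."""
    files = sorted(files)
    random.Random(seed).shuffle(files)
    cut = int(len(files) * train_fraction)
    return files[:cut], files[cut:]


def copy_split(source_dir, output_dir, train_fraction=0.8, seed=42):
    """Copy images from source_dir into output_dir/train and output_dir/test."""
    source = Path(source_dir)
    images = [p for p in source.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES]
    train, test = split_files(images, train_fraction, seed)
    for name, subset in (("train", train), ("test", test)):
        target = Path(output_dir) / name
        target.mkdir(parents=True, exist_ok=True)
        for path in subset:
            shutil.copy2(path, target / path.name)
    return len(train), len(test)
```

A fixed seed keeps the split identical between runs, so test images never drift into the training set when the script is rerun.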

Step 4: Annotate Your Images

Annotation tells the model exactly what to look for in each image.

Inside your Training images folder, create a subfolder named "annotations".

Annotation Process:

  1. Use an annotation tool that supports bounding boxes and exports to COCO JSON format (RectLabel Pro is recommended for Mac users)

  2. Draw precise bounding boxes around each instance of your defined classes

  3. Keep bounding boxes tight to the edges of objects to help the model learn accurate boundaries

  4. Label each box with the appropriate class name
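
For orientation, this is roughly what a COCO JSON export contains for one annotated image. The file name, image size, box coordinates, and label strings are made up for illustration, and your annotation tool may write additional fields. In COCO, each bounding box is stored as [x, y, width, height] in pixels, measured from the top-left corner of the image.

```python
import json


def make_annotation(ann_id, image_id, category_id, x, y, width, height):
    """Build one COCO annotation; bbox is [x, y, width, height] in pixels."""
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x, y, width, height],
        "area": width * height,
        "iscrowd": 0,
    }


# Hypothetical image and labels, used only to show the structure.
dataset = {
    "images": [
        {"id": 1, "file_name": "site_001.jpg", "width": 1280, "height": 720},
    ],
    "categories": [
        {"id": 1, "name": "person_with_hi_vis_vest"},
        {"id": 2, "name": "person_without_hi_vis_vest"},
    ],
    "annotations": [
        make_annotation(1, 1, 1, x=412, y=150, width=180, height=420),
        make_annotation(2, 1, 2, x=800, y=200, width=160, height=400),
    ],
}

coco_json = json.dumps(dataset, indent=2)
print(coco_json)
```

Each annotation points to its image through "image_id" and to its class through "category_id", which is why consistent class definitions matter.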

Best Practices:

  • Be consistent with your annotation style across all images
  • Include all instances of your target objects in each image
  • Ensure annotations are accurate at object boundaries
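
Some of these checks can be automated before training. The sketch below assumes the annotations have been loaded from a COCO JSON file into a Python dict; validate_coco is a hypothetical helper, not part of any annotation tool. It flags boxes with unknown classes, non-positive size, or coordinates outside the image, and lists images that have no boxes at all.

```python
def validate_coco(dataset):
    """Return (problems, unannotated_image_ids) for a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in dataset.get("images", [])}
    category_ids = {cat["id"] for cat in dataset.get("categories", [])}
    annotated = set()

    for ann in dataset.get("annotations", []):
        label = f"annotation {ann.get('id')}"
        image = images.get(ann.get("image_id"))
        if image is None:
            problems.append(f"{label}: unknown image_id")
            continue
        annotated.add(image["id"])
        if ann.get("category_id") not in category_ids:
            problems.append(f"{label}: unknown category_id")
        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"{label}: box has non-positive size")
        elif x < 0 or y < 0 or x + w > image["width"] or y + h > image["height"]:
            problems.append(f"{label}: box extends outside the image")

    # Images without boxes may be intentional negative examples.
    unannotated = sorted(set(images) - annotated)
    return problems, unannotated


# Hypothetical data with one valid box and one faulty box.
sample = {
    "images": [
        {"id": 1, "width": 100, "height": 100},
        {"id": 2, "width": 100, "height": 100},
    ],
    "categories": [{"id": 1, "name": "person_with_hi_vis_vest"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 10, 20, 20]},
        {"id": 2, "image_id": 1, "category_id": 9, "bbox": [90, 90, 20, 20]},
    ],
}

problems, unannotated = validate_coco(sample)
```

Images reported as unannotated are not necessarily errors, since negative examples contain no target objects, but the list is worth reviewing for images that were simply missed.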

Once your training images are properly annotated, you're ready to proceed to training your Object Detector model.

Next Steps

After completing these steps, proceed to the Object Detector training process to build your model using this carefully prepared dataset.