Validation Schemas

Overview

JSON schema validation lets you enforce a consistent structure for image metadata when uploading images to projects. By defining a JSON schema at the project level, you ensure that uploaded metadata adheres to the required structure. When validation is enabled for an upload and the metadata does not match the schema, the upload is rejected with a validation error, which prevents inconsistent or incomplete data from entering your project.

Use Case Example

Imagine you're uploading images to a project where each image needs specific metadata:

  • Camera settings (ISO, aperture, shutter speed)

  • Location information (GPS coordinates, address)

  • Quality metrics (brightness, contrast, sharpness)

Without validation, different team members might upload images with inconsistently structured metadata. With JSON schema validation, all metadata follows the same format.

Setting Up Validation

Step 1: Define Your Schema

Create a JSON schema that defines the required structure for your image metadata. For example:

{
  "type": "object",
  "required": ["camera", "location", "quality"],
  "properties": {
    "camera": {
      "type": "object",
      "required": ["iso", "aperture"],
      "properties": {
        "iso": {"type": "number"},
        "aperture": {"type": "string"},
        "shutter_speed": {"type": "string"}
      }
    },
    "location": {
      "type": "object",
      "required": ["lat", "lng"],
      "properties": {
        "lat": {"type": "number"},
        "lng": {"type": "number"},
        "address": {"type": "string"}
      }
    },
    "quality": {
      "type": "object",
      "properties": {
        "brightness": {"type": "number"},
        "contrast": {"type": "number"}
      }
    }
  }
}
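
Before applying a schema to a project, it can help to try it against sample metadata locally. The sketch below is a minimal stand-in and not the platform's validator: find_missing_fields is a hypothetical helper that only walks the "required" and "properties" keywords of nested object schemas and returns the dotted paths of required fields that are absent.

```python
def find_missing_fields(schema, meta, prefix=""):
    """Return dotted paths of required fields absent from meta.

    Hypothetical local helper: it only understands "required" and
    "properties" on nested objects, not the full JSON Schema standard.
    """
    missing = []
    for name in schema.get("required", []):
        if name not in meta:
            missing.append(prefix + name)
    for name, subschema in schema.get("properties", {}).items():
        value = meta.get(name)
        if isinstance(value, dict) and subschema.get("type") == "object":
            missing.extend(find_missing_fields(subschema, value, prefix + name + "."))
    return missing


# Shortened version of the schema above
schema = {
    "type": "object",
    "required": ["camera", "location", "quality"],
    "properties": {
        "camera": {
            "type": "object",
            "required": ["iso", "aperture"],
            "properties": {"iso": {"type": "number"}, "aperture": {"type": "string"}},
        },
        "location": {
            "type": "object",
            "required": ["lat", "lng"],
            "properties": {"lat": {"type": "number"}, "lng": {"type": "number"}},
        },
        "quality": {"type": "object", "properties": {}},
    },
}

# Sample metadata with two problems: no "quality" and no camera aperture
sample = {"camera": {"iso": 800}, "location": {"lat": 37.7749, "lng": -122.4194}}
missing = find_missing_fields(schema, sample)
print(missing)
```

For this sample the helper reports quality and camera.aperture. It does not check value types, so a passing result here is a quick sanity check rather than a guarantee that the upload will be accepted.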

Step 2: Set Schema for Project

Apply the schema to your project using the API:

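import json
import supervisely as sly

api = sly.Api.from_env()
project_id = 12345

# Load the Step 1 schema as a Python dict ("schema.json" is a placeholder file name)
with open("schema.json") as f:
    schema = json.load(f)

# Set validation schema for project
api.project.set_validation_schema(project_id, schema)

# Get current validation schema
current_schema = api.project.get_validation_schema(project_id)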

Step 3: Upload Images with Validation

When uploading images, enable validation to ensure metadata compliance:

# Upload images with validation enabled
dataset_id = 67890
image_paths = ["/path/to/image1.jpg", "/path/to/image2.jpg"]
names = ["image1.jpg", "image2.jpg"]
# "quality" is included because the Step 1 schema lists it as required
metas = [
    {
        "camera": {"iso": 800, "aperture": "f/2.8"},
        "location": {"lat": 37.7749, "lng": -122.4194},
        "quality": {"brightness": 0.7, "contrast": 0.8}
    },
    {
        "camera": {"iso": 400, "aperture": "f/1.8"},
        "location": {"lat": 40.7128, "lng": -74.0060},
        "quality": {"brightness": 0.6, "contrast": 0.9}
    }
]

api.image.upload_paths(
    dataset_id=dataset_id,
    names=names,
    paths=image_paths,
    metas=metas,
    validate_meta=True,  # Enable validation
    use_strict_validation=False  # Optional: set to True for strict mode
)

Valid metadata example:

{
  "camera": {
    "iso": 800,
    "aperture": "f/2.8",
    "shutter_speed": "1/60"
  },
  "location": {
    "lat": 37.7749,
    "lng": -122.4194,
    "address": "San Francisco, CA"
  },
  "quality": {
    "brightness": 0.7,
    "contrast": 0.8
  }
}

Validation Options

Standard Validation

# Standard validation - allows extra fields
api.image.upload_paths(
    dataset_id=dataset_id,
    names=names,
    paths=image_paths,
    metas=metas,
    validate_meta=True,
    use_strict_validation=False  # Default value
)

Strict Validation

# Strict validation - exact schema match required
api.image.upload_paths(
    dataset_id=dataset_id,
    names=names,
    paths=image_paths,
    metas=metas,
    validate_meta=True,
    use_strict_validation=True  # Set to True for strict validation
)
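
The difference between the two modes can be pictured with a small local sketch. It assumes, based only on the comments above, that standard mode tolerates fields the schema does not declare while strict mode rejects them. The helpers find_extra_fields and is_valid are hypothetical and not part of the SDK, and the platform's actual rules may differ.

```python
def find_extra_fields(schema, meta, prefix=""):
    """Return dotted paths of fields in meta that the schema does not declare."""
    declared = schema.get("properties", {})
    extra = []
    for name, value in meta.items():
        if name not in declared:
            extra.append(prefix + name)
        elif isinstance(value, dict):
            extra.extend(find_extra_fields(declared[name], value, prefix + name + "."))
    return extra


def is_valid(schema, meta, strict=False):
    """Check top-level required fields; in strict mode also reject extras."""
    has_required = all(name in meta for name in schema.get("required", []))
    if not strict:
        return has_required
    return has_required and not find_extra_fields(schema, meta)


schema = {
    "type": "object",
    "required": ["camera"],
    "properties": {
        "camera": {
            "type": "object",
            "properties": {"iso": {"type": "number"}},
        }
    },
}
# "flash" and "photographer" are not declared in the schema
meta = {"camera": {"iso": 800, "flash": True}, "photographer": "A. Example"}

extras = find_extra_fields(schema, meta)
standard_ok = is_valid(schema, meta, strict=False)
strict_ok = is_valid(schema, meta, strict=True)
print(extras, standard_ok, strict_ok)
```

Under these assumptions the same metadata passes in standard mode and fails in strict mode, because camera.flash and photographer are not declared.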

Optimized Validation with Caching

# Use caching for better performance with multiple uploads
api.image.upload_paths(
    dataset_id=dataset_id,
    names=names,
    paths=image_paths,
    metas=metas,
    validate_meta=True,
    use_caching_for_validation=True  # Schema cached for 1 hour
)
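
The comment above says the schema is cached for one hour. How the SDK implements this is not shown here, so the following is only a generic sketch of a time-based cache. SchemaCache is a hypothetical class, and the fetch callable stands in for whatever call retrieves the schema.

```python
import time


class SchemaCache:
    """Keep fetched schemas for ttl_seconds before fetching again."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch          # callable: project_id -> schema dict
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._entries = {}           # project_id -> (expires_at, schema)

    def get(self, project_id):
        now = self._clock()
        entry = self._entries.get(project_id)
        if entry is not None and now < entry[0]:
            return entry[1]          # still fresh: no fetch needed
        schema = self._fetch(project_id)
        self._entries[project_id] = (now + self._ttl, schema)
        return schema


# Demonstration with a fake fetcher and a fake clock
calls = []
fake_time = [0.0]


def fake_fetch(project_id):
    calls.append(project_id)
    return {"type": "object", "required": ["camera"]}


cache = SchemaCache(fake_fetch, ttl_seconds=3600, clock=lambda: fake_time[0])
cache.get(12345)          # first call fetches
cache.get(12345)          # served from the cache
fake_time[0] = 3601.0     # more than one hour later
cache.get(12345)          # entry expired, fetches again
print(len(calls))
```

A cache like this trades freshness for speed: a schema changed on the server may not be picked up until the cached copy expires.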

Validating Existing Projects

For projects with existing images, you can validate all current data:

# Validate all existing images in project
validation_result = api.project.validate_entities_schema(project_id)

# Check validation results
if not validation_result:
    print("All entities are valid according to the schema!")
else:
    print(f"Found {len(validation_result)} entities that don't match the schema:")
    
    for entity in validation_result:
        print(f"\nEntity: {entity['entity_name']} (ID: {entity['entity_id']})")
        
        if entity['missing_fields']:
            print(f"  Missing fields: {', '.join(entity['missing_fields'])}")
        
        if entity['extra_fields']:
            print(f"  Extra fields: {', '.join(entity['extra_fields'])}")

This process helps you:

  • Identify non-compliant images in existing projects

  • Get detailed error reports for each failed image

  • Fix metadata issues before enforcing strict validation
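
For large projects, a per-field summary is often easier to act on than a per-image listing. The sketch below assumes each entry has the missing_fields and extra_fields keys used in the loop above. summarize_validation is a hypothetical helper, and the sample entries (including the dotted field name) are illustrative, not real output.

```python
from collections import Counter


def summarize_validation(validation_result):
    """Count how often each missing or extra field appears across entities."""
    missing = Counter()
    extra = Counter()
    for entity in validation_result:
        missing.update(entity.get("missing_fields") or [])
        extra.update(entity.get("extra_fields") or [])
    return missing, extra


# Illustrative data shaped like the entries printed above; not real output
sample_result = [
    {"entity_name": "image1.jpg", "entity_id": 1,
     "missing_fields": ["quality"], "extra_fields": []},
    {"entity_name": "image2.jpg", "entity_id": 2,
     "missing_fields": ["quality", "camera.aperture"], "extra_fields": ["notes"]},
]

missing, extra = summarize_validation(sample_result)
for field, count in missing.most_common():
    print(f"missing {field}: {count} image(s)")
for field, count in extra.most_common():
    print(f"extra {field}: {count} image(s)")
```

Sorting by frequency shows which fields to fix first, for example a field that is missing from every image because an older upload script never set it.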

Benefits

  • Consistency: All images have the same metadata structure

  • Quality Control: Prevent incomplete or incorrect metadata uploads

  • Team Coordination: Everyone follows the same metadata standards

  • Data Integrity: Maintain clean, structured datasets

  • Error Prevention: Catch metadata issues at upload time

Common Schema Patterns

Simple Required Fields

{
  "type": "object",
  "required": ["timestamp", "source"],
  "properties": {
    "timestamp": {"type": "string"},
    "source": {"type": "string"}
  }
}

Nested Structures

{
  "type": "object",
  "required": ["equipment"],
  "properties": {
    "equipment": {
      "type": "object",
      "required": ["camera_model"],
      "properties": {
        "camera_model": {"type": "string"},
        "lens": {"type": "string"}
      }
    }
  }
}

Optional Fields with Defaults

{
  "type": "object",
  "required": ["image_id"],
  "properties": {
    "image_id": {"type": "string"},
    "quality_checked": {"type": "boolean", "default": false},
    "notes": {"type": "string"}
  }
}
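
In standard JSON Schema, "default" is an annotation: a validator does not insert the value when the field is absent. Whether the platform fills in defaults is not stated here, so the sketch below assumes it does not and fills them in on the client before upload. apply_defaults is a hypothetical helper, not an SDK function.

```python
import copy


def apply_defaults(schema, meta):
    """Return a copy of meta with schema "default" values filled in."""
    result = copy.deepcopy(meta)
    for name, subschema in schema.get("properties", {}).items():
        if name not in result and "default" in subschema:
            result[name] = copy.deepcopy(subschema["default"])
        elif isinstance(result.get(name), dict):
            # Recurse into nested objects that are already present
            result[name] = apply_defaults(subschema, result[name])
    return result


schema = {
    "type": "object",
    "required": ["image_id"],
    "properties": {
        "image_id": {"type": "string"},
        "quality_checked": {"type": "boolean", "default": False},
        "notes": {"type": "string"},
    },
}

meta = {"image_id": "img-001"}
filled = apply_defaults(schema, meta)
print(filled)
```

The helper returns a new dict and leaves the input untouched. Fields without a default, such as notes, stay absent.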

Complete Example

Here's a full workflow example:

import supervisely as sly

# Initialize API
api = sly.Api.from_env()
project_id = 12345
dataset_id = 67890

# 1. Define schema
schema = {
    "type": "object",
    "required": ["camera", "location"],
    "properties": {
        "camera": {
            "type": "object",
            "required": ["iso", "aperture"],
            "properties": {
                "iso": {"type": "number"},
                "aperture": {"type": "string"}
            }
        },
        "location": {
            "type": "object",
            "required": ["lat", "lng"],
            "properties": {
                "lat": {"type": "number"},
                "lng": {"type": "number"}
            }
        }
    }
}

# 2. Set schema for project
api.project.set_validation_schema(project_id, schema)

# 3. Upload images with validation
image_paths = ["/path/to/image1.jpg"]
names = ["image1.jpg"]
metas = [{
    "camera": {"iso": 800, "aperture": "f/2.8"},
    "location": {"lat": 37.7749, "lng": -122.4194}
}]

try:
    api.image.upload_paths(
        dataset_id=dataset_id,
        names=names,
        paths=image_paths,
        metas=metas,
        validate_meta=True
    )
    print("Upload successful - metadata valid!")
except Exception as e:
    print(f"Upload failed: {e}")

Best Practices

  • Start simple: Begin with basic required fields, add complexity gradually

  • Document your schema: Include field descriptions and examples

  • Test thoroughly: Validate your schema with sample data before deployment

  • Version your schemas: Track changes when updating validation rules

  • Communicate changes: Inform team members about new validation requirements
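
One lightweight way to track schema changes is to record a fingerprint of each version alongside your change notes. The sketch below is one possible approach, not a platform feature. schema_fingerprint is a hypothetical helper that hashes a canonical JSON rendering of the schema.

```python
import hashlib
import json


def schema_fingerprint(schema):
    """Return a short, stable hash of a schema dict."""
    # sort_keys and fixed separators make the text independent of key order
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


v1 = {"type": "object", "required": ["timestamp", "source"]}
v1_reordered = {"required": ["timestamp", "source"], "type": "object"}
v2 = {"type": "object", "required": ["timestamp", "source", "operator"]}

same = schema_fingerprint(v1) == schema_fingerprint(v1_reordered)
changed = schema_fingerprint(v1) != schema_fingerprint(v2)
print(same, changed)
```

Key order does not affect the fingerprint, but the order of items inside a list such as "required" does, so keep list order stable between versions.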

Requirements

  • Supervisely instance version: 6.12.5 or later

  • Supervisely Python SDK: 6.73.228 or later
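
A script can check the SDK requirement before it attempts a validated upload. The sketch below compares dotted version strings numerically. meets_minimum is a hypothetical helper that assumes purely numeric components, so a version with a pre-release suffix would raise a ValueError.

```python
def meets_minimum(installed, minimum):
    """Compare dotted numeric versions such as "6.73.228"."""
    def parse(version):
        return tuple(int(part) for part in version.split("."))
    return parse(installed) >= parse(minimum)


MIN_SDK_VERSION = "6.73.228"  # from the requirements above

exact = meets_minimum("6.73.228", MIN_SDK_VERSION)
older = meets_minimum("6.73.99", MIN_SDK_VERSION)
newer = meets_minimum("6.74.0", MIN_SDK_VERSION)
print(exact, older, newer)
```

Comparing integer tuples matters here: as plain strings, "6.73.99" would sort after "6.73.228". The installed version can be read with importlib.metadata.version("supervisely") from the standard library.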
