# Validation Schemas

## Overview

**JSON Schema** validation lets you enforce a consistent structure for image metadata uploaded to a project. You define a **JSON schema** at the project level, and uploads made with validation enabled are checked against it. If the metadata does not match the schema, the upload is rejected with a validation error, which keeps inconsistent or incomplete data out of your project.

## Use Case Example

Imagine you're uploading images to a project where each image needs specific metadata:

* Camera settings (ISO, aperture, shutter speed)
* Location information (GPS coordinates, address)
* Quality metrics (brightness, contrast, sharpness)

Without validation, different team members might upload images with inconsistently structured metadata. With JSON schema validation, all metadata follows the same format.

## Setting Up Validation

### Step 1: Define Your Schema

Create a JSON schema that defines the required structure for your image metadata. For example:

```json
{
  "type": "object",
  "required": ["camera", "location", "quality"],
  "properties": {
    "camera": {
      "type": "object",
      "required": ["iso", "aperture"],
      "properties": {
        "iso": {"type": "number"},
        "aperture": {"type": "string"},
        "shutter_speed": {"type": "string"}
      }
    },
    "location": {
      "type": "object",
      "required": ["lat", "lng"],
      "properties": {
        "lat": {"type": "number"},
        "lng": {"type": "number"},
        "address": {"type": "string"}
      }
    },
    "quality": {
      "type": "object",
      "properties": {
        "brightness": {"type": "number"},
        "contrast": {"type": "number"}
      }
    }
  }
}
```
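Before applying a schema to a project, it can help to try it against sample metadata locally. The sketch below is a deliberately small, standard-library-only checker that understands just the keywords used above (`type`, `required`, `properties`). `find_errors` is a hypothetical helper written for this illustration; it is not how the platform validates, and a full JSON Schema validator library is the better choice for anything beyond a quick check.

```python
# Illustrative checker for "type", "required", and "properties" only.
_TYPES = {"object": dict, "string": str, "number": (int, float), "boolean": bool}


def find_errors(value, schema, path="$"):
    """Return a list of human-readable problems found in `value`."""
    expected = schema.get("type")
    if expected in _TYPES:
        ok = isinstance(value, _TYPES[expected])
        # bool is a subclass of int in Python, but it is not a JSON number
        if expected == "number" and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(find_errors(value[key], sub_schema, f"{path}.{key}"))
    return errors


# The schema from this step, written as a Python dict
schema = {
    "type": "object",
    "required": ["camera", "location", "quality"],
    "properties": {
        "camera": {
            "type": "object",
            "required": ["iso", "aperture"],
            "properties": {
                "iso": {"type": "number"},
                "aperture": {"type": "string"},
                "shutter_speed": {"type": "string"},
            },
        },
        "location": {
            "type": "object",
            "required": ["lat", "lng"],
            "properties": {
                "lat": {"type": "number"},
                "lng": {"type": "number"},
                "address": {"type": "string"},
            },
        },
        "quality": {
            "type": "object",
            "properties": {
                "brightness": {"type": "number"},
                "contrast": {"type": "number"},
            },
        },
    },
}

# Sample metadata with three deliberate problems
sample = {"camera": {"iso": "800"}, "location": {"lat": 37.7749, "lng": -122.4194}}
for problem in find_errors(sample, schema):
    print(problem)
```

The sample is missing `quality` and `camera.aperture`, and `camera.iso` is a string rather than a number, so the checker reports one line per problem.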

### Step 2: Set Schema for Project

Apply the schema to your project using the API:

```python
import supervisely as sly

api = sly.Api.from_env()
project_id = 12345  # replace with your project ID

# `schema` is the JSON schema from Step 1 as a Python dict
# Set validation schema for project
api.project.set_validation_schema(project_id, schema)

# Get current validation schema
current_schema = api.project.get_validation_schema(project_id)
```
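The snippet above expects `schema` to be a Python dict. One way to get there, sketched below, is to keep the schema in a JSON file and load it with the standard library. The file name and the `load_schema` helper are hypothetical, and the shape check is only a basic sanity test, not full schema validation.

```python
import json
import tempfile
from pathlib import Path


def load_schema(path):
    """Read a JSON schema file and run a basic sanity check on its shape."""
    schema = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(schema, dict) or schema.get("type") != "object":
        raise ValueError("expected a JSON schema whose top-level type is 'object'")
    return schema


# Demo: a temporary file stands in for a schema file kept under version control
with tempfile.TemporaryDirectory() as tmp:
    schema_path = Path(tmp) / "image_meta.schema.json"  # hypothetical file name
    schema_path.write_text(
        json.dumps(
            {
                "type": "object",
                "required": ["camera"],
                "properties": {"camera": {"type": "object"}},
            }
        ),
        encoding="utf-8",
    )
    schema = load_schema(schema_path)

# The loaded dict can then be passed on as shown above:
# api.project.set_validation_schema(project_id, schema)
```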

### Step 3: Upload Images with Validation

When uploading images, enable validation to ensure metadata compliance:

```python
# Upload images with validation enabled
dataset_id = 67890  # replace with your dataset ID
image_paths = ["/path/to/image1.jpg", "/path/to/image2.jpg"]
names = ["image1.jpg", "image2.jpg"]
# One metadata dict per image; each includes every field the Step 1 schema requires
metas = [
    {
        "camera": {"iso": 800, "aperture": "f/2.8"},
        "location": {"lat": 37.7749, "lng": -122.4194},
        "quality": {"brightness": 0.7, "contrast": 0.8}
    },
    {
        "camera": {"iso": 400, "aperture": "f/1.8"},
        "location": {"lat": 40.7128, "lng": -74.0060},
        "quality": {"brightness": 0.6, "contrast": 0.9}
    }
]

api.image.upload_paths(
    dataset_id=dataset_id,
    names=names,
    paths=image_paths,
    metas=metas,
    validate_meta=True,  # Enable validation
    use_strict_validation=False  # Optional: strict mode
)
```

Valid metadata example:

```json
{
  "camera": {
    "iso": 800,
    "aperture": "f/2.8",
    "shutter_speed": "1/60"
  },
  "location": {
    "lat": 37.7749,
    "lng": -122.4194,
    "address": "San Francisco, CA"
  },
  "quality": {
    "brightness": 0.7,
    "contrast": 0.8
  }
}
```
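The upload example pairs each entry in `metas` with the image at the same position in `names` and `paths`, so the three lists are assumed to line up one-to-one. A small local check like the sketch below can catch a misaligned list before any request is made. `check_upload_args` is a hypothetical helper for illustration and not part of the SDK.

```python
def check_upload_args(names, paths, metas):
    """Raise if names, paths, and metas cannot be paired one-to-one."""
    if not (len(names) == len(paths) == len(metas)):
        raise ValueError(
            f"length mismatch: {len(names)} names, {len(paths)} paths, {len(metas)} metas"
        )
    for name, meta in zip(names, metas):
        if not isinstance(meta, dict):
            raise TypeError(f"metadata for {name!r} must be a dict, got {type(meta).__name__}")


names = ["image1.jpg", "image2.jpg"]
image_paths = ["/path/to/image1.jpg", "/path/to/image2.jpg"]
metas = [{"camera": {"iso": 800, "aperture": "f/2.8"}}]  # one entry short on purpose

try:
    check_upload_args(names, image_paths, metas)
except ValueError as err:
    message = str(err)
    print(f"Not uploading: {message}")
```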

## Validation Options

### Standard Validation

```python
# Standard validation - allows extra fields
api.image.upload_paths(
    dataset_id=dataset_id,
    names=names,
    paths=image_paths,
    metas=metas,
    validate_meta=True,
    use_strict_validation=False  # Default value
)
```

### Strict Validation

```python
# Strict validation - exact schema match required
api.image.upload_paths(
    dataset_id=dataset_id,
    names=names,
    paths=image_paths,
    metas=metas,
    validate_meta=True,
    use_strict_validation=True  # Set to True for strict validation
)
```
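The practical difference between the two modes is how fields that the schema does not declare are treated. The sketch below lists such fields for a given metadata dict, assuming "extra" means a key that does not appear under `properties` at the same nesting level. `find_extra_fields` is a hypothetical helper, and the platform's own definition of strictness may differ in detail.

```python
def find_extra_fields(meta, schema, prefix=""):
    """Return dotted paths of keys present in `meta` but not declared in `schema`."""
    extras = []
    properties = schema.get("properties", {})
    for key, value in meta.items():
        path = f"{prefix}{key}"
        if key not in properties:
            extras.append(path)
        elif isinstance(value, dict):
            # Descend into nested objects that the schema does declare
            extras.extend(find_extra_fields(value, properties[key], prefix=f"{path}."))
    return extras


schema = {
    "type": "object",
    "properties": {
        "camera": {"type": "object", "properties": {"iso": {"type": "number"}}},
    },
}
meta = {"camera": {"iso": 800, "flash": True}, "photographer": "alice"}

extras = find_extra_fields(meta, schema)
print(extras)
```

Metadata with a non-empty result here is the kind that standard validation tolerates and strict validation is described as rejecting.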

### Optimized Validation with Caching

```python
# Use caching for better performance with multiple uploads
api.image.upload_paths(
    dataset_id=dataset_id,
    names=names,
    paths=image_paths,
    metas=metas,
    validate_meta=True,
    use_caching_for_validation=True  # Schema cached for 1 hour
)
```
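To show what a one-hour schema cache implies, here is a minimal time-based cache. It is a sketch of the general technique only: `SchemaCache` and `fake_fetch` are hypothetical names, and the SDK's internal caching is not documented here and may work differently. The general consequence of any such cache is that a schema changed on the server may not be picked up until the cached copy expires.

```python
import time


class SchemaCache:
    """Tiny time-to-live cache keyed by project ID."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch      # callable: project_id -> schema dict
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be simulated
        self._entries = {}       # project_id -> (expires_at, schema)

    def get(self, project_id):
        now = self._clock()
        entry = self._entries.get(project_id)
        if entry is not None and now < entry[0]:
            return entry[1]      # still fresh: no fetch needed
        schema = self._fetch(project_id)
        self._entries[project_id] = (now + self._ttl, schema)
        return schema


calls = []


def fake_fetch(project_id):
    """Stand-in for a server request; records each call."""
    calls.append(project_id)
    return {"type": "object"}


fake_now = [0.0]
cache = SchemaCache(fake_fetch, ttl_seconds=3600, clock=lambda: fake_now[0])

cache.get(12345)       # first call fetches
cache.get(12345)       # second call is served from the cache
fake_now[0] = 3601.0   # simulate more than an hour passing
cache.get(12345)       # entry expired, so the schema is fetched again
print(f"fetches: {len(calls)}")
```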

## Validating Existing Projects

For projects with existing images, you can validate all current data:

```python
# Validate all existing images in project
validation_result = api.project.validate_entities_schema(project_id)

# Check validation results
if not validation_result:
    print("All entities are valid according to the schema!")
else:
    print(f"Found {len(validation_result)} entities that don't match the schema:")
    
    for entity in validation_result:
        print(f"\nEntity: {entity['entity_name']} (ID: {entity['entity_id']})")
        
        if entity['missing_fields']:
            print(f"  Missing fields: {', '.join(entity['missing_fields'])}")
        
        if entity['extra_fields']:
            print(f"  Extra fields: {', '.join(entity['extra_fields'])}")
```

This process helps you:

* Identify non-compliant images in existing projects
* Get detailed error reports for each failed image
* Fix metadata issues before enforcing strict validation
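When many images fail, a per-field summary is often easier to act on than a per-image list. The sketch below aggregates a result list shaped like the one iterated above (`entity_id`, `entity_name`, `missing_fields`, `extra_fields`). The sample data and the dotted field names in it are made up for illustration, and `summarize` is a hypothetical helper.

```python
from collections import Counter


def summarize(validation_result):
    """Count how often each field is missing or extra across all entities."""
    missing = Counter()
    extra = Counter()
    for entity in validation_result:
        missing.update(entity.get("missing_fields") or [])
        extra.update(entity.get("extra_fields") or [])
    return missing, extra


# Hypothetical result, standing in for api.project.validate_entities_schema(project_id)
sample_result = [
    {"entity_id": 1, "entity_name": "image1.jpg",
     "missing_fields": ["quality"], "extra_fields": []},
    {"entity_id": 2, "entity_name": "image2.jpg",
     "missing_fields": ["quality", "camera.iso"], "extra_fields": ["notes"]},
]

missing, extra = summarize(sample_result)
for field, count in missing.most_common():
    print(f"{field}: missing in {count} image(s)")
for field, count in extra.most_common():
    print(f"{field}: extra in {count} image(s)")
```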

## Benefits

* **Consistency**: All images have the same metadata structure
* **Quality Control**: Prevent incomplete or incorrect metadata uploads
* **Team Coordination**: Everyone follows the same metadata standards
* **Data Integrity**: Maintain clean, structured datasets
* **Error Prevention**: Catch metadata issues at upload time

## Common Schema Patterns

### Simple Required Fields

```json
{
  "type": "object",
  "required": ["timestamp", "source"],
  "properties": {
    "timestamp": {"type": "string"},
    "source": {"type": "string"}
  }
}
```

### Nested Structures

```json
{
  "type": "object",
  "required": ["equipment"],
  "properties": {
    "equipment": {
      "type": "object",
      "required": ["camera_model"],
      "properties": {
        "camera_model": {"type": "string"},
        "lens": {"type": "string"}
      }
    }
  }
}
```

### Optional Fields with Defaults

```json
{
  "type": "object",
  "required": ["image_id"],
  "properties": {
    "image_id": {"type": "string"},
    "quality_checked": {"type": "boolean", "default": false},
    "notes": {"type": "string"}
  }
}
```
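In the JSON Schema specification, `default` is an annotation: validators do not insert it into the data being checked. Whether the platform fills defaults in on upload is not stated here, so if you want them stored, one option is to apply them yourself before uploading. The sketch below does that for the pattern above; `apply_defaults` is a hypothetical helper.

```python
import copy


def apply_defaults(meta, schema):
    """Return a copy of `meta` with schema defaults filled in for absent keys."""
    result = copy.deepcopy(meta)
    for key, sub_schema in schema.get("properties", {}).items():
        if key not in result:
            if "default" in sub_schema:
                result[key] = copy.deepcopy(sub_schema["default"])
        elif isinstance(result[key], dict) and sub_schema.get("type") == "object":
            # Recurse so nested objects get their defaults too
            result[key] = apply_defaults(result[key], sub_schema)
    return result


schema = {
    "type": "object",
    "required": ["image_id"],
    "properties": {
        "image_id": {"type": "string"},
        "quality_checked": {"type": "boolean", "default": False},
        "notes": {"type": "string"},
    },
}

meta = apply_defaults({"image_id": "img-001"}, schema)
print(meta)
```

Here `quality_checked` is added with its default, while `notes` stays absent because the schema declares no default for it.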

## Complete Example

Here's a full workflow example:

```python
import supervisely as sly

# Initialize API
api = sly.Api.from_env()
project_id = 12345
dataset_id = 67890

# 1. Define schema
schema = {
    "type": "object",
    "required": ["camera", "location"],
    "properties": {
        "camera": {
            "type": "object",
            "required": ["iso", "aperture"],
            "properties": {
                "iso": {"type": "number"},
                "aperture": {"type": "string"}
            }
        },
        "location": {
            "type": "object",
            "required": ["lat", "lng"],
            "properties": {
                "lat": {"type": "number"},
                "lng": {"type": "number"}
            }
        }
    }
}

# 2. Set schema for project
api.project.set_validation_schema(project_id, schema)

# 3. Upload images with validation
image_paths = ["/path/to/image1.jpg"]
names = ["image1.jpg"]
metas = [{
    "camera": {"iso": 800, "aperture": "f/2.8"},
    "location": {"lat": 37.7749, "lng": -122.4194}
}]

try:
    api.image.upload_paths(
        dataset_id=dataset_id,
        names=names,
        paths=image_paths,
        metas=metas,
        validate_meta=True
    )
    print("Upload successful - metadata valid!")
except Exception as e:
    print(f"Upload failed: {e}")
```

## Best Practices

* **Start simple**: Begin with basic required fields, add complexity gradually
* **Document your schema**: Include field descriptions and examples
* **Test thoroughly**: Validate your schema with sample data before deployment
* **Version your schemas**: Track changes when updating validation rules
* **Communicate changes**: Inform team members about new validation requirements
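For the versioning practice, one lightweight approach is to derive a short fingerprint from the schema's content and record it alongside change notes. This is a sketch of one possible convention, not a platform feature, and `schema_fingerprint` is a hypothetical helper.

```python
import hashlib
import json


def schema_fingerprint(schema):
    """Return a short, stable hash of a schema dict, independent of key order."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


v1 = {"type": "object", "required": ["timestamp", "source"]}
v2 = {"type": "object", "required": ["timestamp", "source", "operator"]}

print("v1:", schema_fingerprint(v1))
print("v2:", schema_fingerprint(v2))
```

Because the JSON is serialized with sorted keys, two dicts with the same content produce the same fingerprint regardless of how they were written, while any change to the rules produces a different one.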

## Requirements

* Supervisely instance version: 6.12.5 or later
* Supervisely Python SDK: 6.73.228 or later


