Deploy & Predict with Supervisely SDK

This section shows how to use Python code together with the Supervisely SDK to automate deployment and inference in different scenarios and environments. You can deploy your models either inside the Supervisely Platform (on an agent) or outside the platform, directly on your local machine. See the difference in the Overview.

In-Platform Model Deployment

1. Deploy

In-platform deployment is similar to manually launching a Serving App on the Supervisely Platform. With the Python SDK you can automate this.

This method only works for your models trained in Supervisely and stored in Team Files. It also requires Supervisely SDK version 6.73.319 or higher.

Here's how to do it:

  1. Install the Supervisely SDK if it is not already installed.

pip install "supervisely>=6.73.319"
  2. Go to Team Files and copy the path to your model artifacts (artifacts_dir).

  3. Run this code to deploy a model on the platform. Don't forget to fill in your workspace_id and artifacts_dir.

import os
import supervisely as sly
from dotenv import load_dotenv

# Ensure you've set API_TOKEN and SERVER_ADDRESS environment variables.
load_dotenv(os.path.expanduser("~/supervisely.env"))

api = sly.Api()

# ⬇ Put your workspace_id and artifacts_dir.
workspace_id = 123
artifacts_dir = "/experiments/27_Lemons/265_RT-DETRv2/"

# Deploy model
task_id = api.task.deploy_custom_model(workspace_id, artifacts_dir)

2. Predict

Any model deployed on the platform (whether manually or through code) works as a service and can accept API requests for inference. After you have deployed a model on the platform, connect to it and get predictions using the Session class:

from supervisely.nn.inference import Session

# Create an Inference Session
# `api` and `task_id` come from the previous code snippet
session = Session(api, task_id=task_id)

# Predict Image
image_id = 123  # ⬅ put your image_id from a platform
prediction = session.inference_image_id(image_id)

# Predict Project
project_id = 123  # ⬅ put your project_id from a platform
predictions = session.inference_project_id(project_id)
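
The prediction returned for a single image is a sly.Annotation object. As a quick check, here is a minimal sketch (the output file name is just an example) that draws it over the source image:

img = api.image.download_np(image_id)  # source image as an RGB numpy array
prediction.draw_pretty(img)            # draw predicted labels on the image in place
sly.image.write("prediction_vis.jpg", img)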

Deploy outside of Supervisely

In this section you will deploy a model locally on your machine, outside of the Supervisely Platform. You don't get the advantages of the Ecosystem, but you gain more freedom in how the model is used in your code. This is a more advanced variant that can differ slightly from one model to another, because you need to set up the Python environment yourself, but the main code for loading the model and getting predictions stays the same.

There are several ways to use a model locally:

  • Load and Predict in Your Code: load your checkpoint and get predictions in your own code or in a script.

  • Deploy Model as a Server: deploy your model as a server on your machine and interact with it through API requests.

  • Deploy in Docker Container: deploy the model as a server in a Docker container on your local machine.

  • Deploy Model as a Serving App with web UI: deploy the model as a server with a web UI and interact with it through the API. This option is mostly for debugging and testing purposes.

Load and Predict in Your Code

This example shows how to load your checkpoint and get predictions anywhere in your own code. RT-DETRv2 is used in this example, but the instructions are similar for other models.

1. Clone repository

Clone our RT-DETRv2 fork with the model implementation:

git clone https://github.com/supervisely-ecosystem/RT-DETRv2
cd RT-DETRv2

2. Set up environment

Install requirements.txt manually, or use our pre-built Docker image (published on DockerHub; see the Dockerfile in the repository). You also need to install the Supervisely SDK:

pip install -r rtdetrv2_pytorch/requirements.txt
pip install supervisely

3. Download checkpoint

Download your checkpoint and model files from Team Files.
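
You can download the files through the web UI, or script this step with the SDK. Below is a minimal sketch, assuming the files live in Team Files at paths like the ones shown (replace team_id and the remote paths with your own; the local model/ folder matches the example in the next step):

import os
import supervisely as sly
from dotenv import load_dotenv

load_dotenv(os.path.expanduser("~/supervisely.env"))
api = sly.Api()

team_id = 123  # ⬅ put your team_id
# ⬇ example remote paths in Team Files - replace them with yours
api.file.download(team_id, "/experiments/27_Lemons/265_RT-DETRv2/checkpoints/best.pth", "model/best.pth")
api.file.download(team_id, "/experiments/27_Lemons/265_RT-DETRv2/model_meta.json", "model/model_meta.json")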

4. Predict

Create a main.py file in the root of the repository and paste the following code:

import numpy as np
from PIL import Image
import os
import supervisely as sly
from supervisely.nn import ModelSource, RuntimeType

# Be sure you are in the root of the RT-DETRv2 repository
from supervisely_integration.serve.rtdetrv2 import RTDETRv2

# Put your path to image here
IMAGE_PATH = "sample_image.jpg"

# Model config and weights (downloaded from Team Files)
model_files = {
    "checkpoint": "model/rtdetrv2_r18vd_120e_coco_rerun_48.1.pth",
    "config": "model/rtdetrv2_r18vd_120e_coco.yml",
}

# JSON model meta with class names (downloaded from Team Files)
model_meta = sly.io.json.load_json_file("model/model_meta.json")

# Load model
model = RTDETRv2()
model.load_custom_checkpoint(
    model_files=model_files,
    model_meta=model_meta,
    device="cuda",
)

# Load image
image = Image.open(IMAGE_PATH).convert("RGB")
img = np.array(image)

# Predict
ann = model.inference(img, settings={"confidence_threshold": 0.5})

# Draw predictions
ann.draw_pretty(img)
Image.fromarray(img).save("prediction.jpg")

This code will load the model, predict the image, and save the result to prediction.jpg.

If you need to run the code from your own project rather than from the root of the repository, add the path to the repository to PYTHONPATH, or add the following lines at the beginning of the script:

import sys
sys.path.append("/path/to/RT-DETRv2")

Deploy Model as a Server

In this variant, you will deploy a model locally as an API server with the help of the Supervisely SDK. The server will be ready to process API requests for inference. It allows you to predict on local images, folders, and videos, or on remote Supervisely projects and datasets (if you provide your Supervisely API token).

1. Clone repository

Clone our RT-DETRv2 fork with the model implementation:

git clone https://github.com/supervisely-ecosystem/RT-DETRv2
cd RT-DETRv2

2. Set up environment

Install the requirements manually, or use our pre-built Docker image (published on DockerHub; see the Dockerfile in the repository):

pip install -r rtdetrv2_pytorch/requirements.txt
pip install supervisely

3. Download checkpoint (optional)

You can skip this step and pass a remote path to the checkpoint in Team Files.

Download your checkpoint, model files, and experiment_info.json from Team Files, or download the whole artifacts directory.

Place the downloaded files in a folder within the app repository. For example, create a models folder in the repository root and put all the files there (a small SDK sketch for downloading the whole directory follows the layout below).

Your repo should look like this:

📦app-repo-root
 ┣ 📂models
 ┃ ┗ 📂392_RT-DETRv2
 ┃   ┣ 📂checkpoints
 ┃   ┃ ┗ 🔥best.pth
 ┃   ┣ 📜experiment_info.json
 ┃   ┣ 📜model_config.yml
 ┃   ┗ 📜model_meta.json
 ┗ ... other app repository files
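
If you prefer to script this step, here is a minimal sketch that downloads the whole artifacts directory with the SDK (team_id and the remote directory are placeholders; take them from your Team Files):

import os
import supervisely as sly
from dotenv import load_dotenv

load_dotenv(os.path.expanduser("~/supervisely.env"))
api = sly.Api()

team_id = 123  # ⬅ put your team_id
remote_dir = "/experiments/27_Lemons/392_RT-DETRv2/"  # ⬅ artifacts directory in Team Files

# Download the whole artifacts directory into ./models/392_RT-DETRv2
api.file.download_directory(team_id, remote_dir, "models/392_RT-DETRv2")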

4. Deploy

To deploy, use the main.py script to start the server. Pass the path to your checkpoint file or the name of a pretrained model with the --model argument. As in the previous example, add the path to the repository to PYTHONPATH.

PYTHONPATH="${PWD}:${PYTHONPATH}" \
python ./supervisely_integration/serve/main.py deploy \
--model "./models/392_RT-DETRv2/checkpoints/best.pth"

Arguments description

  • mode - (required) mode of operation, either deploy or predict.

  • --model - name of a model from the pretrained models table (see models.json), or a path to your custom checkpoint file (either a local path or a remote path in Team Files). If not provided, the first model from the models table will be loaded.

  • --device - device to run the model on, e.g. cpu or cuda.

  • --runtime - runtime to run the model in: PyTorch, ONNXRuntime, or TensorRT (if supported).

  • --settings - inference settings; either a path to a .json, .yaml, or .yml file, or a list of key-value pairs, e.g. --settings confidence_threshold=0.5.

For a pretrained model, pass the model name:

PYTHONPATH="${PWD}:${PYTHONPATH}" \
python ./supervisely_integration/serve/main.py deploy \
  --model "RT-DETRv2-S" \
  --device cuda \
  --settings confidence_threshold=0.5

For a custom model, use the path to the checkpoint file:

PYTHONPATH="${PWD}:${PYTHONPATH}" \
python ./supervisely_integration/serve/main.py deploy \
  --model "./models/392_RT-DETRv2/checkpoints/best.pth" \
  --device cuda \
  --settings confidence_threshold=0.5

If you are a VS Code user, you can use the following configurations in your launch.json file:

.vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
    {
      "name": "Local Deploy with local directory",
      "type": "debugpy",
      "request": "launch",
      "program": "${workspaceFolder}/supervisely_integration/serve/main.py",
      "console": "integratedTerminal",
      "justMyCode": false,
      "args": [
        "deploy",
        "--model",
        "models/392_RT-DETRv2/checkpoints/best.pth",
      ],
      "env": {
        "PYTHONPATH": "${workspaceFolder}:${PYTHONPATH}",
        "LOG_LEVEL": "DEBUG"
      }
    },
    {
      "name": "Local Deploy with remote directory",
      "type": "debugpy",
      "request": "launch",
      "program": "${workspaceFolder}/supervisely_integration/serve/main.py",
      "console": "integratedTerminal",
      "justMyCode": false,
      "args": [
        "deploy",
        "--model",
        "/experiments/27_Lemons/392_RT-DETRv2/checkpoints/best.pth",
      ],
      "env": {
        "PYTHONPATH": "${workspaceFolder}:${PYTHONPATH}",
        "LOG_LEVEL": "DEBUG",
        "TEAM_ID": "4",
      }
    }
    ]
}

5. Predict

After the model is deployed, use the Supervisely Inference Session API with the server address set to http://0.0.0.0:8000:

import os
from dotenv import load_dotenv
import supervisely as sly

load_dotenv(os.path.expanduser("~/supervisely.env"))
api = sly.Api()

# Create Inference Session
session = sly.nn.inference.Session(api, session_url="http://0.0.0.0:8000")

# local image
pred = session.inference_image_path("image_01.jpg")

# batch of images
pred = session.inference_image_paths(["image_01.jpg", "image_02.jpg"])

# remote image on the platform
pred = session.inference_image_id(17551748)
pred = session.inference_image_ids([17551748, 17551750])

# image url
url = "https://images.unsplash.com/photo-1674552791148-c756b0899dba?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=387&q=80"
pred = session.inference_image_url(url)

Predict with CLI

Instead of using the Session, you can deploy and predict in one command.

PYTHONPATH="${PWD}:${PYTHONPATH}" \
python ./supervisely_integration/serve/main.py predict "./image.jpg"

You can predict both local images and data on the Supervisely platform. By default, predictions are saved to the ./predictions directory; you can change this with the --output argument.

To predict data on the platform, use one of the following arguments:

  • --project_id - id of a Supervisely project to predict. If --upload is passed, a new project with predictions will be created on the platform.

  • --dataset_id - id(s) of Supervisely dataset(s) to predict, e.g. --dataset_id "505,506". If --upload is passed, a new project with predictions will be created on the platform.

  • --image_id - id of a Supervisely image to predict. If --upload is passed, the prediction will be added to the provided image.

You can specify additional settings:

  • --output - a local directory where predictions will be saved.

  • --upload - upload predictions to the platform. Works only with --project_id, --dataset_id, and --image_id.

  • --draw - save images with prediction visualizations to the output directory. Works only with local image input and --image_id.

Example of predicting with CLI arguments:

PYTHONPATH="${PWD}:${PYTHONPATH}" \
python ./supervisely_integration/serve/main.py predict \
  "./image.jpg" \
  --model "RT-DETRv2-S" \
  --device cuda \
  --settings confidence_threshold=0.5

The server will shut down automatically after the prediction is done.

🐋 Deploy in Docker Container

Deploying in a Docker Container is similar to deployment as a Server. This example is useful when you need to run your model on a remote machine or in a cloud environment.

Use this docker run command to deploy a model in a docker container (RT-DETRv2 example):

docker run \
  --shm-size=1g \
  --runtime=nvidia \
  --env-file ~/supervisely.env \
  --env PYTHONPATH=/app \
  -v ".:/app" \
  -w /app \
  -p 8000:8000 \
  supervisely/rt-detrv2:1.0.11 \
  python3 supervisely_integration/serve/main.py deploy \
  --model "/experiments/27_Lemons/392_RT-DETRv2/checkpoints/best.pth"

docker compose

You can also use a docker-compose.yml file for convenience:

services:
  rtdetrv2:
    image: supervisely/rt-detrv2:1.0.11
    shm_size: 1g
    runtime: nvidia
    env_file:
      - ~/supervisely.env # Optional, use only for predictions on the platform
    environment:
      - PYTHONPATH=/app
    volumes:
      - .:/app
    working_dir: /app
    ports:
      - "8000:8000"
    expose:
      - "8000"
    entrypoint: [ "python3", "supervisely_integration/serve/main.py" ]
    command: [ "deploy", "--model", "./models/392_RT-DETRv2/checkpoints/best.pth" ]

Predict

After the model is deployed, you can use the Session object for inference (see the Inference Session API) or use CLI arguments to get predictions.

Deploy and Predict with CLI arguments

You can use the same arguments as in the previous deploy and predict sections when running the Docker container.

Example of deploying the model as a server:

docker run \
  --shm-size=1g \
  --runtime=nvidia \
  --env-file ~/supervisely.env \
  --env PYTHONPATH=/app \
  -v ".:/app" \
  -w /app \
  -p 8000:8000 \
  supervisely/rt-detrv2:1.0.11 \
  python3 supervisely_integration/serve/main.py deploy \
  --model "RT-DETRv2-S"

Example of predicting with CLI arguments:

docker run \
  --shm-size=1g \
  --runtime=nvidia \
  --env-file ~/supervisely.env \
  --env PYTHONPATH=/app \
  -v ".:/app" \
  -w /app \
  -p 8000:8000 \
  supervisely/rt-detrv2:1.0.11 \
  python3 supervisely_integration/serve/main.py \
  predict "./image.jpg" \
  --model "RT-DETRv2-S" \
  --device cuda \
  --settings confidence_threshold=0.5

The container will stop automatically after the prediction is done.

Deploy Model as a Serving App with web UI

Deploy

In this variant, you will run a full Serving App with a web UI, in which you can deploy a model. This is useful for debugging and testing purposes, for example, when you're integrating your custom inference app with the Supervisely Platform.

Follow the steps from the Deploy Model as a Server section, but instead of running the server, run the following command:

uvicorn main:model.app --app-dir supervisely_integration/serve --host 0.0.0.0 --port 8000 --ws websockets

After the app is started, you can open the web UI at http://localhost:8000 and deploy a model through the web interface.

Predict

Use the same Session API to get predictions, with the server address set to http://localhost:8000:

import os
from dotenv import load_dotenv
import supervisely as sly

load_dotenv(os.path.expanduser("~/supervisely.env"))
api = sly.Api()

# Create Inference Session
session = sly.nn.inference.Session(api, session_url="http://localhost:8000")

# local image
pred = session.inference_image_path("image_01.jpg")

# batch of images
pred = session.inference_image_paths(["image_01.jpg", "image_02.jpg"])

# remote image on the platform
pred = session.inference_image_id(17551748)
pred = session.inference_image_ids([17551748, 17551750])

# image url
url = "https://images.unsplash.com/photo-1674552791148-c756b0899dba?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=387&q=80"
pred = session.inference_image_url(url)

Learn more about the Session API in the Inference API Tutorial.
