Custom Inference
Overview
In this guide, you'll learn how to build a custom Serving App using the Supervisely SDK. By integrating your own model, you'll be able to deploy it on the Supervisely platform (or externally). In other words, you'll transform your model into a serving app that's ready to be used in production.
Key Features:
- Easily Serve Your Model: Run inference on your own data through the Supervisely platform or locally.
- Customize Your Solution: Extend a Supervisely SDK class and implement the core methods needed for your custom inference solution.
- Debug and Release: Test locally, debug quickly, and deploy your app for production use.
Step-by-Step Implementation
To integrate your custom model into the Supervisely platform, follow these steps:
Step 1. Prepare Model Configurations: Create a `models.json` file with model configurations and checkpoints.
Step 2. Prepare Inference Settings: Create an `inference_settings.yaml` file to define a set of parameters used for inference.
Step 3. Prepare App Options: Create an `app_options.yaml` file to specify additional options for your app.
Step 4. Create Inference Class: Create a Python file that contains your custom inference class.
Step 5. Implement Required Methods: Implement the `load_model` and `predict` methods.
Step 6. Create Main Script: Create an entrypoint Python script to run and serve your model.
Implementation Example
Step 1. Prepare Model Configurations
If you plan to use pretrained checkpoints (e.g., pretrained YOLO checkpoints), you need to create a `models.json` file containing model configurations and weights. This JSON file consists of a list of dictionaries, each detailing a specific model and its checkpoint. The information from this file will populate a table in your app's GUI, allowing users to select a model for inference.
If you only plan to use checkpoints trained in Supervisely with your Custom Training App, you don't need to create this file.
Example `models.json`:
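As a sketch, a file with a single model entry could look like the snippet below. The display columns (`Model`, `Size (pixels)`, `mAP`) and the checkpoint URL are illustrative placeholders, while the `meta` structure follows the required fields described below:

```json
[
  {
    "Model": "YOLO-small",
    "Size (pixels)": "640",
    "mAP": "37.3",
    "meta": {
      "task_type": "object detection",
      "model_name": "yolo-small",
      "model_files": {
        "checkpoint": "https://example.com/checkpoints/yolo_small.pt",
        "config": "configs/yolo_small.yaml"
      }
    }
  }
]
```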
Example GUI preview:
Table Fields
Each dictionary item in `models.json` represents a single model as a row, with all its fields, except for the `meta` field, acting as columns. You can customize these fields to display the necessary information about your checkpoints.
Technical Field (`meta`)
Each model configuration must have a `meta` field. This field is not displayed in the table but contains essential information required by the `Inference` class to properly download checkpoints and load the model for inference.
Here are the required fields:
- (required) `task_type`: A computer vision task type (e.g., object detection).
- (required) `model_name`: Model configuration name.
- (required) `model_files`: A dict with files needed to load the model, such as model weights and a config file. You can extend it with additional files if needed.
- (required) `checkpoint`: Path or URL to the model checkpoint. URLs will be downloaded automatically.
- (optional) `config`: Path to the model configuration file.
- (optional) Any additional files required by your model can be added to the `model_files` dictionary.
Step 2. Prepare Inference Settings
Create an `inference_settings.yaml` file to define a set of parameters used for inference.
Example `inference_settings.yaml`:
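The exact parameters are up to you: whatever you put here is passed to your `predict` method as the `settings` dict. A minimal sketch for a detection model (the parameter names are illustrative):

```yaml
# Illustrative parameters — define whatever your predict() implementation actually reads
confidence_threshold: 0.25
iou_threshold: 0.45
```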
Step 3. Prepare App Options
By default, the inference app supports two sources of model checkpoints: pretrained checkpoints listed in `models.json` and custom checkpoints trained in Supervisely. If you don't plan to support both, you can disable one in the `app_options.yaml` file.
The `app_options.yaml` file allows you to customize your app. You can enable or disable the pretrained models tab, the custom models tab, and specify supported runtimes, which let users choose a runtime for inference (such as ONNXRuntime or TensorRT).
Example `app_options.yaml`:
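For example, an app that keeps both model sources enabled and adds ONNX support could use (the available keys are listed below):

```yaml
pretrained_models: true
custom_models: true
supported_runtimes: ["pytorch", "onnx"]
```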
Available options:
- `pretrained_models` – Enables the pretrained models tab in the GUI. These are the checkpoints provided in `models.json`. (Default: `True`)
- `custom_models` – Enables the custom models tab in the GUI. These are the checkpoints trained in Supervisely using a corresponding training app. (Default: `True`)
- `supported_runtimes` – Defines a list of runtimes the app supports. Available runtimes: `pytorch`, `onnx`, `tensorrt`. (Default: `["pytorch"]`)
Step 4. Create Inference Class
Create a Python file (e.g., `src/custom_yolo.py`) that contains your custom inference class implementation.
Example `custom_yolo.py`:
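The concrete implementation depends on your framework; a schematic skeleton of the file is sketched below (the `CustomYOLO` name, the file paths, and the `predict` signature are illustrative assumptions, and the method bodies are filled in Step 5):

```python
import supervisely as sly


class CustomYOLO(sly.nn.inference.ObjectDetection):
    FRAMEWORK_NAME = "YOLO"
    MODELS = "src/models.json"
    INFERENCE_SETTINGS = "src/inference_settings.yaml"
    APP_OPTIONS = "src/app_options.yaml"  # optional

    def load_model(
        self, model_files: dict, model_info: dict, model_source: str, device: str, runtime: str
    ):
        # Load the checkpoint and store the ready-to-use model (see Step 5)
        ...

    def predict(self, image_path: str, settings: dict):
        # Run inference on one image and return a list of sly.nn.PredictionBBox objects (see Step 5)
        ...
```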
Inheritance
Your custom class should inherit from the appropriate Supervisely base class, depending on the computer vision task your model solves. For example, if you're working on an object detection model, you should inherit from `sly.nn.inference.ObjectDetection`.
Available classes for inheritance
- `ObjectDetection`
- `InstanceSegmentation`
- `SemanticSegmentation`
- `PoseEstimation`
- `ObjectDetection3D`
- `InteractiveSegmentation`
- `SalientObjectSegmentation`
- `Tracking`
- `PromptBasedObjectDetection`
- `PromptableSegmentation`
Each of these classes implements the logic for converting model predictions (`sly.nn.Prediction` objects) to the Supervisely annotation format (`sly.Annotation`).
If there is no suitable class for your task, you can inherit from the base class `sly.nn.inference.Inference` and implement the methods responsible for converting predictions to Supervisely format. See the section Custom Task Type.
Class Variables
In your custom class, define class variables to specify the model framework, paths to model configurations (`models.json`), and inference settings (`inference_settings.yaml`).
Class variables:
- `FRAMEWORK_NAME`: Name of your model's framework or architecture.
- `MODELS`: Path to your `models.json` file.
- `INFERENCE_SETTINGS`: Path to your `inference_settings.yaml` settings file.
- `APP_OPTIONS`: (Optional) Path to the `app_options.yaml` file for additional customization.
Step 5. Implement Required Methods
The `load_model` Method
This method loads the model checkpoint and prepares it for inference. It runs after the user selects a model and clicks the "SERVE" button in the GUI.
Let's break down the `load_model` parameters. These parameters contain all the necessary information to load your model and weights:
- `model_files`: A dictionary containing paths to the files of the selected model. It has the same fields as `model_files` in your `models.json`. All paths are local paths; URLs are downloaded automatically.
- `model_info`: A dictionary containing information about the selected model configuration. If the user selected a pretrained checkpoint, the fields come from `models.json`; otherwise, it is a dict with the experiment info of a custom model trained in Supervisely.
- `model_source`: The source of the model (`Pretrained models` or `Custom model`). This can be used to determine where the model checkpoint comes from and helps to load the model properly.
- `device`: The device the user selected in the GUI (e.g., `cpu`, `cuda`, `cuda:1`).
- `runtime`: The runtime the user selected for inference (e.g., `pytorch`, `onnx`).
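As a sketch of how this could look for a YOLO-style model (the `ultralytics` usage and the `self.classes` attribute are assumptions for illustration, not the only way to do it):

```python
import supervisely as sly
from ultralytics import YOLO  # assumption: the checkpoint is an Ultralytics YOLO model


class CustomYOLO(sly.nn.inference.ObjectDetection):
    ...

    def load_model(
        self, model_files: dict, model_info: dict, model_source: str, device: str, runtime: str
    ):
        checkpoint_path = model_files["checkpoint"]  # already a local path (URLs are pre-downloaded)
        self.model = YOLO(checkpoint_path)           # load the checkpoint
        self.model.to(device)                        # move to the device selected in the GUI
        self.classes = list(self.model.names.values())  # class names, used later to build annotations
```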
The `predict` Method
This method pre-processes the input image, runs inference, and post-processes the outputs into the established prediction format.
1. Preprocess the input image: Read the image, resize it, normalize it, and convert it to a tensor, or do whatever preprocessing is necessary for your model.
2. Run the model inference: Pass the preprocessed image through the model and get the raw outputs.
3. Postprocess the outputs: Convert the raw outputs to Supervisely prediction objects. `sly.nn.Prediction` is the base class for this. Depending on your CV task, use the appropriate subclass: `sly.nn.PredictionBBox`, `sly.nn.PredictionMask`, etc.
Here is the list of available subclasses of `sly.nn.Prediction` for different computer vision tasks:
| Computer Vision Task | Prediction Class |
| --- | --- |
| Object Detection | `sly.nn.PredictionBBox` |
| Instance Segmentation | `sly.nn.PredictionMask` |
| Semantic Segmentation | `sly.nn.PredictionSegmentation` |
| Pose Estimation | `sly.nn.PredictionKeypoints` |
| Object Detection 3D | `sly.nn.PredictionCuboid3d` |
| Interactive Segmentation | `sly.nn.PredictionMask` |
| Tracking | `sly.nn.PredictionBBox` |
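For an object detection model, a rough sketch of `predict` could look like this (the `ultralytics`-style result parsing and the `self.classes` attribute are illustrative assumptions):

```python
import supervisely as sly


class CustomYOLO(sly.nn.inference.ObjectDetection):
    ...

    def predict(self, image_path: str, settings: dict):
        conf = settings.get("confidence_threshold", 0.25)  # value from inference_settings.yaml
        results = self.model(image_path, conf=conf)[0]     # raw framework outputs

        predictions = []
        for box in results.boxes:
            class_name = self.classes[int(box.cls)]
            score = float(box.conf)
            x1, y1, x2, y2 = box.xyxy.squeeze().tolist()
            # PredictionBBox takes the class name, the box as [top, left, bottom, right], and the score
            predictions.append(
                sly.nn.PredictionBBox(class_name, [int(y1), int(x1), int(y2), int(x2)], score)
            )
        return predictions
```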
If no suitable subclass is available, you can create your own Prediction class by inheriting from `sly.nn.Prediction` and convert your outputs to it. You will also need to override additional methods in your `Inference` class. See the section Custom Task Type.
Custom Task Type
If your model solves a computer vision task that is not covered by the available Inference subclasses, you have to implement additional methods responsible for converting predictions to Supervisely format and create your own Prediction class inheriting from `sly.nn.Prediction`.
Here are the methods you need to implement:
- `get_info` – add your `"task type"` to the dict (see the example code below).
- `_get_obj_class_shape` – specify the basic geometry class of what your model predicts (e.g., `sly.Rectangle`, `sly.Bitmap`, etc.).
- `_create_label` – this method takes a single predicted object (e.g., a bbox) from the list of predictions returned by your `predict` method and must convert it to a Supervisely label (`sly.Label`). The single predicted object is an instance of your custom `sly.nn.Prediction` class, and you need to convert it to a `sly.Label`.
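A rough sketch of these overrides (the task type string, the `dto.class_name`/`dto.bbox_tlbr` fields of your custom Prediction class, and the use of rectangles are all illustrative assumptions):

```python
import supervisely as sly


class MyCustomInference(sly.nn.inference.Inference):
    ...

    def get_info(self) -> dict:
        info = super().get_info()
        info["task type"] = "my custom task"  # illustrative task type name
        return info

    def _get_obj_class_shape(self):
        # the geometry your model predicts, e.g. axis-aligned boxes
        return sly.Rectangle

    def _create_label(self, dto) -> sly.Label:
        # `dto` is one item from the list returned by predict()
        obj_class = self.model_meta.get_obj_class(dto.class_name)
        geometry = sly.Rectangle(*dto.bbox_tlbr)  # map your prediction fields to a geometry
        return sly.Label(geometry, obj_class)
```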
Step 6. Create main.py script
Create an entrypoint script (`src/main.py`) that runs when the app starts. This script initializes your inference class and launches a FastAPI server using the `model.serve()` method.
Example `main.py`:
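A minimal sketch, assuming the class from Step 4 lives in `src/custom_yolo.py` and the GUI is enabled with the `use_gui` flag:

```python
from src.custom_yolo import CustomYOLO

model = CustomYOLO(use_gui=True)
model.serve()
```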
Run and Debug Your App
You can easily debug your code locally in your favorite IDE.
We recommend Visual Studio Code, because our repositories include prepared settings for convenient debugging in VS Code. It is the easiest way to start.
For VS Code users
You can use the following `launch.json` configuration to run and debug your app locally (place it in the `.vscode` directory):
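A minimal configuration might look like the sketch below; the `src.main:model.app` application path is an assumption about how your `main.py` exposes the FastAPI app, so adjust it to your project:

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Uvicorn Serve",
      "type": "python",
      "request": "launch",
      "module": "uvicorn",
      "args": ["src.main:model.app", "--host", "0.0.0.0", "--port", "8000"],
      "justMyCode": false,
      "env": { "PYTHONPATH": "${workspaceFolder}" }
    }
  ]
}
```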
Run the app locally
Run the code in the VS Code debugger by selecting the `Uvicorn Serve` configuration. This will start the app on http://localhost:8000.
You may need to install additional packages to debug the app locally:
Shell command to run the app:
If everything is set up correctly, you should be able to open the app in your browser at http://localhost:8000.
Test that inference works correctly
Serve your model by clicking the "SERVE" button in the GUI. After this, run the following code to test the model inference via the API using the `SessionJSON` class (see more details in the Inference API Tutorial).
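A minimal sketch (the image ID is a placeholder, and `session_url` is assumed to point at the locally running app):

```python
import supervisely as sly

api = sly.Api()  # reads SERVER_ADDRESS and API_TOKEN from the environment
session = sly.nn.inference.SessionJSON(api, session_url="http://localhost:8000")

print(session.get_session_info())
prediction = session.inference_image_id(image_id=123)  # placeholder image ID
print(prediction)
```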
Releasing Your App
Once you've tested the code, it's time to release it to the platform. It can be released as an app shared with the whole Supervisely community, or as your own private app.
Refer to How to Release your App for all the release details. For a private app, also check the Private App Tutorial.
In this tutorial, we'll quickly review the key concepts of our app.
Repository structure
The structure of the repository is the following:
Explanation:
- `src/main.py` – main inference script
- `src/models.json` – file with model configurations
- `src/inference_settings.yaml` – file with inference settings
- `src/app_options.yaml` – file with additional app options
- `README.md` – readme of your application; it is the main page of the application in Ecosystem with images, videos, and how-to-use guides
- `config.json` – configuration of the Supervisely application, which defines the name and description of the app, its context menu, icon, poster, and running settings
- `create_venv.sh` – creates a virtual environment and installs detectron2 and requirements
- `requirements.txt` – all needed packages; avoid using this file if possible, we recommend installing all dependencies in the Dockerfile
- `local.env` – file with env variables used for debugging
- `docker` – directory with the custom Dockerfile for this application and the script that builds it and publishes it to the docker registry
App configuration
App configuration is stored in the `config.json` file. A detailed explanation of all possible fields is covered in the Configuration Tutorial. Let's check the config for our current app:
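An illustrative sketch with placeholder values (the real name, docker image, and entrypoint are specific to your app):

```json
{
  "type": "app",
  "version": "2.0.0",
  "name": "Serve Custom YOLO",
  "description": "Deploy a custom YOLO model as a serving app",
  "categories": ["neural network", "images", "object detection", "serve"],
  "session_tags": ["deployed_nn"],
  "need_gpu": true,
  "community_agent": false,
  "docker_image": "your-registry/custom-yolo:1.0.0",
  "entrypoint": "python -m uvicorn src.main:model.app --host 0.0.0.0 --port 8000",
  "port": 8000,
  "headless": false
}
```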
Here is an explanation for the fields:
- `type` – type of the module in Supervisely Ecosystem
- `version` – version of the Supervisely App Engine. Just keep it by default
- `name` – the name of the application
- `description` – the description of the application
- `categories` – these tags are used to place the application in the correct category in Ecosystem
- `session_tags` – these tags will be assigned to every running session of the application. They can be used by other apps to find and filter all running sessions
- `"need_gpu": true` – should be true if you want to use any `cuda` devices
- `"community_agent": false` – this means that this app cannot be run on the agents started by the Supervisely team, so users have to connect their own computers and run the app only on their own agents. Only applicable in Community Edition. Enterprise customers use their private instances, so they can ignore this option
- `docker_image` – the Docker container will be started from the defined Docker image; the GitHub repository will be downloaded and mounted inside the container
- `entrypoint` – the command that starts our application in a container
- `port` – port inside the container
- `"headless": true` – means that the app has no user interface