Deploy & Predict with Supervisely SDK
This section shows how to use Python code together with the Supervisely SDK to automate deployment and inference in different scenarios and environments. Supervisely also has a convenient Prediction API that lets you deploy models and get predictions in a couple of lines of code (check our Tutorial). In this tutorial you will deploy models directly on your local machine. This is a more advanced variant that can differ slightly from one model to another, because you need to set up the Python environment yourself, but the main code for loading a model and getting predictions stays the same.
There are several approaches to deploying and applying your model locally:
Load and Predict in Your Code: Load your checkpoint and get predictions in your code or in a script.
Deploy Model as a Server: Deploy your model as a server on your machine, and interact with it through API requests.
Deploy in Docker Container: Deploy your model as a server in a docker container on your local machine.
Deploy Model as a Serving App with web UI: Deploy your model as a server with a web UI and interact with it through the API. Note: this feature is mostly for debugging and testing purposes.
Load and Predict in Your Code
This example shows how to load your checkpoint and get predictions in any of your code. RT-DETRv2 is used in this example, but the instructions are similar for other models.
1. Clone repository
Clone our RT-DETRv2 fork with the model implementation.
git clone https://github.com/supervisely-ecosystem/RT-DETRv2
cd RT-DETRv2
2. Set up environment
Install requirements.txt manually, or use our pre-built docker image (DockerHub | Dockerfile). Additionally, you need to install the Supervisely SDK.
pip install -r rtdetrv2_pytorch/requirements.txt
pip install supervisely
3. Download checkpoint
Download your checkpoint and model files from Team Files.

4. Predict
Create main.py file in the root of the repository and paste the following code:
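The original snippet is not reproduced in this export, so the sketch below is a reconstruction rather than verified code: the serving-class import path, the `load_custom_checkpoint` and `predict_benchmark` calls, and the file names under `model/` are assumptions based on the repository's Supervisely integration — compare against the repo's own serving code before use.

```python
import numpy as np
from PIL import Image
import supervisely as sly

# NOTE: import path and class name are assumptions -- check the repository
# for the actual location of the serving class.
from supervisely_integration.serve.rtdetrv2 import RTDETRv2

IMAGE_PATH = "image.jpg"  # put your path to the image here

# Checkpoint and config downloaded from Team Files (file names are illustrative)
model_files = {
    "checkpoint": "model/best.pth",
    "config": "model/model_config.yml",
}
model_meta = sly.json.load_json_file("model/model_meta.json")

# Load the model (method name is an assumption)
model = RTDETRv2()
model.load_custom_checkpoint(model_files, model_meta)

# Read the image and predict (batch of one image)
img = np.array(Image.open(IMAGE_PATH).convert("RGB"))
ann = model.predict_benchmark([img], settings={})[0]

# Draw the predicted annotation and save the visualization
ann.draw_pretty(img)
Image.fromarray(img).save("prediction.jpg")
```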
This code will load the model, predict the image, and save the result to prediction.jpg.
If you need to run the code from your own project rather than from the root of the repository, you can add the path to the repository to PYTHONPATH, or add the following lines at the beginning of the script:
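For example, at the top of your script (the repository location is illustrative — adjust it to where you cloned the repo):

```python
import os
import sys

# Path to the cloned RT-DETRv2 repository -- adjust to your location
REPO_ROOT = os.path.abspath("RT-DETRv2")

# Prepend the repo so its modules are importable from anywhere
if REPO_ROOT not in sys.path:
    sys.path.insert(0, REPO_ROOT)
```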
Deploy Model as a Server
In this variant, you will deploy a model locally as an API server with the help of the Supervisely SDK. The server will be ready to process API requests for inference. It allows you to predict local images, folders, videos, or remote Supervisely projects and datasets (if you provide your Supervisely API token).
1. Clone repository
Clone our RT-DETRv2 fork with the model implementation.
2. Set up environment
Install requirements manually, or use our pre-built docker image (DockerHub | Dockerfile).
3. Download checkpoint (optional)
Download your checkpoint, model files and experiment_info.json from Team Files or the whole artifacts directory.

You can place the downloaded files in a folder within the app repo. For example, create a models folder inside the root directory of the repository and place all the files there.
Your repo should look like this:
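An illustrative layout (file names will depend on your experiment):

```
RT-DETRv2/
├── models/
│   ├── best.pth              # your checkpoint
│   ├── model_config.yml      # model config
│   └── experiment_info.json
├── rtdetrv2_pytorch/
└── main.py
```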
4. Deploy
To deploy, use the main.py script to start the server. You need to pass the path to your checkpoint file or the name of a pretrained model via the --model argument. As in the previous example, you need to add the path to the repository to PYTHONPATH.
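For example, to serve a pretrained model (the model name is illustrative — see models.json for valid names):

```shell
cd RT-DETRv2
export PYTHONPATH="$PWD:$PYTHONPATH"
python main.py deploy --model rtdetrv2_r18vd --device cuda
```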
This command will start the server on http://0.0.0.0:8000, ready to accept API requests for inference.
Arguments description
- `mode` - (required) mode of operation, can be `deploy` or `predict`.
- `--model` - name of a model from the pre-trained models table (see models.json), or a path to your custom checkpoint file (either a local path or a remote path in Team Files). If not provided, the first model from the models table will be loaded.
- `--device` - device to run the model on, can be `cpu` or `cuda`.
- `--runtime` - runtime to run the model on, can be `PyTorch`, `ONNXRuntime` or `TensorRT` if supported.
- `--settings` - inference settings, can be a path to a `.json`, `.yaml` or `.yml` file, or a list of key-value pairs, e.g. `--settings confidence_threshold=0.5`.
For custom model use the path to the checkpoint file:
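For example (the checkpoint path is illustrative):

```shell
python main.py deploy --model ./models/best.pth --device cuda
```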
If you are a VSCode user you can use the following configurations for your launch.json file:
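A configuration along these lines could work (the argument values are illustrative; `"type": "debugpy"` assumes the current VS Code Python debugger extension):

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Deploy model server",
      "type": "debugpy",
      "request": "launch",
      "program": "${workspaceFolder}/main.py",
      "args": ["deploy", "--model", "models/best.pth", "--device", "cuda"],
      "env": { "PYTHONPATH": "${workspaceFolder}" },
      "console": "integratedTerminal"
    }
  ]
}
```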
5. Predict
After the model is deployed, use the Supervisely Inference Session API, setting the server address to http://0.0.0.0:8000.
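For example, using the SDK's inference Session (method names follow the Inference Session API docs — verify against your SDK version; the image path is illustrative):

```python
import supervisely as sly

# sly.Api() reads SERVER_ADDRESS and API_TOKEN from the environment;
# the token is only needed if you predict data stored on the platform.
api = sly.Api()

# Connect to the locally deployed model server
session = sly.nn.inference.Session(api, session_url="http://0.0.0.0:8000")

# Run inference on a local image; the result is a Supervisely annotation
ann = session.inference_image_path("image.jpg")
print(ann)
```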
Predict with CLI
Instead of using the Session, you can deploy and predict in one command.
You can predict both local images and data on the Supervisely platform. By default, predictions are saved to the ./predictions directory; you can change this with the --output argument.
To predict data on the platform use one of the following arguments:
- `--project_id` - id of a Supervisely project to predict. If `--upload` is used, a new project with predictions will be created on the platform.
- `--dataset_id` - id(s) of Supervisely dataset(s) to predict, e.g. `--dataset_id "505,506"`. If `--upload` is used, a new project with predictions will be created on the platform.
- `--image_id` - id of a Supervisely image to predict. If `--upload` is passed, the prediction will be added to the provided image.
You can specify additional settings:
- `--output` - a local directory where predictions will be saved.
- `--upload` - upload predictions to the platform. Works only with `--project_id`, `--dataset_id`, `--image_id`.
- `--draw` - save images with prediction visualization in `--output-dir`. Works only with `input` and `--image_id`.
Example to predict with CLI arguments:
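For example (paths and values are illustrative):

```shell
python main.py predict ./image.jpg \
  --model ./models/best.pth \
  --device cuda \
  --settings confidence_threshold=0.5 \
  --output ./predictions
```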
Deploy in Docker Container
Deploying in a Docker Container is similar to deployment as a Server. This example is useful when you need to run your model on a remote machine or in a cloud environment.
First, pull the docker image to your local machine:
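The image name and tag below are assumptions — use the actual name from the DockerHub page linked above:

```shell
docker pull supervisely/rt-detrv2:latest
```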
Use this docker run command to deploy a model in a docker container (RT-DETRv2 example)
For pretrained:
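An illustrative command (image and model names are assumptions):

```shell
docker run --rm -it --gpus all -p 8000:8000 \
  supervisely/rt-detrv2:latest \
  deploy --model rtdetrv2_r18vd
```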
For custom:
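An illustrative command for a custom checkpoint (image name and paths are assumptions):

```shell
docker run --rm -it --gpus all -p 8000:8000 \
  -v "./47688_RT-DETRv2:/model" \
  supervisely/rt-detrv2:latest \
  deploy --model /model/checkpoints/best.pth
```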
Set your own path to the artifacts directory in place of ./47688_RT-DETRv2 in the -v argument, and put your path to the checkpoint file in the --model argument (it can be either a local path or a remote path in Team Files). This will start a FastAPI server and load the model for inference. The server will be available at http://localhost:8000.
docker compose
You can also use docker-compose.yml file for convenience:
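A sketch of such a file (image name, paths, and model are assumptions; the `deploy.resources` block requests GPU access via the NVIDIA runtime):

```yaml
services:
  rtdetrv2:
    image: supervisely/rt-detrv2:latest   # illustrative image name
    ports:
      - "8000:8000"
    volumes:
      - ./47688_RT-DETRv2:/model
    command: deploy --model /model/checkpoints/best.pth
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```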
Predict
After the model is deployed, you can use the Session object for inference (Inference Session API) or use CLI arguments to get predictions.
Deploy and Predict with CLI arguments
You can use the same arguments as in the previous deploy and predict sections when running the docker container.
Example to deploy model as a server:
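An illustrative command (image and model names are assumptions):

```shell
docker run --rm -it --gpus all -p 8000:8000 \
  supervisely/rt-detrv2:latest \
  deploy --model rtdetrv2_r18vd
```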
Example to predict with CLI arguments:
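An illustrative command (image name, paths, and model are assumptions; the data directory is mounted so the container can read the image and write predictions):

```shell
docker run --rm -it --gpus all \
  -v "./data:/data" \
  supervisely/rt-detrv2:latest \
  predict /data/image.jpg --model rtdetrv2_r18vd --output /data/predictions
```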
Deploy Model as a Serving App with web UI
In this variant, you will run a full Serving App with a web UI, in which you can deploy a model. This is useful for debugging and testing purposes, for example, when you're integrating your Custom Inference App with the Supervisely Platform.
Follow the steps from the previous section, but instead of running the server, you need to run the following command:
Deploy
After the app is started, you can open the web UI at http://localhost:8000 and deploy a model through the web interface.
Predict
Use the same Session API to get predictions, with the server address http://localhost:8000.