Inference & Deployment
This section covers deploying trained models and running inference.
In general, there are three ways to deploy and apply your trained model:
Supervisely Serving Apps within the Platform: a simple, user-friendly way to deploy a model through a convenient web UI.
Deploy with Supervisely Python SDK: use the SDK to automate model deployment and get predictions in your own code.
Using Standalone PyTorch Models: you can always download a plain PyTorch checkpoint and use it outside the Supervisely infrastructure in your code, or download its exported ONNX / TensorRT versions.
Supervisely Serving Apps
This is the most user-friendly option. Deploy your model via Supervisely Serving Apps, such as Serve YOLOv11, Serve RT-DETRv2, or Serve SAM 2.1, then apply it using applying apps, such as Apply NN to Images, Apply NN to video, or NN Image Labeling.
See more information in Supervisely Serving Apps.
Deploy with Supervisely Python SDK
This method uses Python code together with the Supervisely SDK to automate deployment and obtain predictions in different scenarios and environments. You can deploy your models either inside the Supervisely Platform (on an agent) or outside the platform, directly on your local machine.
If you need to completely decouple your code from Supervisely SDK, see Using Standalone PyTorch Models (mostly for data scientists).
In-Platform Model Deployment: When you deploy inside the Supervisely Platform, your model becomes part of the complete Supervisely Ecosystem. It is visible on the platform and has its own ID through which other applications can interact with it, get predictions, and include it in a unified ML workflow. The platform also tracks all changes to models and data, preserving the entire history of your ML operations and experiments for reproducibility and exploration. See In-Platform Model Deployment using Supervisely SDK.
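As a minimal sketch of what this looks like in code, you can attach to a model that is already deployed on the platform through the SDK's inference Session; the task ID and image path below are placeholders:

```python
import supervisely as sly

# Connect to the platform (reads the server address and API token from the environment)
api = sly.Api.from_env()

# Attach to an already running serving app by its task ID
# (visible in the platform UI; 12345 is a placeholder)
session = sly.nn.inference.Session(api, task_id=12345)

# Run inference on a local image; the result is a Supervisely annotation
prediction = session.inference_image_path("path/to/image.jpg")
print(prediction)
```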
Local Deployment: When deploying outside the Supervisely Platform, you lose the advantages of the Ecosystem but gain more freedom in how your model is deployed. You can deploy it yourself on any machine with a single script. This is a more advanced option that won't suit everyone. See Local Deployment using Supervisely SDK.
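A rough sketch of local inference can follow the same Session pattern. Note the assumptions here: it presumes a serving app is already running locally and that your SDK version supports connecting by URL; the `session_url` argument and port are placeholders, so check the Local Deployment guide for the exact steps:

```python
import supervisely as sly

# Assumes a serving app has been started locally (e.g., by cloning the model's
# repo and running its serving script per the README) and listens on localhost.
# The `session_url` argument and port are assumptions; see the Local Deployment
# guide for the exact invocation in your SDK version.
session = sly.nn.inference.Session(session_url="http://localhost:8000")

prediction = session.inference_image_path("path/to/image.jpg")
print(prediction)
```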
In summary:
In-Platform Model Deployment:
Model becomes integrated into Supervisely Ecosystem
Gets unique ID for platform-wide access
Other applications can interact with your model
Automatic version tracking of models and data
Full ML operations history is preserved for reproducing and analyzing experiments
Easy integration into ML pipelines
Local Deployment:
More flexibility in development
Deploy on any server or machine by yourself
Less integrated with the platform; no Ecosystem benefits
Advanced option for specific use cases
For each option, there are several ways you can deploy the model. See the In-Platform Deployment and Local Deployment sections for more details.
Using Standalone PyTorch Models
This is the most advanced way. It completely decouples you from both the Supervisely Platform and the Supervisely SDK; you develop your own code for inference and deployment. Keep in mind that each neural network and framework has its own installation instructions and input/output formats, so you'll need to set up the environment and write inference code yourself. In many cases, we provide examples of using a model as a standalone PyTorch model; you can find these guidelines in the GitHub repository of the corresponding model, for example, the RT-DETRv2 Demo.
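For illustration only, standalone inference follows the usual PyTorch pattern. In this sketch, torchvision's Faster R-CNN stands in for whatever architecture you trained, since the real model class, checkpoint layout, and preprocessing come from the corresponding framework's repository:

```python
import torch
import torchvision

# torchvision's Faster R-CNN stands in for your trained architecture; with a
# checkpoint downloaded from Supervisely you would instead build the model from
# the framework's own repo and load your weights via load_state_dict.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# checkpoint = torch.load("your_checkpoint.pth", map_location="cpu")
# model.load_state_dict(checkpoint["state_dict"])  # key layout varies by framework

with torch.no_grad():
    image = torch.rand(3, 640, 640)  # replace with a real, preprocessed image
    outputs = model([image])         # detection models take a list of 3D tensors

print(outputs[0]["boxes"].shape, outputs[0]["labels"].shape)
```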
See more information in Using Standalone PyTorch Models.