Deploy & Predict with Supervisely SDK
This section covers using Python code together with the Supervisely SDK to automate deployment and inference in different scenarios and environments. You can deploy your models either inside the Supervisely Platform (on an agent) or outside the platform, directly on your local machine. See the difference in Overview.
In-platform deployment is similar to manually launching a Serving App on the Supervisely Platform. With the Python SDK you can automate this.
This method only works for your models trained in Supervisely and stored in Team Files. It also requires Supervisely SDK version 6.73.305 or higher.
Here's how to do it:
Install supervisely SDK if not installed.
Go to Team Files and copy the path to your model artifacts (artifacts_dir).
Run this code to deploy a model on the platform. Don't forget to fill in your workspace_id and artifacts_dir.
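A minimal sketch of this step. Note that the name of the deployment helper is an assumption here and may differ between SDK versions; check the SDK reference for the exact API:

```python
import supervisely as sly

# Reads SERVER_ADDRESS and API_TOKEN from environment variables (or supervisely.env)
api = sly.Api.from_env()

workspace_id = 123                        # TODO: your workspace ID
artifacts_dir = "/path/to/artifacts_dir"  # TODO: path to your artifacts in Team Files

# ASSUMPTION: `deploy_custom_model` is used here for illustration only -
# the deployment helper name can differ between SDK versions.
task_id = api.nn.deploy_custom_model(
    workspace_id=workspace_id,
    artifacts_dir=artifacts_dir,
)
print(f"Model deployed in task {task_id}")
```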
Any model deployed on the platform (both manually and through code) works as a service and can accept API requests for inference. After you deploy a model on the platform, connect to it and get predictions using the Session class:
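For example (task_id and image_id below are placeholders for your values):

```python
import supervisely as sly

api = sly.Api.from_env()

task_id = 12345   # ID of the running model session on the platform
image_id = 67890  # ID of an image in Supervisely

# Connect to the deployed model
session = sly.nn.inference.Session(api, task_id=task_id)

# Check that the model is up and inspect its settings
print(session.get_session_info())

# Get a prediction for an image stored on the platform (returns sly.Annotation)
prediction = session.inference_image_id(image_id)
print(prediction)
```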
Learn more about SessionAPI in the Inference API Tutorial.
In this section you will deploy a model locally on your machine, outside of the Supervisely Platform. When deploying outside the platform, you lose the advantages of the Ecosystem, but you gain more freedom in how the model is used in your code. This is a more advanced variant, and the details can differ slightly from one model to another because you need to set up the Python environment yourself, but the main code for loading the model and getting predictions stays the same.
There are several variants of how you can use a model locally:
Load and Predict in Your Code: Load your checkpoint and get predictions in your code or in a script.
Deploy Model as a Server: Deploy your model as a server on your machine, and interact with it through API requests.
🐋 Deploy in Docker Container: Deploy model as a server in a docker container on your local machine.
Deploy Model as a Serving App with web UI: Deploy model as a server with a web UI and interact with it through the API. This option is mostly for debugging and testing purposes.
This example shows how to load your checkpoint and get predictions in your own code. RT-DETRv2 is used in this example, but the instructions are similar for other models.
Clone our RT-DETRv2 fork with the model implementation.
Install the dependencies from requirements.txt manually, or use our pre-built docker image (DockerHub | Dockerfile). Additionally, you need to install the Supervisely SDK.
Download your checkpoint and model files from Team Files.
Create a main.py file in the root of the repository and paste the following code:
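A minimal sketch of what main.py might look like. The import path, class name, and call signatures below are assumptions made for illustration; check the fork's supervisely_integration/serve module for the actual serving class and API:

```python
# ASSUMPTION: the import path and class name are illustrative only -
# look up the real serving class in the fork (supervisely_integration/serve).
from supervisely_integration.serve.rtdetrv2 import RTDETRv2

# Load the checkpoint and model files downloaded from Team Files
model = RTDETRv2(
    model="models/checkpoint.pth",  # local path to your checkpoint (placeholder)
    device="cuda",                  # or "cpu"
)

# Predict a local image (hypothetical call signature)
predictions = model(input="image.jpg")

# Draw the predictions and save the visualization (hypothetical helper)
for pred in predictions:
    pred.visualize(save_path="prediction.jpg")
```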
This code will load the model, predict the image, and save the result to prediction.jpg.
If you need to run the code from your own project rather than from the root of the repository, you can add the path to the repository to PYTHONPATH, or add the following lines at the beginning of the script:
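For example (adjust the path to your clone of the repository):

```python
import os
import sys

# Make the cloned RT-DETRv2 repository importable (placeholder path)
sys.path.append(os.path.abspath("/path/to/RT-DETRv2"))
```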
In this variant, you will deploy a model locally as an API server with the help of the Supervisely SDK. The server will be ready to process API requests for inference. It allows you to predict local images, folders, videos, or remote Supervisely projects and datasets (if you provide your Supervisely API token).
Clone our RT-DETRv2 fork with the model implementation.
Install requirements manually, or use our pre-built docker image (DockerHub | Dockerfile).
You can skip this step and pass a remote path to the checkpoint in Team Files instead.
Download your checkpoint, model files, and experiment_info.json from Team Files, or the whole artifacts directory.
You can place the downloaded files in a folder within the app repo; for example, create a models folder inside the root directory of the repository and place all files there.
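A minimal sketch for downloading the whole artifacts directory with the SDK (team_id and remote_dir are placeholders for your values):

```python
import supervisely as sly

api = sly.Api.from_env()

team_id = 123                          # TODO: your team ID
remote_dir = "/path/to/artifacts_dir"  # TODO: artifacts directory in Team Files

# Download the whole artifacts directory (checkpoint, model files,
# experiment_info.json) into a local "models" folder
api.file.download_directory(team_id, remote_dir, "models")
```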
Your repo should look like this:
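For example (file names are illustrative, yours may differ):

```
RT-DETRv2/
├── models/
│   ├── checkpoint.pth          # your checkpoint
│   ├── model_config.yml        # model files
│   └── experiment_info.json
├── main.py
├── requirements.txt
└── ...
```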
To deploy, use the main.py script to start the server. You need to pass the path to your checkpoint file or the name of a pretrained model using the --model argument. As in the previous example, you need to add the path to the repository to PYTHONPATH.
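For example (the checkpoint path is a placeholder):

```bash
cd RT-DETRv2
export PYTHONPATH="$PWD:$PYTHONPATH"
python main.py --model models/checkpoint.pth
```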
This command will start the server on http://0.0.0.0:8000, ready to accept API requests for inference.
If you are a VSCode user you can use the following configurations for your launch.json file:
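A possible configuration; the program path and the --model value are placeholders to adjust to your setup:

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      // NOTE: program path and args are placeholders - adjust to your setup
      "name": "Deploy model as a server",
      "type": "debugpy",
      "request": "launch",
      "program": "${workspaceFolder}/main.py",
      "args": ["--model", "models/checkpoint.pth"],
      "env": { "PYTHONPATH": "${workspaceFolder}" },
      "console": "integratedTerminal",
      "justMyCode": false
    }
  ]
}
```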
After the model is deployed, use the Supervisely Inference Session API with the server address set to http://0.0.0.0:8000.
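For example, a minimal sketch that assumes the Session class accepts a session_url argument for connecting by address (check the Inference API Tutorial for the exact parameter):

```python
import supervisely as sly

api = sly.Api.from_env()

# ASSUMPTION: the `session_url` argument name may differ between SDK versions
session = sly.nn.inference.Session(api, session_url="http://0.0.0.0:8000")

# Predict a local image (returns sly.Annotation)
prediction = session.inference_image_path("image.jpg")
print(prediction)
```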
Instead of writing code for inference, you can use CLI arguments to get predictions right after the model is loaded. The following arguments are available:
--model - (required) a path to your local checkpoint file or a remote path in Team Files. It can also be the name of a pre-trained model from the models.json file.
--predict-project - ID of a Supervisely project to predict. A new project with predictions will be created on the platform.
--predict-dataset - ID(s) of Supervisely dataset(s) to predict. A new project with predictions will be created on the platform.
--predict-image - path to a local image or an image ID in Supervisely.
The server will shut down automatically after the prediction is done.
Example usage:
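For example (checkpoint path, image path, and project ID are placeholders):

```bash
# Predict a local image with a local checkpoint
python main.py --model models/checkpoint.pth --predict-image image.jpg

# Predict a Supervisely project by ID (requires your Supervisely API token)
python main.py --model models/checkpoint.pth --predict-project 12345
```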
Deploying in a Docker Container is similar to deployment as a Server. This example is useful when you need to run your model on a remote machine or in a cloud environment.
Use this docker run command to deploy a model in a docker container (RT-DETRv2 example):
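A sketch of such a command; the image name, mount paths, and entrypoint below are placeholders, take the actual values from the fork's README and Dockerfile:

```bash
# Image name, mounts, and entrypoint are placeholders - see the fork's README/Dockerfile
docker run --rm -it \
  --gpus all \
  -p 8000:8000 \
  -v "$PWD/models:/app/models" \
  <pre-built-rt-detrv2-image> \
  python main.py --model /app/models/checkpoint.pth
```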
Put the path to your checkpoint file in the --model argument (it can be either a local path or a remote path in Team Files). This will start a FastAPI server and load the model for inference. The server will be available at http://localhost:8000.
You can also use a docker-compose.yml file for convenience:
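A sketch of such a file; the image name, mounts, and command are again placeholders to adapt:

```yaml
# Placeholders only: take the actual image name, mounts, and command
# from the fork's README and Dockerfile.
services:
  rtdetrv2:
    image: <pre-built-rt-detrv2-image>
    ports:
      - "8000:8000"
    volumes:
      - ./models:/app/models
    command: python main.py --model /app/models/checkpoint.pth
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```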
After the model is deployed, you can use the Session object for inference (Inference Session API) or use CLI arguments to get predictions.
In this case, the model will make predictions right after it is loaded, and the server will shut down automatically after the inference is done.
The following arguments are available:
--model - (required) a path to your local checkpoint file or a remote path in Team Files. It can also be the name of a pre-trained model from the models.json file.
--predict-project - ID of a Supervisely project to predict. A new project with predictions will be created on the platform.
--predict-dataset - ID(s) of Supervisely dataset(s) to predict. A new project with predictions will be created on the platform.
--predict-image - path to a local image or an image ID in Supervisely.
In this variant, you will run a full Serving App with web UI, in which you can deploy a model. This is useful for debugging and testing purposes, for example, when you're integrating your Custom Inference App with the Supervisely Platform.
Follow the steps from the previous section, but instead of running the server, you need to run the following command:
After the app is started, you can open the web UI at http://localhost:8000 and deploy a model through the web interface.
Use the same Session API to get predictions, with the server address set to http://localhost:8000.