Inference
Supervisely supports a few inference modes. These modes differ from each other in what image area will be fed to the NN.
{
"model": {
"gpu_device": 0
},
"mode": {
"name": "full_image",
"model_classes": {
"save_classes": "__all__",
"add_suffix": "_unet"
}
}
}
model
- group contains unique settings for each model:gpu_device
- device to use for inference. Right now we support only single GPU.
mode
- group contains all mode settings:name
- mode name defines how to apply NN to image (e.g.full_image
- apply NN to full image)model_classes
- which classes will be used, e.g. NN produces 80 classes and you are going to use only a few and ignore others. In that case you should setsave_classes
field with the list of interested class names.add_suffix
string will be added to new class to prevent having same class names as already exisiting classes in the project. If you are going to use all model classes just set"save_classes": "__all__"
.
For example you have a project where each person is segmented (class 'person'). And you are going to apply a new model to this project to also segment humans, but you want to save both annotations: original and new predictions. Here is the appropriate config:
{
"model": {
"gpu_device": 0
},
"mode": {
"name": "full_image",
"model_classes": {
"save_classes": ["person"],
"add_suffix": "_unet"
}
}
}
All new annotations after NN inference will have class
person_unet
.You can apply NN to image areas that are defined by object bounding boxes. Instead of applying NN to the whole image, it will be applied to a few image parts. It may be very usefull when you are going to apply a few NNs in a sequence.
For example, you are going to segment person instances. But you don't want to use Mask-RCNN (due to low segmentation quality near object edges). Instead you apply Faster-RCNN to detect all persons, and then apply your custom UNet model to segment each person.

{
"model": {
"gpu_device": 0
},
"mode": {
"name": "bboxes",
"from_classes": ["person_bbox"],
"padding": {
"left": "5%",
"top": "5%",
"right": "5%",
"bottom": "5%"
},
"save": true,
"add_suffix": "_input_bbox",
"model_classes": {
"save_classes": ["person_bbox"],
"add_suffix": "_unet"
}
}
}
Many of the fields are similar to those that were already described in the "Full image" chapter. Here is the explanation for the new ones.
mode
- group contains all mode settings:name
:bboxes
- apply NN to bounding boxes of specified objectsfrom_classes
- list of classes. All objects of these classes will be used for inference: 1. get object bounding box, 2. apply padding to bounding box (slightly encrease its size), 3. feed the image area that is defined with the bounding box to the NN.padding
- how to increase input bounding box. Possible values examples:"5%"
or"15px"
save
- save input bounding box after inference if set astrue
add_suffix
- suffix for input bounding box
This mode allow to use only a part of the input image.
{
"model": {
"gpu_device": 0
},
"mode": {
"name": "roi",
"bounds": {
"left": "10%",
"top": "50%",
"right": "25%",
"bottom": "0%"
},
"save": false,
"class_name": "inference_roi",
"model_classes": {
"save_classes": "__all__",
"add_suffix": "_unet"
}
}
}
Many of the fields are similar to those that were already described in the "Full image" chapter. Here is the explanation for the new ones.
mode
- group contains all mode settings:name
:roi
- apply NN to the image part defined bybounds
bounds
- defines the relevant part of the input image. Below is the graphical explanation.save
- save input bounding box after inference if set astrue
class_name
- save input bounding box with given class name if the above optionsave
istrue
.

The mode allows to perform inference on (possibly overlapping) parts of the image. It is useful for getting inference results on large or relatively large images without splitting them and stitching.

Sliding window of fixed size moves on the source image and covers it entirely. In each position the model is applied to the part of the image bounded by the window. After processing the whole image results from different positions are aggregated to get the final prediction.

Now we support the modes for image segmentation and detection models only.
Segmentation results are aggregated by averaging output probabilities.
{
"model": {
"gpu_device": 0
},
"mode": {
"name": "sliding_window",
"window": {
"width": 1000,
"height": 750
},
"min_overlap": {
"x": 600,
"y": 600
},
"save": true,
"class_name": "sl_window",
"model_classes": {
"save_classes": "__all__",
"add_suffix": "_unet"
}
}
}
mode
- group contains all mode settings:name
:sliding_window
- selects the sliding window mode for segmentation modelswindow
- defines fixed size of sliding window in pixels; crop with the size will be passed to network (note that in different NN implementations it would be resized to the fixed network input after that )min_overlap
- defines minimal overlap in pixels between different positions of the sliding window; real overlap may be greater to cover the source image entirelysave
- save or not bounds of each sliding window positionclass_name
- class name for the sliding window bounds

Detection results can be merged by special postprocessing algorithm (described below).
{
"model": {
"gpu_device": 0,
"confidence_tag_name": "confidence"
},
"mode": {
"name": "sliding_window_det",
"window": {
"width": 416,
"height": 416
},
"min_overlap": {
"x": 100,
"y": 100
},
"save": true,
"class_name": "sl_window",
"nms_after": {
"enable": true,
"iou_threshold": 0.4,
"confidence_tag_name": "confidence"
},
"model_classes": {
"save_classes": "__all__",
"add_suffix": "_yolo"
}
}
}
Many of the fields are similar to those that were already described in above chapters. Here is the explanation for new ones.
model
- group contains unique settings for each model:confidence_tag_name
- name of the confidence tag for predicted bound boxes.
mode
- group contains all mode settings:name
:sliding_window_det
- selects the sliding window mode for detection modelsnms_after
- defines how to apply Non-Maximum-Suppression postprocessing algorithm:enable
- settrue
to enable postprocessing, orfalse
elsewise.iou_threshold
- minimal Intersection over Union value for merging 2 intersected boxes with same classes.confidence_tag_name
- name of confidence tag. (Most always be equal value ofmodel -> confidence_tag_name
)
Last modified 3yr ago