Object Detection with YOLO26
Overview
YOLO26 is the latest version of the YOLO model family developed by Ultralytics. It delivers solid accuracy in real-time computer vision tasks and is designed for easy deployment. YOLO26 is integrated into the Supervisely Ecosystem in the form of the Train YOLO v8-26 and Serve YOLO v8-26 apps, for training and inference respectively.

What's New
DFL Elimination
While effective, the Distribution Focal Loss (DFL) module added complexity during export and reduced hardware compatibility. YOLO26 removes DFL entirely, streamlining inference and expanding support for edge and low-power devices.
True End-to-End, NMS-Free Inference
Unlike conventional detectors that depend on NMS as a post-processing step, YOLO26 operates as a fully end-to-end model. It produces predictions directly, lowering latency and enabling faster, lighter, and more robust production deployment.
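For illustration, here is a minimal inference sketch using the Ultralytics Python API; the checkpoint name yolo26n.pt is an assumption based on Ultralytics naming conventions, so substitute the variant you actually use.

```python
# Minimal sketch: end-to-end YOLO26 inference via the Ultralytics API.
# "yolo26n.pt" is an assumed checkpoint name; pick the variant you need.
from ultralytics import YOLO

model = YOLO("yolo26n.pt")
# Predictions come out of the model directly - no NMS post-processing step.
results = model.predict("image.jpg", conf=0.25)
for box in results[0].boxes:
    print(int(box.cls), float(box.conf), box.xyxy.tolist())
```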
ProgLoss + STAL
Enhanced loss designs boost overall detection accuracy, with especially strong gains in small-object detection—crucial for IoT, robotics, aerial imagery, and other edge-focused use cases.
MuSGD Optimizer
YOLO26 introduces MuSGD, a hybrid optimizer that blends SGD with Muon. Inspired by Moonshot AI’s Kimi K2, it brings advanced optimization techniques from LLM training into computer vision, resulting in more stable training and quicker convergence.
Up to 43% Faster CPU Inference
Purpose-built for edge computing, YOLO26 achieves major CPU inference speedups, delivering real-time performance even on devices without GPUs.
Enhanced Instance Segmentation
Adds semantic segmentation loss to improve convergence and upgrades the proto module to exploit multi-scale features, producing higher-quality segmentation masks.
Training YOLO26 in Supervisely
This section shows how to train YOLO26 using the Train YOLO v8-26 app inside the Supervisely Ecosystem.
Prerequisites
A Supervisely project with annotated images for object detection or instance segmentation.
A running agent with GPU access (recommended for training).
We will use the Surgical Tools dataset from Dataset Ninja as an example.

Step 1: Launch the Training App
Open the App Ecosystem.
Choose the Train YOLO v8-26 app.
Select the images project for model training.

Alternatively, you can launch the training app from the context menu of your images project:

Step 2: Choose YOLO26 Model Variant and Task Type
Choose a YOLO26 pretrained checkpoint, either from Ultralytics (COCO) or from your previous experiment in Team Files, adjusting the variant to your preferred balance of speed, accuracy, and hardware performance.

Step 3: Select Annotation Classes for Training
Pick the subset of classes that YOLO26 should learn (you can train on all classes or only some of them).

Step 4: Configure Dataset Split
Select a suitable data split method (random / based on item tags / based on datasets / based on collections).

Step 5: Configure Training Hyperparameters
Set training parameters such as batch size, number of epochs, learning rate and many others.

Step 6: Run Training
Enter an experiment name and click Start to launch training. Monitor training progress epoch by epoch in the app UI.

Click Open Tensorboard to monitor key performance metrics and loss function values.

Once training finishes, links to checkpoints and logs will appear in the app UI; they are stored under Team Files → experiments → <your_experiment_name>.

If the model benchmark option was enabled in the training settings, a model performance evaluation report will be generated at the end of the training session.
The model benchmark produces a comprehensive report on model performance from different angles: general metrics, per-class metrics, optimal confidence threshold estimation, and much more.

Recall vs. Precision
This section compares Precision and Recall in one graph, highlighting any imbalance between the two.
Bars in the chart are sorted by F1-score to keep a unified order of classes between different charts.

Frequently Confused Classes
This chart displays the most frequently confused pairs of classes. In essence, it reveals which classes look very similar to the model.
The chart calculates the probability of confusion between different pairs of classes. For instance, if the probability of confusion for the pair "straight mayo scissor - curved mayo scissor" is 0.17, this means that when the model predicts either "straight mayo scissor" or "curved mayo scissor", there is a 17.0% chance that the model might mistakenly predict one instead of the other.
The measure is class-symmetric, meaning that the probability of confusing a straight mayo scissor with a curved mayo scissor is equal to the probability of confusing a curved mayo scissor with a straight mayo scissor.
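As a toy illustration of this measure (a sketch based on the description above, with made-up counts, not Supervisely's exact implementation):

```python
# Toy sketch of the symmetric confusion probability described above.
# All counts are hypothetical.
straight_predicted_as_curved = 12
curved_predicted_as_straight = 9
total_predictions_of_either_class = 123

p_confusion = (straight_predicted_as_curved + curved_predicted_as_straight) \
    / total_predictions_of_either_class
print(f"probability of confusion: {p_confusion:.2f}")  # ~0.17
```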

Confidence Score Profile
This section goes deeper into analyzing confidence scores. It gives you an intuition about how these scores are distributed and helps you find the confidence threshold best suited to your task or application.
F1-optimal confidence threshold = 0.2737
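Conceptually, this threshold is simply the confidence value that maximizes F1 = 2PR / (P + R). A minimal sketch with placeholder precision/recall curves:

```python
# Sketch: picking the F1-optimal confidence threshold.
# The precision/recall curves below are illustrative placeholders -
# substitute the per-threshold values from your own evaluation.
import numpy as np

thresholds = np.linspace(0.0, 1.0, 101)
precision = np.linspace(0.5, 1.0, 101)  # precision typically rises with threshold
recall = np.linspace(1.0, 0.0, 101)     # recall typically falls with threshold

f1 = 2 * precision * recall / np.clip(precision + recall, 1e-9, None)
print(f"F1-optimal confidence threshold = {thresholds[np.argmax(f1)]:.4f}")
```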

You can find more information about Supervisely model benchmark here.
Deploying YOLO26 as a REST API Service
Now that we have a custom YOLO26 checkpoint, we can use the Serve YOLO v8-26 app to deploy YOLO26 models as a REST API service.
Step 1: Launch the Serve App
Run Serve YOLO v8-26 from the Ecosystem, choosing the target agent (GPU or CPU) that will host the model service.

Step 2: Select Model and Runtime
Choose either a model pretrained on the COCO dataset

or a custom checkpoint fine-tuned on one of your own datasets.

Pick the runtime engine:
PyTorch - the classic runtime for ML models
ONNXRuntime - acts like a universal translator for ML models, useful if you want framework-agnostic deployment
TensorRT - high-performance inference runtime which optimizes models specifically for NVIDIA GPUs
Step 3: Deploy Selected Model
Click Serve and wait for the service status to become running in the app UI.
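Once the service is running, you can query it from Python. Here is a minimal sketch, assuming the Supervisely SDK's inference Session API; the task_id below is a hypothetical id of your running Serve session.

```python
# Sketch: querying the deployed YOLO26 model via the Supervisely SDK.
import supervisely as sly

api = sly.Api()  # reads SERVER_ADDRESS and API_TOKEN from the environment
session = sly.nn.inference.Session(api, task_id=12345)  # hypothetical session id
prediction = session.inference_image_path("image.jpg")
print(prediction)  # predicted objects in Supervisely annotation format
```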

Using Trained YOLO26 Model Inside Supervisely
Supervisely Ecosystem provides convenient apps for labeling data with trained neural networks. You can use the Predict app to label your data with both pretrained and custom models.
Step 1: Launch Predict App
Run the Predict app from the Ecosystem.

Step 2: Select Input Data
Select the datasets from the images project to which the model will be applied:

Step 3: Connect to Served Model
Select the app session with your custom YOLO26 model:

Step 4: Select Classes
Select the classes that will be used for prediction; other classes will be ignored:

Step 5: Set Inference Settings
Select inference settings (like confidence threshold) and run Preview:

Step 6: Enter Output Project Name
Enter the output project name and press Run. Once inference finishes, a link to the labeled project will appear in the app UI.

Now you can open the labeled project and check how your trained model performed:

Using Trained YOLO26 Model Outside Supervisely
After you've trained a model in Supervisely, you can download the checkpoint from Team Files and use it as a regular PyTorch model outside the Supervisely Platform.
Quick start:
Set up environment.
Install requirements manually, or use our pre-built docker image from DockerHub.
Clone YOLO repository with model implementation.
Download your checkpoint from Supervisely Platform.
Run inference. Refer to our demo scripts:
Step 1: Set Up Environment
Manual Installation
Using Docker Image (advanced)
We provide a pre-built docker image with all dependencies installed on DockerHub. The image includes installed packages for ONNXRuntime and TensorRT inference.
See our Dockerfile for more details.
Docker image already includes the source code.
Step 2: Prepare Checkpoint and Model Files
Go to Team Files in Supervisely Platform and download the files.

For YOLO, you need to download only the checkpoint file.
For PyTorch inference: models can be found in the checkpoints folder in Team Files after training.
For ONNXRuntime and TensorRT inference: models can be found in the export folder in Team Files after training. If you don't see the export folder, please ensure that the model was exported to ONNX or TensorRT format during training.
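You can also download the checkpoint programmatically with the Supervisely SDK; the team id and remote path below are hypothetical, so copy the real path from Team Files.

```python
# Sketch: downloading a checkpoint from Team Files via the Supervisely SDK.
import supervisely as sly

api = sly.Api()  # reads SERVER_ADDRESS and API_TOKEN from the environment
team_id = 123    # hypothetical team id
remote_path = "/experiments/<your_experiment_name>/checkpoints/best.pt"  # hypothetical
api.file.download(team_id, remote_path, "best.pt")
```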
Step 3: Run Inference
Here are demo scripts to run inference with your checkpoints in PyTorch, ONNX or TensorRT runtimes:
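As a minimal example of what such a script boils down to (assuming the checkpoint loads through the Ultralytics API; file names are placeholders):

```python
# Sketch: standalone inference with a checkpoint downloaded from Team Files.
from ultralytics import YOLO

model = YOLO("best.pt")        # PyTorch checkpoint
# model = YOLO("best.onnx")    # ONNX export (runs via ONNXRuntime)
# model = YOLO("best.engine")  # TensorRT export (requires an NVIDIA GPU)

results = model.predict("image.jpg", conf=0.25)
results[0].show()  # visualize predictions
```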