Supervisely
Yolo


Last updated 3 months ago


Overview

This converter imports images with annotations in YOLO format for segmentation, detection, and pose estimation tasks.

Each image should have a corresponding .txt file with the same name, which contains information about objects in the image.

  • Segmentation labels will be converted to polygons. Labels format: <class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>

  • Detection labels will be converted to rectangles. Labels format: <class-index> <x_center> <y_center> <width> <height>

  • Pose estimation labels will be converted to keypoints. Labels format: <class-index> <x> <y> <width> <height> <px1> <py1> <px2> <py2> ... <pxn> <pyn> for Dim=2 and <class-index> <x> <y> <width> <height> <px1> <py1> <p1-visibility> <px2> <py2> <p2-visibility> ... <pxn> <pyn> <pn-visibility> for Dim=3.

YOLO format data should have a specific configuration file that contains information about classes and datasets, usually named data_config.yaml.

โš ๏ธ Note: If the input data does not contain data_config.yaml file, it will use default COCO class names.

Enterprise users have access to the "Import as links" option, which supports importing this format with annotations. This option imports data into the Supervisely platform without re-uploading it, which keeps a single source of data and speeds up the import process.

To speed up the import even further, you can compress all annotation files (.txt) into an archive and import it together with the images. (Note: This method is format-dependent and may not apply to all formats.)
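As a rough illustration of the archiving step, the sketch below packs every .txt file from a labels folder into a single zip archive using only the Python standard library. The function name `zip_label_files`, the choice of zip as the archive type, and the layout inside the archive (paths relative to the labels folder) are assumptions made for this example; the text above does not specify what the importer expects.

```python
import zipfile
from pathlib import Path


def zip_label_files(labels_dir: str, archive_path: str) -> int:
    """Pack every .txt file under labels_dir into a zip archive.

    Paths inside the archive are kept relative to labels_dir
    (for example train/IMG_0748.txt). Returns the number of files written.
    """
    root = Path(labels_dir)
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for txt in sorted(root.rglob("*.txt")):
            # Hypothetical layout: keep the train/val subfolders, drop the parent path.
            zf.write(txt, arcname=txt.relative_to(root).as_posix())
            count += 1
    return count
```

Calling `zip_label_files("project name/labels", "labels.zip")` would then produce one archive to upload alongside the images, assuming the archive is written outside the labels folder.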

Default COCO class names
names:
  [
    "person",
    "bicycle",
    "car",
    "motorcycle",
    "airplane",
    "bus",
    "train",
    "truck",
    "boat",
    "traffic light",
    "fire hydrant",
    "stop sign",
    "parking meter",
    "bench",
    "bird",
    "cat",
    "dog",
    "horse",
    "sheep",
    "cow",
    "elephant",
    "bear",
    "zebra",
    "giraffe",
    "backpack",
    "umbrella",
    "handbag",
    "tie",
    "suitcase",
    "frisbee",
    "skis",
    "snowboard",
    "sports ball",
    "kite",
    "baseball bat",
    "baseball glove",
    "skateboard",
    "surfboard",
    "tennis racket",
    "bottle",
    "wine glass",
    "cup",
    "fork",
    "knife",
    "spoon",
    "bowl",
    "banana",
    "apple",
    "sandwich",
    "orange",
    "broccoli",
    "carrot",
    "hot dog",
    "pizza",
    "donut",
    "cake",
    "chair",
    "couch",
    "potted plant",
    "bed",
    "dining table",
    "toilet",
    "tv",
    "laptop",
    "mouse",
    "remote",
    "keyboard",
    "cell phone",
    "microwave",
    "oven",
    "toaster",
    "sink",
    "refrigerator",
    "book",
    "clock",
    "vase",
    "scissors",
    "teddy bear",
    "hair drier",
    "toothbrush",
  ]

Format description

Supported image formats: .jpg, .jpeg, .mpo, .bmp, .png, .webp, .tiff, .tif, .jfif, .avif, .heic, and .heif

With annotations: Yes

Supported annotation file extension: .txt

Grouped by: Any structure (will be uploaded as a single dataset)

Input files structure

Example data: download ⬇️

Recommended directory structure:

  📂project name
   ┣ 📂images
   ┃  ┣ 📂train
   ┃  ┃  ┣ 🖼️IMG_0748.jpeg
   ┃  ┃  ┣ 🖼️IMG_1836.jpeg
   ┃  ┃  ┣ 🖼️IMG_2084.jpeg
   ┃  ┃  ┗ 🖼️IMG_3861.jpeg
   ┃  ┗ 📂val
   ┃     ┣ 🖼️IMG_4451.jpeg
   ┃     ┗ 🖼️IMG_8144.jpeg
   ┣ 📂labels
   ┃  ┣ 📂train
   ┃  ┃  ┣ 📜IMG_0748.txt
   ┃  ┃  ┣ 📜IMG_1836.txt
   ┃  ┃  ┣ 📜IMG_2084.txt
   ┃  ┃  ┗ 📜IMG_3861.txt
   ┃  ┗ 📂val
   ┃     ┣ 📜IMG_4451.txt
   ┃     ┗ 📜IMG_8144.txt
   ┗ 📜data_config.yaml
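Since each image should have a .txt file with the same name, it can help to check the folder before importing. The following is a minimal sketch, assuming the recommended images/ and labels/ layout shown above; the helper name `find_unlabeled_images` is hypothetical and is not part of the Supervisely SDK.

```python
from pathlib import Path
from typing import List

# Image extensions listed in the format description.
IMAGE_EXTS = {
    ".jpg", ".jpeg", ".mpo", ".bmp", ".png", ".webp",
    ".tiff", ".tif", ".jfif", ".avif", ".heic", ".heif",
}


def find_unlabeled_images(project_dir: str) -> List[str]:
    """Return images under images/ that have no matching file under labels/."""
    root = Path(project_dir)
    images_root = root / "images"
    labels_root = root / "labels"
    missing = []
    for img in sorted(images_root.rglob("*")):
        if img.is_file() and img.suffix.lower() in IMAGE_EXTS:
            # images/train/IMG_0748.jpeg -> labels/train/IMG_0748.txt
            expected = labels_root / img.relative_to(images_root).with_suffix(".txt")
            if not expected.is_file():
                missing.append(img.relative_to(root).as_posix())
    return missing
```

An empty result means every image found has a label file at the mirrored path.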

Format Config File

File data_config.yaml should contain the following keys:

  • names - a list of class names

  • colors - a list of class colors in RGB format

  • nc - the number of classes

  • train - the path to the train images

  • val - the path to the validation images

📜data_config.yaml
names: [kiwi, lemon] # class names
colors: [[255, 1, 1], [1, 255, 1]] # class colors
nc: 2 # number of classes
train: ../lemons/images/train # path to train imgs (or "images/train")
val: ../lemons/images/val # path to val imgs (or "images/val")

# Keypoints (for pose estimation)
kpt_shape: [17, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
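The keys above imply a few consistency rules (for example, nc should equal the number of names). The sketch below checks an already-parsed config, represented here as a plain Python dict. The function `check_data_config` and its rules are assumptions derived from the key descriptions, not the importer's actual validation logic.

```python
from typing import List


def check_data_config(cfg: dict) -> List[str]:
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    names = cfg.get("names") or []
    if not names:
        problems.append("names is missing or empty")
    if "nc" in cfg and cfg["nc"] != len(names):
        problems.append(f"nc is {cfg['nc']} but names has {len(names)} entries")
    colors = cfg.get("colors")
    if colors is not None:
        if len(colors) != len(names):
            problems.append("colors and names have different lengths")
        for i, color in enumerate(colors):
            if len(color) != 3 or any(not 0 <= v <= 255 for v in color):
                problems.append(f"colors[{i}] is not an RGB triple in 0..255")
    kpt_shape = cfg.get("kpt_shape")
    if kpt_shape is not None and (len(kpt_shape) != 2 or kpt_shape[1] not in (2, 3)):
        problems.append("kpt_shape must be [num_keypoints, 2 or 3]")
    return problems


# The same values as the sample data_config.yaml, written as a dict.
example_cfg = {
    "names": ["kiwi", "lemon"],
    "colors": [[255, 1, 1], [1, 255, 1]],
    "nc": 2,
    "train": "../lemons/images/train",
    "val": "../lemons/images/val",
    "kpt_shape": [17, 3],
}
print(check_data_config(example_cfg))  # an empty list means no problems were found
```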

Single-Image Annotation

Annotation files are in .txt format and should contain one object label per line:

  • Each label starts with a class index that corresponds to a class name in the data_config.yaml file.

  • Label coordinates must be in normalized format (from 0 to 1).

1. Segmentation:

Labels should be formatted with one row per object:

<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>
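To make the row layout concrete, here is a small sketch that parses one segmentation row and scales the normalized points to pixel coordinates. The function name `parse_segmentation_line` and the minimum of three points per polygon are assumptions for this example.

```python
from typing import List, Tuple


def parse_segmentation_line(
    line: str, img_width: int, img_height: int
) -> Tuple[int, List[Tuple[float, float]]]:
    """Parse '<class-index> <x1> <y1> ... <xn> <yn>' into (class index, pixel points)."""
    parts = line.split()
    class_index = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    # Assumption: a polygon needs at least three (x, y) pairs.
    if len(coords) < 6 or len(coords) % 2 != 0:
        raise ValueError("expected an even number of coordinates, at least 6")
    # x values are scaled by image width, y values by image height.
    points = [(x * img_width, y * img_height) for x, y in zip(coords[0::2], coords[1::2])]
    return class_index, points
```

For a 400x200 image, the row `1 0.25 0.25 0.75 0.25 0.5 0.75` describes a triangle for class 1 with pixel vertices (100, 50), (300, 50), and (200, 150).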

2. Detection:

Labels should be formatted with one row per object:

<class-index> <x_center> <y_center> <width> <height>

If your boxes are in pixels, you should divide x_center and width by image width, and y_center and height by image height.
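The division described above can be sketched as follows, assuming the pixel box is given by its corners (left, top, right, bottom). The function name `pixel_box_to_yolo` is hypothetical.

```python
from typing import Tuple


def pixel_box_to_yolo(
    left: float, top: float, right: float, bottom: float,
    img_width: int, img_height: int,
) -> Tuple[float, float, float, float]:
    """Convert pixel corners to normalized (x_center, y_center, width, height)."""
    x_center = (left + right) / 2 / img_width    # x values are divided by image width
    y_center = (top + bottom) / 2 / img_height   # y values are divided by image height
    width = (right - left) / img_width
    height = (bottom - top) / img_height
    return x_center, y_center, width, height
```

For example, a box with corners (100, 50) and (300, 150) in a 400x200 image becomes `0.5 0.5 0.5 0.5` after the class index.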

3. Pose Estimation:

Labels should be formatted with one row per object.

For Dim=2:

<class-index> <x> <y> <width> <height> <px1> <py1> <px2> <py2> ... <pxn> <pyn>

For Dim=3:

<class-index> <x> <y> <width> <height> <px1> <py1> <p1-visibility> <px2> <py2> <p2-visibility> ... <pxn> <pyn> <pn-visibility>
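The two layouts differ only in how many values each keypoint carries, so one parser can handle both when given the kpt_shape values from the config. The sketch below is an illustration under that assumption. `parse_pose_line` is a hypothetical helper, and it returns the visibility value as-is without interpreting it.

```python
from typing import List, Tuple


def parse_pose_line(
    line: str, num_keypoints: int, dims: int
) -> Tuple[int, Tuple[float, ...], List[Tuple[float, ...]]]:
    """Parse a pose row into (class index, box, keypoints).

    dims is 2 for (x, y) keypoints or 3 for (x, y, visibility),
    matching the second value of kpt_shape.
    """
    if dims not in (2, 3):
        raise ValueError("dims must be 2 or 3")
    parts = line.split()
    expected = 5 + num_keypoints * dims  # class index + 4 box values + keypoints
    if len(parts) != expected:
        raise ValueError(f"expected {expected} values, got {len(parts)}")
    class_index = int(parts[0])
    box = tuple(float(v) for v in parts[1:5])  # x, y, width, height
    values = [float(v) for v in parts[5:]]
    keypoints = [tuple(values[i:i + dims]) for i in range(0, len(values), dims)]
    return class_index, box, keypoints
```

With `kpt_shape: [17, 3]`, a valid row therefore has 5 + 17 * 3 = 56 values.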

Yolo coordinates explanation:

The label file for the example image contains 2 persons (class 0) and a tie (class 27) from the default COCO classes.

📜zidan.txt:

0 0.481719 0.634028 0.690625 0.713278
0 0.741094 0.524306 0.314750 0.933389
27 0.364844 0.795833 0.078125 0.400000
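To see what these rows mean in pixels, the sketch below converts each normalized box back to corner coordinates. The image size is not given in the text, so the 1280x720 value used here is only a placeholder assumption; `yolo_box_to_pixels` is a hypothetical helper.

```python
def yolo_box_to_pixels(x_center, y_center, width, height, img_width, img_height):
    """Convert a normalized YOLO box to pixel corners (left, top, right, bottom)."""
    box_w = width * img_width
    box_h = height * img_height
    left = x_center * img_width - box_w / 2
    top = y_center * img_height - box_h / 2
    return left, top, left + box_w, top + box_h


# Assumption: the real image size is not stated; replace with the actual values.
IMG_WIDTH, IMG_HEIGHT = 1280, 720

LABEL_LINES = """\
0 0.481719 0.634028 0.690625 0.713278
0 0.741094 0.524306 0.314750 0.933389
27 0.364844 0.795833 0.078125 0.400000
"""

boxes = []
for line in LABEL_LINES.splitlines():
    fields = line.split()
    class_index = int(fields[0])
    corners = yolo_box_to_pixels(*map(float, fields[1:5]), IMG_WIDTH, IMG_HEIGHT)
    boxes.append((class_index, corners))

for class_index, corners in boxes:
    print(class_index, [round(v, 1) for v in corners])
```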

Useful links

  • YOLO format

  • [Supervisely Ecosystem] Convert YOLO v5 to Supervisely format
