# SemanticKITTI

## What is the SemanticKITTI Format?

{% hint style="success" %}
Easily import your **LiDAR point cloud sequences** and **3D semantic annotations** using the SemanticKITTI format into Supervisely.
{% endhint %}

The **SemanticKITTI** format is a widely used standard designed for semantic scene understanding and **autonomous driving** applications. It provides dense point-wise annotations for 3D point cloud episodes, enabling advanced machine learning tasks like **semantic segmentation**, **instance segmentation**, panoptic segmentation, and 3D scene completion. The format supports processing continuous tracking sequences with diverse semantic classes, covering vehicles, pedestrians, buildings, vegetation, road surfaces, and other urban environment objects.

## Input Files Structure

{% hint style="success" %}
[Download sample dataset](https://github.com/supervisely-ecosystem/demo-semantic-kitti-pointcloud-episodes-annotated/releases/download/v1.0.0/project-example.zip) in SemanticKITTI format (15 MB)
{% endhint %}

**Format directory structure:**

```
📦SemanticKITTI
├──📂sequences
│   ├──📂00
│   │   ├──📂velodyne
│   │   │   ├──📄000000.bin
│   │   │   ├──📄000001.bin
│   │   │   ├──📄000002.bin
│   │   │   └──📄...
│   │   ├──📂labels
│   │   │   ├──📄000000.label
│   │   │   ├──📄000001.label
│   │   │   ├──📄000002.label
│   │   │   └──📄...
│   │   ├──📄calib.txt
│   │   ├──📄poses.txt
│   │   └──📄times.txt
│   ├──📂01
│   │   ├──📂velodyne
│   │   │   └──📄...
│   │   ├──📂labels
│   │   │   └──📄...
│   │   ├──📄calib.txt
│   │   ├──📄poses.txt
│   │   └──📄times.txt
│   └──📂...
```

**The SemanticKITTI structure is organized as follows:**

* `sequences/` - contains numbered sequence folders
  * `XX/` - sequence folder (e.g., 00, 01, 02...)
    * `velodyne/` - contains LiDAR point cloud files in binary format
    * `labels/` - contains semantic and instance labels for each scan
    * `calib.txt` - calibration file containing projection matrices
    * `poses.txt` - camera poses for each scan
    * `times.txt` - timestamps for each scan

## SemanticKITTI Annotation format

### Point Cloud Files

Filename: `NNNNNN.bin`

Point cloud files are stored in binary format with `.bin` extension in `velodyne` folder. Each file contains a list of 3D points with intensity values.

**Format:** Each point is represented by 4 float32 values:

* `x` - X coordinate (float32)
* `y` - Y coordinate (float32)
* `z` - Z coordinate (float32)
* `intensity` - Reflectance value (float32)

### Label Files

Filename: `NNNNNN.label`

The label files are stored in binary format with the `.label` extension in `labels` folder. Each label file corresponds to a single point cloud scan and contains semantic and instance annotations for each point.

**Format:** Each label is a 32-bit unsigned integer (`uint32_t`) encoding both semantic class and instance ID:

* **Lower 16 bits** - semantic label (class ID)
* **Upper 16 bits** - instance ID (temporally consistent across the sequence)

The instance IDs are consistent over the whole sequence, meaning the same object in different scans gets the same ID. This applies to both moving and static objects.

### Calibration File

Filename: `calib.txt`

The calibration file contains projection matrices for transforming between coordinate systems. It includes:

* `P0`, `P1`, `P2`, `P3` - Camera projection matrices (3x4)
* `Tr` - Transformation matrix from Velodyne to camera coordinates (3x4 or 4x4)

### Poses File

Filename: `poses.txt`

The poses file contains the camera pose (transformation from camera coordinates to world coordinates) for each scan in the sequence. Each line represents a pose as a 3x4 transformation matrix (flattened to 12 values).

**Format:** Each line contains 12 float values representing the first three rows of a 4x4 transformation matrix (the last row is \[0, 0, 0, 1]).

### Times File

Filename: `times.txt`

The times file contains timestamps for each scan in the sequence. Each line contains a single float value representing the timestamp in seconds.

## Export to SemanticKITTI Format

You can export your labeled point cloud episodes data to SemanticKITTI format using the [Export to SemanticKITTI](https://ecosystem.supervisely.com/apps/export-to-semantic-kitti) application from the Supervisely Ecosystem.

## License

The SemanticKITTI dataset is distributed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license. You are free to share and adapt the data, but you must give appropriate credit and may not use the work for commercial purposes.

When using the SemanticKITTI dataset, please cite:

```bibtex
@inproceedings{behley2019iccv,
  author = {J. Behley and M. Garbade and A. Milioto and J. Quenzel and S. Behnke and C. Stachniss and J. Gall},
  title = {{SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences}},
  booktitle = {Proc. of the IEEE/CVF International Conf.~on Computer Vision (ICCV)},
  year = {2019}
}
```

And the original KITTI Vision Benchmark:

```bibtex
@inproceedings{geiger2012cvpr,
  author = {A. Geiger and P. Lenz and R. Urtasun},
  title = {{Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite}},
  booktitle = {Proc.~of the IEEE Conf.~on Computer Vision and Pattern Recognition (CVPR)},
  pages = {3354--3361},
  year = {2012}
}
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.supervisely.com/import-and-export/import/supported-annotation-formats/pointcloud_episodes/semantic-kitti.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.