Data Folder

Your Supervisely installation uses the DATA_PATH value to determine where its persistent data is stored. By default, this value is set to /supervisely/data. This guide describes the data found inside this folder, the drive requirements for each subfolder, and which subfolders can be cleaned.

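| Folder      | Avg. size       | Fast drive   | Can be safely cleaned |
|-------------|-----------------|--------------|-----------------------|
| db          | 2-10 GB+        | required     | No                    |
| logs        | 10 MB - 4 GB+   | not required | Yes                   |
| net-server  | 1 MB            | not required | Almost                |
| proxy_cache | 100 MB - 10 GB+ | preferable   | Yes                   |
| rabbitmq    | 100 MB - 2 GB   | preferable   | Almost                |
| redis       | 10 MB - 2 GB    | not required | Almost                |
| redis-json  | 10 MB - 1 GB    | not required | Almost                |
| storage     | 10 GB - 100 GB+ | not required | No                    |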

.
├── db
├── logs
├── net-server
├── proxy_cache
├── rabbitmq
├── redis
├── redis-json
└── storage
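
To compare an installation against the average sizes listed above, the subfolders can be measured directly. The following is a minimal sketch, not an official Supervisely tool: the function names are hypothetical, the subfolder names come from the listing above, and the default path is the documented DATA_PATH default. Some subfolders (for example db) may only be readable with elevated permissions, in which case unreadable entries are skipped and the reported size will be too low.

```python
import os
from pathlib import Path

# Subfolder names taken from the directory listing in this guide.
SUBFOLDERS = [
    "db", "logs", "net-server", "proxy_cache",
    "rabbitmq", "redis", "redis-json", "storage",
]


def folder_size_bytes(path: Path) -> int:
    """Sum the sizes of all regular files under `path`.

    Symlinked files and symlinked subdirectories are not followed, but if
    `path` itself is a symlink (e.g. storage on a network share), its
    target is walked.
    """
    total = 0
    for root, _dirs, files in os.walk(path, followlinks=False):
        for name in files:
            file_path = Path(root) / name
            try:
                if not file_path.is_symlink():
                    total += file_path.stat().st_size
            except OSError:
                # Files can disappear while services are running; skip them.
                continue
    return total


def report_usage(data_path: str = "/supervisely/data") -> dict[str, int]:
    """Return {subfolder: size in bytes} for each subfolder that exists."""
    base = Path(data_path)
    return {
        name: folder_size_bytes(base / name)
        for name in SUBFOLDERS
        if (base / name).is_dir()
    }
```

Calling `report_usage()` with no arguments inspects the default location; pass a different path if DATA_PATH was changed.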

Never point DATA_PATH at a network share (NFS/SMB/ESB/etc.), because doing so degrades performance significantly. Instead, symlink each subfolder that doesn't require a fast drive to the network share. In most cases that is only the "storage" folder.
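
The symlink approach can be sketched as follows. This is an illustration under stated assumptions, not a documented Supervisely procedure: the function name and the example paths are hypothetical, and it assumes the platform is stopped while the folder is moved so that no service writes to it mid-copy.

```python
import os
import shutil
from pathlib import Path


def relocate_to_share(data_path: str, subfolder: str, share_root: str) -> Path:
    """Move data_path/subfolder to share_root/subfolder and leave a symlink behind.

    `share_root` must already exist (e.g. a mounted network share).
    Returns the new location of the folder.
    """
    source = Path(data_path) / subfolder
    # Resolve to an absolute path so the symlink does not depend on the cwd.
    target = Path(share_root).resolve() / subfolder

    if source.is_symlink():
        raise RuntimeError(f"{source} is already a symlink")
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    if target.exists():
        raise FileExistsError(f"{target} already exists")

    # shutil.move renames when possible and otherwise copies the tree
    # to the other filesystem before removing the original.
    shutil.move(str(source), str(target))
    os.symlink(target, source, target_is_directory=True)
    return target
```

For example, `relocate_to_share("/supervisely/data", "storage", "/mnt/share")` would move the storage folder, where `/mnt/share` is a placeholder for the actual mount point. The db folder should stay on a local fast drive.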

db

This subfolder is used by the PostgreSQL relational database. This is the primary database Supervisely uses to store your annotations, users, dataset structures, and so on. The contents of this folder are shared with the postgres Docker container. The size of the database usually does not exceed 10 GB.

It's advised to store this folder on a fast SSD drive. If you store it on a slow HDD drive, you may experience performance issues.

This database does not store your actual images or videos, only URLs or file hashes.

Fast drive: required for the best performance
Can be safely cleaned: No, you will lose all your annotations and projects.

logs

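This subfolder is used by the log parsing and transformation service (the vector Docker container). Vector writes the logs into the logs subfolder in Zstandard-compressed JSON Lines format. Logs can be obtained by running the sudo supervisely troubleshoot command.

By default, we do not clean this folder automatically.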

Fast drive: optional, doesn't affect the performance
Can be safely cleaned: Yes
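
Because this folder is not cleaned automatically and is safe to clean, a periodic cleanup by file age is one option. The sketch below is an assumption-laden example rather than a Supervisely feature: the function name and the retention period are hypothetical, and since this guide does not describe the file layout inside logs, every regular file is treated the same and judged only by its modification time.

```python
import time
from pathlib import Path


def remove_old_logs(logs_dir: str, max_age_days: float, dry_run: bool = True) -> list[Path]:
    """Delete regular files under `logs_dir` last modified more than `max_age_days` ago.

    With dry_run=True (the default) nothing is deleted; the function only
    returns the files that would be removed.
    """
    cutoff = time.time() - max_age_days * 86400  # seconds per day
    selected = []
    for path in sorted(Path(logs_dir).rglob("*")):
        # Skip directories and symlinks; only plain files are candidates.
        if path.is_file() and not path.is_symlink() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            selected.append(path)
    return selected
```

Running `remove_old_logs("/supervisely/data/logs", 30)` first and reviewing the returned list before repeating the call with `dry_run=False` keeps the cleanup reversible up to the last step.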

proxy_cache

This subfolder is used by Nginx to cache certain resources for fast access to frequently used assets, mainly small previews of images and video frames. The size of this folder can be configured via the CACHE_STORAGE_SIZE setting.

Fast drive: preferred, but not required
Can be safely cleaned: Yes

rabbitmq

This subfolder is used by the RabbitMQ message broker. It is temporary storage for queued tasks. If you clean this folder, running tasks will be stopped and may end up in an invalid state.

Fast drive: preferred, but not required
Can be safely cleaned: Almost

redis & redis-json

These subfolders are used by the Redis cache database. It stores temporary data that is also available in the main database (PostgreSQL) but is duplicated for fast access. For example, users' online status is cached there. If you clean these folders, some minor information, such as real-time logs, can be lost.

Fast drive: optional, doesn't affect the performance
Can be safely cleaned: Almost

storage

This subfolder is used by various services to store permanent files, such as images and other assets.

Some examples:

  • Images
  • Videos
  • Point cloud files
  • Model checkpoints
  • Application posters
  • Jupyter notebooks
  • Task data

Usually, we generate a unique file name or use the file hash instead of the original file name.

Inside this folder you will find two subfolders, *-public and *-private. These names are legacy and do not reflect the actual privacy of the contents: both folders are completely private and not publicly accessible.

You can configure Supervisely to use remote storage, such as S3, instead of this folder. In this case this folder will be empty, and the actual files will be stored as blob objects in the selected cloud.

Fast drive: completely optional, required in very rare cases
Can be safely cleaned: No, you will lose all your images, videos, and other assets.

Last updated 1 year ago