Supervisely Blob
Overview
When dealing with large quantities of small images (e.g., thousands of images under 100KB each), importing them individually is inefficient. The blob approach combines multiple images into a single archive file, making transfer and storage more efficient.
Annotations with Blob format
The key advantage of the blob format is that it optimizes storage and transfer of image data without changing how annotations work. When using blob format:
Annotations remain in the standard Supervisely JSON format exactly as described in the Supervisely JSON documentation
Each annotation file still corresponds to a specific image by name
All annotation features (rectangles, polygons, masks, etc.) work exactly the same way
The only difference is how the image data itself is stored and accessed
What is a Blob File?
A blob file in Supervisely is essentially a .tar archive that contains multiple images bundled together. Instead of storing and transferring each image as a separate file, these images are packed into a single large file (the blob).
This approach:
Reduces the number of network requests needed for transfers
Minimizes filesystem overhead when dealing with many small files
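As a rough illustration of the bundling idea, the sketch below packs a list of image files into one archive using only Python's standard tarfile module. The helper name pack_blob is hypothetical and not part of the Supervisely SDK, and the sketch assumes an uncompressed tar, which keeps each image's bytes contiguous inside the archive.

```python
import tarfile
from pathlib import Path


def pack_blob(image_paths, blob_path):
    """Bundle the given files into a single uncompressed .tar archive.

    Hypothetical helper for illustration; not a Supervisely SDK function.
    """
    # Mode "w" writes a plain tar without compression, so every member's
    # bytes are stored unchanged and contiguously inside the archive.
    with tarfile.open(blob_path, "w") as tar:
        for path in image_paths:
            path = Path(path)
            # Store only the file name so members are addressed by image name.
            tar.add(path, arcname=path.name)
    return blob_path
```

Storing members by bare file name mirrors the idea that each annotation still refers to its image by name; a real project may use a different naming scheme.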
What is an Offset File?
An offset file (.pkl) is a companion file to the blob archive that contains metadata about where each image is located within the blob file.
Specifically:
It maps each image filename to its exact byte position (start and end offsets) in the blob file
It allows direct extraction of specific images without scanning the entire archive
It is stored as a Python pickle file containing batches of dictionaries, with image names as keys and offset positions as values
These two files work together to provide efficient storage and random access to large collections of small images.
Benefits include:
Faster import and export speeds
Reduced server load
More efficient storage on disk
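The interplay between the blob and its offset file can be sketched with the standard library alone. The function names, the (start, end) tuple layout with an exclusive end, and the batch size below are assumptions made for illustration; the actual pickle layout written by the Supervisely SDK may differ.

```python
import pickle
import tarfile


def build_offsets(blob_path):
    """Map each member name to its (start, end) byte range in the archive."""
    offsets = {}
    # "r:" opens the archive for reading and requires it to be uncompressed.
    with tarfile.open(blob_path, "r:") as tar:
        for member in tar:
            if member.isfile():
                start = member.offset_data  # first byte of the file's data
                offsets[member.name] = (start, start + member.size)
    return offsets


def save_offsets(offsets, offsets_path, batch_size=10000):
    """Write the mapping as consecutive pickled dict batches (assumed layout)."""
    items = list(offsets.items())
    with open(offsets_path, "wb") as f:
        for i in range(0, len(items), batch_size):
            pickle.dump(dict(items[i:i + batch_size]), f)


def load_offsets(offsets_path):
    """Read every pickled batch back and merge them into one dict."""
    merged = {}
    with open(offsets_path, "rb") as f:
        while True:
            try:
                merged.update(pickle.load(f))
            except EOFError:  # no more batches in the file
                break
    return merged


def read_image(blob_path, offsets, name):
    """Return one image's bytes by seeking straight to its offsets."""
    start, end = offsets[name]
    with open(blob_path, "rb") as f:
        f.seek(start)
        return f.read(end - start)
```

Because read_image seeks directly to the recorded position, it never parses the tar headers of other members, which is what makes random access cheap even for very large archives.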
Offset Representation
The BlobImageInfo class of the Supervisely Python SDK represents image metadata within a blob storage file. It contains information about where the image data is located in the blob file, defined by byte offsets. This class provides methods to manipulate and convert blob image information to formats suitable for storage and API interactions.
Once blob files are uploaded to Team Files, you can reuse them for multiple projects without re-uploading the images.
This speeds up imports for every project after the first: by creating and uploading different offset files, you can import different subsets of images from the same blob archive.
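The subset idea can be illustrated with a plain dictionary of offsets. The sketch below does not use BlobImageInfo or any SDK call; write_subset_offsets is a hypothetical helper, and the single-dictionary pickle it writes is an assumed layout chosen for brevity.

```python
import pickle


def write_subset_offsets(all_offsets, wanted_names, subset_path):
    """Write an offset file covering only the selected images.

    all_offsets maps image name -> (start, end) byte offsets in the blob.
    The blob archive itself is not touched or copied.
    """
    # A missing name raises KeyError, which surfaces typos early.
    subset = {name: all_offsets[name] for name in wanted_names}
    with open(subset_path, "wb") as f:
        pickle.dump(subset, f)
    return subset
```

Two projects could then share one uploaded archive while each ships its own small offset file listing only the images it needs.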
Recommended Project Structure
For the layout of a typical blob-based project, refer to the extended Project Structure documentation.
Performance Comparison Information
A blob project with 30,000 small images (~4 KB each) can be:
Uploaded ~2x faster than standard uploads, and ~14x faster when using fast methods (coming soon in apps)
Downloaded ~4x faster than standard downloads, and ~22x faster when using fast methods