Documentation

NOTE: The public zympy module is scheduled for initial release on August 1, 2025

Installation

Linux

With Anaconda

We recommend using Anaconda or another virtual environment manager to handle dependencies. To install Conda, see here.

After installing conda, create and activate an environment using:

conda create -n <name of your environment> python=3.11
conda activate <name of your environment>

With your environment active [you'll see (<name of your environment>) at the left of your terminal prompt], install the package with pip:

pip install zympy

Getting Started

Directory Structure

Zympy datasets have three primary folders: images, labels, and meta. Each data instance is identified using universally unique identifier (UUID) values; in our case these are 8-character values of mixed letters and digits, e.g. aA76li11-u163t8F0. Every instance is identified by the UUID of the dataset it was generated with (the first 8 characters), followed by the individual instance UUID (the final 8 characters), separated by a '-' (hyphen).

1a36ecdd <-- The Dataset UUID
├── images
│   ├── 1a36ecdd-3df894c1.png <-- The Instance UUID
│   ├── 1a36ecdd-98dddb4c.png
│   └── 1a36ecdd-381a6aac.png
├── labels
│   ├── bounding_box
│   │   ├── 1a36ecdd-3df894c1
│   │   │    ├── 2D
│   │   │    │   └── data.json
│   │   │    └── 3D
│   │   │        └── data.json
│   │   ├── 1a36ecdd-98dddb4c
│   │   └── 1a36ecdd-381a6aac
│   ├── contour
│   │   ├── 1a36ecdd-3df894c1
│   │   │   └── data.json
│   │   ├── 1a36ecdd-98dddb4c
│   │   └── 1a36ecdd-381a6aac
│   ├── pose
│   │   ├── 1a36ecdd-3df894c1
│   │   │   └── data.json
│   │   ├── 1a36ecdd-98dddb4c
│   │   └── 1a36ecdd-381a6aac
│   └── segmentation
│       ├── 1a36ecdd-3df894c1
│       │   ├── 1a36ecdd-3df894c1.png
│       │   └── data.json
│       ├── 1a36ecdd-98dddb4c
│       └── 1a36ecdd-381a6aac
└── meta
    ├── 1a36ecdd-3df894c1
    │   └── data.json
    ├── 1a36ecdd-98dddb4c
    └── 1a36ecdd-381a6aac
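Given the naming convention above, the dataset UUID can be recovered from any instance name by splitting on the hyphen. A minimal sketch in plain Python; the split_instance_name helper is illustrative, not part of the zympy API:

```python
def split_instance_name(instance_name: str) -> tuple[str, str]:
    """Split an instance name like '1a36ecdd-3df894c1' into
    (dataset_uuid, instance_uuid). Illustrative helper, not a zympy function."""
    dataset_uuid, instance_uuid = instance_name.split('-', 1)
    return dataset_uuid, instance_uuid

print(split_instance_name('1a36ecdd-3df894c1'))  # ('1a36ecdd', '3df894c1')
```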

Zympy API

The public Python module provides several sub-modules to help you get to training as fast as possible. They are organized as follows:

zympy.zympy_io

Contains helper functions to load data into memory, e.g.:

  • Retrieve all the instance names contained in a dataset
  • Load dataset-level or instance-level meta data
  • Load images by instance name
  • Load labels by instance name

zympy.mask

Contains helper functions to create masks from the labels, each of which is composable, e.g.:

  • Bounding box masks
  • Contour masks
  • Segmentation masks
  • Pose masks

zympy.filter

Contains helper functions to filter the dataset for instances that meet some criteria. This is targeted at enabling curriculum learning in vision model training: for example, you may filter a dataset by % occlusion of a certain object of interest, exposing the network to instances with low or no occlusion early on and gradually increasing the difficulty over multiple epochs. Filter by, e.g.:

  • Object UUID presence within the instance
  • Object position or orientation
  • Camera position or orientation
  • Total lighting energy in the image
  • Object occlusion %
  • ...
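As a sketch of the curriculum-learning idea above, filtering amounts to selecting instance names whose metadata falls under a difficulty threshold, then raising that threshold over epochs. The metadata structure and the occlusion_pct field below are hypothetical placeholders, not the actual zympy schema:

```python
# Hypothetical per-instance metadata; the real schema comes from zympy.zympy_io.
instance_meta = {
    '1a36ecdd-3df894c1': {'occlusion_pct': 5.0},
    '1a36ecdd-98dddb4c': {'occlusion_pct': 40.0},
    '1a36ecdd-381a6aac': {'occlusion_pct': 75.0},
}

def filter_by_max_occlusion(meta: dict, max_occlusion_pct: float) -> list[str]:
    """Return instance names whose occlusion is at or below the threshold."""
    return [name for name, m in meta.items()
            if m['occlusion_pct'] <= max_occlusion_pct]

# Early epochs: easy instances only; later epochs: raise the threshold.
easy = filter_by_max_occlusion(instance_meta, 10.0)
harder = filter_by_max_occlusion(instance_meta, 50.0)
```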

zympy.format

Contains helper functions to convert zympy datasets into common external formats, e.g.:

  • Convert the dataset to YOLO conventions (v5, v8, v11)

zympy.analyze

Contains helper functions to compute statistics about the dataset, e.g.:

  • Object pose distributions
  • Camera pose distributions
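As an illustration of what a pose-distribution statistic looks like, the sketch below computes per-axis mean and standard deviation over a set of translation vectors using plain NumPy. The pose data here is made up for the example; real poses come from the dataset labels:

```python
import numpy as np

# Made-up (x, y, z) translation vectors for three instances.
translations = np.array([
    [0.10, 0.00, 1.20],
    [0.05, 0.10, 1.10],
    [0.15, -0.10, 1.30],
])

# Per-axis summary statistics of the object position distribution.
mean_xyz = translations.mean(axis=0)
std_xyz = translations.std(axis=0)
print('mean:', mean_xyz)
print('std:', std_xyz)
```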

zympy.zympy_io

get_instance_names

Get a list of all instance names in a dataset directory. Returns a list of validated instance names and a list of invalid ones.

get_instance_names(dataset_path: str, verbose: bool = True) -> Tuple[List[str], List[str]]

Parameters

Returns

Example

from zympy.zympy_io import get_instance_names
validated, invalid = get_instance_names('/path/to/dataset', verbose=True)
print(validated)
print(invalid)

load_dataset_meta

Load and validate a DataSetMeta pydantic model from the dataset directory.

load_dataset_meta(dataset_path: str) -> DataSetMeta

Parameters

Returns

Example

from zympy.zympy_io import load_dataset_meta
dataset_meta = load_dataset_meta('/path/to/dataset')
print(dir(dataset_meta)) # print out all attributes and methods

load_instance_meta

Load the metadata JSON for a specific instance or instances from a dataset into a validated pydantic model.

load_instance_meta(instance_name: List[str], dataset_path: str) -> List[InstanceMeta]

Parameters

Returns

Example

from zympy.zympy_io import load_instance_meta
instances = load_instance_meta(['abcd1234-efgh5678'], '/path/to/dataset')
print(dir(instances[0])) # print out all attributes and methods of the first instance

load_instance_image

Load an image for a given instance from the dataset.

load_instance_image(instance_name: str, dataset_path: str) -> np.ndarray

Parameters

Returns

Example

from zympy.zympy_io import load_instance_image
image = load_instance_image('abcd1234-efgh5678', '/path/to/dataset')
print(image.shape)

load_instance_labels

Load all labels associated with the instance from the dataset.

load_instance_labels(instance_name: str, dataset_path: str) -> Dict

Parameters

Returns

Example

from zympy.zympy_io import load_instance_labels
labels = load_instance_labels('abcd1234-efgh5678', '/path/to/dataset')
print(labels)

load_bounding_box_label

Load the bounding box label for a specific instance from the dataset, supporting 2D or 3D boxes.

load_bounding_box_label(instance_name: str, dataset_path: str, box_type: Literal['2D','3D']='2D') -> Dict

Parameters

Returns

Example

from zympy.zympy_io import load_bounding_box_label
bbox = load_bounding_box_label('abcd1234-efgh5678', '/path/to/dataset', box_type='2D')
print(bbox)

load_contour_label

Load the contour label for a specific instance from the dataset.

load_contour_label(instance_name: str, dataset_path: str) -> Dict

Parameters

Returns

Example

from zympy.zympy_io import load_contour_label
contour_label = load_contour_label('abcd1234-efgh5678', '/path/to/dataset')
print(contour_label)

load_pose_label

Load the pose label for a specific instance from a dataset. The first vector in the pose label is the translation vector to the origin of the object; the second vector is the quaternion describing its orientation.

load_pose_label(instance_name: str, dataset_path: str) -> Dict

Parameters

Returns

Example

from zympy.zympy_io import load_pose_label
pose_label = load_pose_label('abcd1234-efgh5678', '/path/to/dataset')
print(pose_label)
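If you need the orientation as a rotation matrix rather than a quaternion, the standard conversion can be sketched as below with plain NumPy. This assumes a unit quaternion in (w, x, y, z) order, which is an assumption on our part; check the convention used by your dataset:

```python
import numpy as np

def quaternion_to_rotation_matrix(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

# The identity quaternion maps to the identity rotation.
R = quaternion_to_rotation_matrix((1.0, 0.0, 0.0, 0.0))
```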

load_segmentation_label

Load the segmentation label for a specific instance from a dataset.

load_segmentation_label(instance_name: str, dataset_path: str) -> Tuple[Dict, np.ndarray]

Parameters

Returns

Example

from zympy.zympy_io import load_segmentation_label
import cv2
seg_map, seg_array = load_segmentation_label('abcd1234-efgh5678', '/path/to/dataset')
print(seg_map)
cv2.imshow('segmentation', seg_array)
cv2.waitKey(0)

zympy.mask

create_empty_rgba

Create an RGBA array populated with `(0, 0, 0, 0)`. Pass a reference image to match its width and height, or specify the dimensions manually.

create_empty_rgba(reference_image: np.ndarray = None, mask_dimensions: Tuple = None) -> np.ndarray

Parameters

Returns

Example

from zympy.zympy_io import load_instance_image
from zympy.mask import create_empty_rgba
image = load_instance_image('abcd1234-efgh5678', '/path/to/dataset')
mask = create_empty_rgba(reference_image=image)
print(mask.shape)

create_bounding_box_2D_mask

Construct a transparent mask with bounding boxes annotated.

create_bounding_box_2D_mask(bounding_boxes: Dict, mask_dimensions: tuple[int, int], active_uuids: set[str] = None, line_thickness: int = DEFAULT_LINE_THICKNESS, color: Tuple = None, color_seed: int = COLOR_SEED) -> np.ndarray

Parameters

Returns

Example

from zympy.zympy_io import load_bounding_box_label
from zympy.mask import create_bounding_box_2D_mask
bounding_boxes = load_bounding_box_label('abcd1234-efgh5678', '/path/to/dataset')
mask = create_bounding_box_2D_mask(bounding_boxes, mask_dimensions=(256, 256))

draw_bounding_box_2D

Draw bounding boxes directly onto an image using normalized coordinates.

draw_bounding_box_2D(image: np.ndarray, bbox_normalized_corners: List[Tuple[float, float, float, float]], color: tuple[int, int, int, int] = None, line_thickness: int = DEFAULT_LINE_THICKNESS) -> np.ndarray

Parameters

Returns

Example

# Draw a box with a random color

from zympy.mask import draw_bounding_box_2D, create_empty_rgba
import cv2
image = create_empty_rgba(mask_dimensions=(400, 400))
drawn = draw_bounding_box_2D(image, [(0.1, 0.1, 0.4, 0.4)])
cv2.imshow('bounding box', drawn)
cv2.waitKey(0)

zympy.filter

zympy.format

to_yolo

Converts a zympy-format dataset to the directory and label structure required by YOLO (v5, v8, v11) for object detection training. Partitions image instances into training/validation sets, generates label files, builds the YOLO directory structure, and writes a `data.yaml` configuration mapping class indices to UUID-style names.

to_yolo(yolo_version: Literal[5, 8, 11], zympy_dataset_path: str, yolo_dataset_path: str, yolo_dataset_name: str = None, instance_names: List[str] = None, train_val_split: float = 0.8, training_instance_names: List[str] = None, validation_instance_names: List[str] = None, label_type: Literal['bounding_box', 'segmentation'] = 'bounding_box', class_list: List[str] = None, suppress_memory_prompts: bool = False) -> bool

Parameters

Returns

Example

from zympy.format import to_yolo

success = to_yolo(
    yolo_version=5,
    zympy_dataset_path='/data/zympy_dataset',
    yolo_dataset_path='/data/yolo_dataset',
    train_val_split=0.8
)
print(success)  # True if successful

Take your computer vision capabilities to new heights