Annotation format¶
The precomputed annotation format defines an annotation collection for a given n-dimensional coordinate space and one of the following four geometry types:
Points (represented by a single position)
Line segments (represented by the two endpoint positions)
Axis-aligned bounding boxes (represented by two positions)
Axis-aligned ellipsoids (represented by a center position and radii vector)
All annotations within the annotation collection have the same geometry type.
Each annotation is defined by:
A unique uint64 annotation id;
Position/radii vectors required by the annotation type;
For each of the relationships specified in the metadata, a list of associated uint64 ids (typically corresponding to segmented objects);
Values for each of the properties specified in the metadata.
An annotation collection is represented as a directory tree consisting of the following:
an info file in JSON format specifying the metadata;
a sub-directory containing the annotations indexed by their unique uint64 annotation ids;
for each relationship, a sub-directory containing the annotations indexed by associated object ids;
a set of sub-directories containing a multi-level spatial index of the annotations.
info metadata file¶
The info file is a JSON-format text file with the following schema:
- json PrecomputedAnnotation : object¶
Precomputed annotation metadata
- Optional members:¶
- dimensions : object¶
Coordinate space over which annotations are defined.
The number of dimensions is called the rank.
- lower_bound : array of number¶
Lower bound (in the coordinate space given by dimensions). Length must match number of dimensions. This is also the origin of the grid used for each spatial index level.
- upper_bound : array of number¶
Upper bound (in the coordinate space given by dimensions). Length must match number of dimensions. All annotation geometry should be contained within the bounding box defined by lower_bound and upper_bound.
- annotation_type : "point" | "line" | "axis_aligned_bounding_box" | "ellipsoid"¶
Annotation geometry type.
- properties : array of object¶
Additional properties associated with each annotation.
- Required members:¶
- id : string¶
Unique identifier for the property.
Must match the regular expression ^[a-z][a-zA-Z0-9_]*$.
- Optional members:¶
- description : string¶
Textual description to show in the UI.
- enum_values : array of number¶
Numeric values corresponding to the labels in enum_labels. Only valid if type is a numeric type (not "rgb" or "rgba").
- enum_labels : array of string¶
Labels corresponding to the values in enum_values. Must be specified if, and only if, enum_values is specified. If specified, the length must match that of enum_values.
- relationships : array of object¶
Related object indices.
For each specified relationship (which usually corresponds to a specific segmentation volume), there is an associated set of uint64 identifiers for each annotation, and the corresponding related object index can be used to query, for a given relationship, the list of annotations that are associated with a given uint64 identifier.
- Required members:¶
- id : string¶
Unique identifier for the relationship (displayed in the UI).
- key : string¶
Relative path to the directory containing the related object index for this relationship.
- Optional members:¶
- sharding : PrecomputedSharding¶
Sharding parameters.
If specified, indicates that the related object index is stored in sharded format.
- by_id : object¶
Parameters of the annotation id index.
- Required members:¶
- key : string¶
Relative path to the annotation id index.
- Optional members:¶
- sharding : PrecomputedSharding¶
Sharding parameters.
If specified, indicates that the annotation id index is stored in sharded format.
- spatial : array of object¶
Spatial index levels, from coarse to fine.
- Required members:¶
- key : string¶
Relative path to the spatial index level.
- grid_shape : array of integer¶
Number of cells along each grid dimension for this spatial index level.
The length must match the number of dimensions.
- chunk_shape : array of integer¶
Number of cells along each grid dimension for this spatial index level.
The length must match the number of dimensions.
- limit : integer [1, +∞)¶
Maximum number of annotations per grid cell in this level of the spatial index.
- Optional members:¶
- chunk_size : array of number¶
Size along each dimension of each grid cell (in the coordinate space given by dimensions). The length must match the number of dimensions.
- sharding : PrecomputedSharding¶
Sharding parameters.
If specified, indicates that the spatial index level is stored in sharded format.
- segment_properties : string¶
Relative path to the directory containing associated segment properties.
Note
This association does not apply transitively when this skeleton dataset itself is referenced via the precomputed volume mesh metadata property. Instead, the associated segment properties must be specified directly in the volume metadata.
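The length, identifier, and enum rules stated in the schema can be checked mechanically. The following is a minimal sketch under stated assumptions, not part of the format: the function name `check_info` and all sample values are hypothetical, and because the contents of each `dimensions` entry are not described in this section, the example uses placeholders and relies only on the number of dimensions.

```python
import re

PROPERTY_ID_RE = re.compile(r"^[a-z][a-zA-Z0-9_]*$")


def check_info(info):
    """Return a list of problems found in a parsed `info` dict (empty if none)."""
    problems = []
    rank = len(info["dimensions"])

    # lower_bound and upper_bound must each have one entry per dimension.
    for name in ("lower_bound", "upper_bound"):
        if len(info[name]) != rank:
            problems.append(f"{name} must have length {rank}")

    for prop in info.get("properties", []):
        if not PROPERTY_ID_RE.match(prop["id"]):
            problems.append(f"invalid property id {prop['id']!r}")
        values = prop.get("enum_values")
        labels = prop.get("enum_labels")
        # enum_labels must be given if, and only if, enum_values is given.
        if (values is None) != (labels is None):
            problems.append(f"{prop['id']}: enum_values and enum_labels go together")
        elif values is not None:
            if len(values) != len(labels):
                problems.append(f"{prop['id']}: enum_labels length mismatch")
            if prop.get("type") in ("rgb", "rgba"):
                problems.append(f"{prop['id']}: enum_values need a numeric type")

    for level in info.get("spatial", []):
        for name in ("grid_shape", "chunk_size"):
            if name in level and len(level[name]) != rank:
                problems.append(f"{level['key']}: {name} must have length {rank}")
        if level["limit"] < 1:
            problems.append(f"{level['key']}: limit must be at least 1")
    return problems


# Hypothetical metadata; the dimension entries are placeholders.
example_info = {
    "dimensions": {"x": None, "y": None, "z": None},
    "lower_bound": [0, 0, 0],
    "upper_bound": [64, 64, 32],
    "annotation_type": "point",
    "properties": [
        {"id": "cellType", "type": "uint8",
         "enum_values": [0, 1], "enum_labels": ["glia", "neuron"]},
    ],
    "relationships": [{"id": "segments", "key": "by_segment"}],
    "by_id": {"key": "by_id"},
    "spatial": [
        {"key": "spatial0", "grid_shape": [1, 1, 1],
         "chunk_size": [64, 64, 32], "limit": 1000},
    ],
}
print(check_info(example_info))
```

A check like this is cheap to run before writing any index files, since a mismatch in rank affects every encoded annotation.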
Annotation id index¶
The annotation id index supports efficient retrieval of individual annotations by their uint64 id, and is used by Neuroglancer when selecting or hovering over an annotation.
The annotation id index maps each uint64 annotation id to the encoded representation of the single corresponding annotation. Depending on whether sharding parameters are specified, the index is stored either in the unsharded uint64 index format or the sharded uint64 index format.
Note that the geometry and property data is duplicated in all indices, but only the annotation id index encodes the complete lists of related object ids.
Single annotation encoding¶
Within the annotation id index, each annotation is encoded in the following binary format:
The position/radii vectors required by the annotation_type encoded as float32le values:
For "point" type, the position vector.
For "line" type, the first endpoint position followed by the second endpoint position.
For "axis_aligned_bounding_box" type, the first position followed by the second position.
For "ellipsoid" type, the center position followed by the radii vector.
For each property of type "uint32", "int32", or "float32": the value encoded as a little endian value.
For each property of type "uint16" or "int16": the value encoded as a little endian value.
For each property of type "uint8", "int8", "rgb", or "rgba": the encoded value.
Up to 3 padding bytes (with value of 0) to reach a byte offset that is a multiple of 4.
For each of the relationships specified in the info metadata file:
The number of object ids as a uint32le value.
Each related object id, as a uint64le value.
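The byte layout above can be sketched with Python's `struct` module. This is an illustration rather than a reference encoder: the function name `encode_annotation` and its argument layout are hypothetical, properties within each size group are assumed to keep their order from the metadata, and "rgb" and "rgba" values are assumed to be sequences of 3 and 4 byte-sized channel values.

```python
import struct

# struct codes for the numeric property types listed above.
_NUMERIC = {"uint32": "I", "int32": "i", "float32": "f",
            "uint16": "H", "int16": "h", "uint8": "B", "int8": "b"}
_FOUR_BYTE = ("uint32", "int32", "float32")
_TWO_BYTE = ("uint16", "int16")


def encode_annotation(geometry, properties, values, related_ids):
    """Encode one annotation.

    geometry:    flat sequence of floats (position/radii vectors, in order)
    properties:  list of {"id": ..., "type": ...} dicts from the metadata
    values:      dict mapping property id to its value
    related_ids: one list of uint64 ids per relationship, in metadata order
    """
    out = bytearray(struct.pack(f"<{len(geometry)}f", *geometry))

    # 4-byte properties first, then 2-byte, then 1-byte and color properties.
    groups = (
        [p for p in properties if p["type"] in _FOUR_BYTE],
        [p for p in properties if p["type"] in _TWO_BYTE],
        [p for p in properties if p["type"] not in _FOUR_BYTE + _TWO_BYTE],
    )
    for group in groups:
        for prop in group:
            value = values[prop["id"]]
            if prop["type"] in ("rgb", "rgba"):
                out += bytes(value)  # assumed: one byte per channel
            else:
                out += struct.pack("<" + _NUMERIC[prop["type"]], value)

    # Zero padding up to the next multiple of 4 bytes (at most 3 bytes).
    out += b"\x00" * (-len(out) % 4)

    for ids in related_ids:
        out += struct.pack("<I", len(ids))
        out += struct.pack(f"<{len(ids)}Q", *ids)
    return bytes(out)


# Hypothetical point annotation in a rank-3 space with three properties
# and a single relationship.
props = [
    {"id": "kind", "type": "uint8"},
    {"id": "size", "type": "float32"},
    {"id": "score", "type": "uint16"},
]
encoded = encode_annotation(
    [1.0, 2.0, 3.0], props, {"kind": 7, "size": 0.5, "score": 300}, [[10, 11]]
)
print(len(encoded))
```

In this example the geometry takes 12 bytes, the properties 7 bytes, and one padding byte brings the offset to 20 before the relationship data starts.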
Unsharded uint64 index¶
The data corresponding to each uint64 annotation id or related object id is stored in a file named <id> within the directory indicated by the "key" member, where <id> is the base-10 string representation of the uint64 id.
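As a small illustration of this naming rule, the sketch below builds the relative path for one entry. The helper name `unsharded_entry_path` and the key value "by_id" are hypothetical.

```python
from pathlib import PurePosixPath


def unsharded_entry_path(key, uint64_id):
    """Relative path of the file holding the data for one uint64 id."""
    if not 0 <= uint64_id < 2**64:
        raise ValueError("id must fit in an unsigned 64-bit integer")
    # The file name is simply the id written in base 10, with no padding.
    return PurePosixPath(key) / str(uint64_id)


print(unsharded_entry_path("by_id", 42))
```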
Sharded uint64 index¶
The uint64 annotation id or related object id is used directly as the key within the sharded representation stored in the directory indicated by the "key" member.
Related object id index¶
The related object id index supports efficient retrieval of the list of annotations associated via a given relationship with a given object id, and is used by Neuroglancer when filtering by segment ids.
The related object id index maps each uint64 object id to the encoded representation of the list of related annotations. Depending on whether sharding parameters are specified, the index is stored either in the unsharded uint64 index format or the sharded uint64 index format.
Multiple annotation encoding¶
Both the related object id index and the spatial index encode lists of annotations in the following binary format:
The number of annotations, count, as a uint64le value.
Repeated for i = 0 up to count - 1:
The position/radii vectors, the property values, and padding bytes of the ith annotation are encoded exactly as in the single annotation encoding.
Repeated for i = 0 up to count - 1:
The annotation id of the ith annotation encoded as a uint64le value.
For the related object id index, the order of the annotations does not matter. For the spatial index, the annotations should be ordered randomly.
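A minimal sketch of this list layout follows. It assumes each annotation body (geometry, properties, and padding, without relationship data) has already been encoded; the function name `encode_annotation_list` and its `shuffle` and `seed` parameters are hypothetical.

```python
import random
import struct


def encode_annotation_list(annotations, shuffle=False, seed=0):
    """Encode a list of (annotation_id, body_bytes) pairs.

    body_bytes holds the already-encoded geometry, property values, and
    padding of one annotation, without any relationship data.
    """
    items = list(annotations)
    if shuffle:
        # Random order, as recommended for the spatial index.
        random.Random(seed).shuffle(items)

    out = bytearray(struct.pack("<Q", len(items)))  # count as uint64le
    for _, body in items:
        out += body                                 # all bodies first
    for ann_id, _ in items:
        out += struct.pack("<Q", ann_id)            # then all ids, same order
    return bytes(out)


# Hypothetical example: two point annotations in a rank-3 space, no properties.
points = [(5, (1.0, 2.0, 3.0)), (6, (4.0, 5.0, 6.0))]
bodies = [(ann_id, struct.pack("<3f", *pos)) for ann_id, pos in points]
encoded_list = encode_annotation_list(bodies)
print(len(encoded_list))
```

Storing all bodies before all ids keeps the fixed-size geometry and property data contiguous, so the ith body and the ith id can each be located by a simple offset computation.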
Spatial index¶
The spatial index supports efficient retrieval of the set of annotations that intersects a given bounding box, with optional subsampling down to a desired maximum density.
The spatial index is used by Neuroglancer when not filtering by related segment ids.
Each spatial index level maps cell positions within the grid specified by the chunk_size and grid_shape to a spatially uniform subsample of annotations intersecting that grid cell.
A grid cell with coordinates cell corresponds to a spatial interval in dimension d of [lower_bound[d] + cell[d] * chunk_size[d], lower_bound[d] + (cell[d] + 1) * chunk_size[d]].
The chunk_size for spatial index level i+1 should evenly divide the chunk_size for spatial index level i. The grid cells within level i+1 that are contained within a single level i grid cell are considered the child cells. For each level, the elementwise product of the grid_shape and the chunk_size should equal upper_bound - lower_bound.
Typically the grid_shape for level 0 should be a vector of all 1 (with chunk_size equal to upper_bound - lower_bound), and each component of chunk_size of each successive level should be either equal to, or half of, the corresponding component of the prior level chunk_size, whichever results in a more spatially isotropic chunk.
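The two consistency rules for the levels (each level tiles the bounding box exactly, and each finer chunk_size evenly divides the coarser one) can be verified as in the following sketch. The function name `check_spatial_levels` and the sample levels are hypothetical, and floating-point comparisons use `math.isclose` with its default tolerance, which is an assumption rather than something the format prescribes.

```python
import math


def check_spatial_levels(lower_bound, upper_bound, levels):
    """Return a list of problems with the grid_shape/chunk_size of each level."""
    problems = []
    extent = [u - l for l, u in zip(lower_bound, upper_bound)]
    for i, level in enumerate(levels):
        # grid_shape * chunk_size must equal upper_bound - lower_bound.
        for d, (g, s) in enumerate(zip(level["grid_shape"], level["chunk_size"])):
            if not math.isclose(g * s, extent[d]):
                problems.append(f"level {i}, dim {d}: grid does not tile the bounds")
        if i == 0:
            continue
        # The finer chunk_size must evenly divide the coarser chunk_size.
        prev = levels[i - 1]["chunk_size"]
        for d, (coarse, fine) in enumerate(zip(prev, level["chunk_size"])):
            ratio = coarse / fine
            if ratio < 1 or not math.isclose(ratio, round(ratio)):
                problems.append(f"level {i}, dim {d}: chunk_size does not divide")
    return problems


# Hypothetical levels: the z extent is already half of x and y, so only
# x and y are halved at level 1, which makes the chunks isotropic.
levels = [
    {"grid_shape": [1, 1, 1], "chunk_size": [64, 64, 32]},
    {"grid_shape": [2, 2, 1], "chunk_size": [32, 32, 32]},
    {"grid_shape": [4, 4, 2], "chunk_size": [16, 16, 16]},
]
print(check_spatial_levels([0, 0, 0], [64, 64, 32], levels))
```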
The spatial index levels should be computed as follows:
For each grid position cell at the coarsest level, compute the set remaining_annotations(0, cell) of annotations that intersect the cell. Note that a single annotation may intersect multiple cells.
Sequentially generate each spatial index level, starting at level=0 (the coarsest level):
  Define maxCount(level) to be the maximum over all cell positions of the size of remaining_annotations(level, cell).
  For each cell:
    Compute a subset emitted(level, cell) of remaining_annotations(level, cell) where each annotation is chosen uniformly at random with probability min(1, limit / maxCount(level)).
    This spatial index level maps cell to the list of annotations in emitted(level, cell). The annotations are encoded in the multiple annotation encoding also used by the related object id index; the list should be ordered randomly (or perhaps pseudo-randomly based on the annotation id).
    For each child_cell in level level+1 contained within cell: Compute the set remaining_annotations(level+1, child_cell) of annotations within remaining_annotations(level, cell) - emitted(level, cell) that intersect child_cell.
Continue generating successively finer spatial index levels until no annotations remain.
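The procedure can be sketched for the simplest case, point annotations, where each annotation lies in exactly one cell per level. This is an illustrative sketch, not the reference implementation: the function name `build_spatial_index` and its arguments are hypothetical, the caller supplies the levels up front (annotations still remaining after the last supplied level are dropped here), points on the upper boundary are clamped into the last cell, and each cell's subset is drawn from the annotations still remaining in that cell.

```python
import random


def build_spatial_index(points, lower_bound, chunk_sizes, grid_shapes, limit, seed=0):
    """Subsample point annotations into spatial index levels.

    points:      dict mapping annotation id to a position tuple
    chunk_sizes: one chunk_size vector per level, coarse to fine
    grid_shapes: one grid_shape vector per level, coarse to fine
    Returns one dict per generated level, mapping cell -> list of ids.
    """
    rng = random.Random(seed)

    def cell_of(pos, level):
        # Cell containing the point, clamped to the grid of this level.
        return tuple(
            min(g - 1, max(0, int((p - lo) // size)))
            for p, lo, size, g in zip(
                pos, lower_bound, chunk_sizes[level], grid_shapes[level]
            )
        )

    # remaining_annotations(0, cell)
    remaining = {}
    for ann_id, pos in points.items():
        remaining.setdefault(cell_of(pos, 0), []).append(ann_id)

    levels = []
    for level in range(len(chunk_sizes)):
        max_count = max((len(ids) for ids in remaining.values()), default=0)
        if max_count == 0:
            break  # no annotations remain
        prob = min(1.0, limit / max_count)
        emitted_level, next_remaining = {}, {}
        for cell, ids in remaining.items():
            # emitted(level, cell): independent choice with probability prob.
            emitted = [i for i in ids if rng.random() < prob]
            rng.shuffle(emitted)  # lists should be randomly ordered
            if emitted:
                emitted_level[cell] = emitted
            if level + 1 < len(chunk_sizes):
                chosen = set(emitted)
                # remaining_annotations(level + 1, child_cell)
                for i in ids:
                    if i not in chosen:
                        child = cell_of(points[i], level + 1)
                        next_remaining.setdefault(child, []).append(i)
        levels.append(emitted_level)
        remaining = next_remaining
    return levels


# Hypothetical data: five points in a 64 x 64 x 32 box, two levels.
pts = {1: (1, 1, 1), 2: (40, 5, 5), 3: (10, 50, 20), 4: (60, 60, 30), 5: (33, 33, 17)}
index = build_spatial_index(
    pts, (0, 0, 0), [(64, 64, 32), (32, 32, 32)], [(1, 1, 1), (2, 2, 1)], limit=2
)
print(sorted(i for level in index for ids in level.values() for i in ids))
```

Annotations with spatial extent (lines, boxes, ellipsoids) would need a geometry-specific intersection test in place of `cell_of`, and may appear in several cells of the same level.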
Unsharded spatial index¶
The encoded annotation list corresponding to a grid cell cell is stored within the directory indicated by the key member in a file named cell.join('_'), i.e. the base-10 string representations of the grid cell coordinates separated by the '_' character. For example, cell (1, 2, 3) is stored in the file named 1_2_3.
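In Python, the file name for a cell can be built as in the following sketch; the helper name `spatial_chunk_filename` is hypothetical.

```python
def spatial_chunk_filename(cell):
    """File name for a grid cell: base-10 coordinates joined by '_'."""
    return "_".join(str(int(c)) for c in cell)


print(spatial_chunk_filename((1, 2, 3)))
# → 1_2_3
```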
Sharded spatial index¶
The compressed Morton code of the grid cell is used as the key within the sharded representation stored in the directory indicated by the key member.
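This section does not define the compressed Morton code. The sketch below assumes the definition used for chunk keys in the precomputed volume format, generalized from three dimensions to an arbitrary rank: each dimension contributes ceil(log2(grid_shape[d])) bits, and the bits are interleaved from least significant upward, skipping a dimension once its bits are exhausted. The function name `compressed_morton_code` is hypothetical.

```python
def compressed_morton_code(cell, grid_shape):
    """Interleave the bits of `cell`, using only as many bits per dimension
    as are needed to index that dimension of `grid_shape`."""
    # Number of bits needed per dimension: ceil(log2(grid_shape[d])).
    bits = [(g - 1).bit_length() for g in grid_shape]
    code = 0
    out_bit = 0
    for bit in range(max(bits, default=0)):
        for d, coord in enumerate(cell):
            if bit < bits[d]:
                code |= ((coord >> bit) & 1) << out_bit
                out_bit += 1
    return code


print(compressed_morton_code((1, 0, 1), (2, 2, 2)))  # → 5
print(compressed_morton_code((3, 0, 1), (4, 1, 2)))  # → 7
```

In the second call the y dimension has a single cell and therefore contributes no bits, which is what makes the code "compressed" relative to a plain Morton code.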