Datasets

Dataset declarations allow specifications to reference external data at scale, without having to replicate it manually in the specification.

Basics

Datasets are declared as follows using the @dataset annotation:

@dataset
myDataset : Tensor Rat [28, 28]

Datasets can be any type t that can be constructed from the following grammar:

t ::= List t | Vector t n | Tensor t ns | s
s ::= Index n | Nat | Int | Rat

where n is a known constant and ns is a list of known constants.

Once declared, datasets can be used as any other named List, Vector or Tensor would be, e.g.

forall x in myDataset . robustAround x

Supported formats

The actual implementation of the dataset is provided when using the specification during training or verification.

At the moment only the IDX format is supported. A full description of this format can be found at the bottom of this page.

There are numerous libraries for converting datasets into this format:

If you would be interested in other formats being supported, please get in touch.