open-inference-spec

Overview

The OpenInference specification defines a set of columns that semantically map to segments of a model’s inference. These columns capture production inference logs and can be layered on top of many file formats.

Naming Convention

Column names in OpenInference encode semantics via a well-formed prefix, in which colons (:) enclose machine-parsable information. Parsers of the OpenInference specification should use the : as a delimiter to extract the ontological information about the column. The anatomy of a column name is as follows:

:<category>.<data_type>.<[identifier]>:<name>

The category MUST be provided. Whether the data_type and identifier MUST be provided depends on the category. The name is optional ONLY if the category is a reserved singleton category for the row (e.g. :id:).

In the specification, category, data_type, and identifier will be referred to as parts.

Between the colons, the parts are separated by a period (.). The following is an example of an integer column named age:

:feature.int:age
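The anatomy above can be sketched as a small parser. This is a minimal illustration, not a reference implementation: the names ColumnName and parse_column_name are hypothetical, and the sketch does not check which parts a given category requires, since that depends on the category rules described later.

```python
from typing import NamedTuple, Optional


class ColumnName(NamedTuple):
    """Hypothetical container for the parts of a column name."""
    category: str
    data_type: Optional[str]
    identifier: Optional[str]
    name: Optional[str]


def parse_column_name(column: str) -> ColumnName:
    """Split a column name of the form :<category>.<data_type>.<identifier>:<name>."""
    if not column.startswith(":"):
        raise ValueError(f"missing leading ':' in {column!r}")
    # The prefix sits between the first two colons; the remainder is the name.
    prefix, sep, name = column[1:].partition(":")
    if not sep:
        raise ValueError(f"missing closing ':' in {column!r}")
    # Inside the prefix, parts are separated by periods.
    parts = prefix.split(".")
    if not parts[0]:
        raise ValueError(f"category MUST be provided in {column!r}")
    if len(parts) > 3:
        raise ValueError(f"too many parts in prefix of {column!r}")
    # Pad missing optional parts with None.
    padded = parts + [None] * (3 - len(parts))
    category, data_type, identifier = padded
    return ColumnName(category, data_type, identifier, name or None)
```

For instance, parse_column_name(":feature.int:age") yields the category feature, the data type int, no identifier, and the name age.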

Categories

A single row or inference record is composed of a set of columns that capture the following information:

In the specification, the above information will be referred to as categories. The above categories are captured in the prefix-based naming convention as the first item. The following is a list of prefixes that are used to capture the above:

The above prefixes capture the semantic category of the column. For example, :feature.int:age is a column that captures the age of the user and is used as an input to the model. The column :prediction.float.score: captures the score of the prediction.
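Because the category is always the first part of the prefix, a consumer can bucket columns by category without interpreting the remaining parts. The following is a rough sketch under that assumption; group_by_category is a hypothetical helper, and the column :feature.float:income in the usage note is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_category(columns: Iterable[str]) -> Dict[str, List[str]]:
    """Group column names by the first part of their prefix."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for column in columns:
        if not column.startswith(":"):
            # Not a prefixed column name; skip it in this sketch.
            continue
        # Text between the first two colons is the prefix.
        prefix = column[1:].split(":", 1)[0]
        # The category is everything before the first period.
        category = prefix.split(".", 1)[0]
        groups[category].append(column)
    return dict(groups)
```

Given the columns :feature.int:age, :prediction.float.score:, and :feature.float:income, the two feature columns land in one bucket and the prediction column in another.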

The features, predictions, actuals, and tags categories will be referred to in this specification as dimensions.

Data Types

OpenInference is designed to be transport and file format agnostic. As such, it relies on the underlying file format to define the primitive types. However, not all file formats are created equal, and a superset of data types is required to fully capture the data (for example, JSON has a single number type and no distinct float). For this reason, we reserve the second part of the prefix for the data_type. The following is a list of data types that are supported by OpenInference:

List Types

The above data types can be used to define a list of values by wrapping the data type in []s. For example, a column that captures a list of ids can be defined as :feature.[id]:document_ids.
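A parser might detect the list wrapper as follows. This is only a sketch: split_list_type is a hypothetical helper, and it does not validate the element type against the supported data types.

```python
from typing import Tuple


def split_list_type(data_type: str) -> Tuple[bool, str]:
    """Return (is_list, element_type) for a data_type part such as "[id]" or "int"."""
    if len(data_type) > 2 and data_type.startswith("[") and data_type.endswith("]"):
        # Strip the surrounding brackets to recover the element type.
        return True, data_type[1:-1]
    return False, data_type
```

For the column :feature.[id]:document_ids, the data_type part is [id], so the helper reports a list whose element type is id.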

Specifiers

Specifiers assign a reserved semantic meaning to the column. The following is a list of specifiers that are supported by OpenInference:

For full details on each of the columns, consult the sub-sections below.

Columns