open-inference-spec

# Feature

A feature is a column that captures an input to the model. Features are typically numeric or categorical values used to make the prediction, but they can also extend to more complex data types such as embeddings. For example, a feature could be the age of a user or their FICO score. Feature columns are prefixed with the category `:feature` in the column name, as in `:feature.int:age`.
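
The naming convention can be sketched as a small parser. This is a minimal illustration, assuming the `:category.data_type:name` layout seen in the example columns of this document; the regular expression and the `parse_column_name` and `is_feature` helpers are hypothetical and not part of the specification.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, inferred from the example columns: ":<category>.<data_type>:<name>"
COLUMN_PATTERN = re.compile(
    r"^:(?P<category>[a-z_]+)\.(?P<data_type>[^:]+):(?P<name>.+)$"
)


class ColumnName(NamedTuple):
    category: str
    data_type: str
    name: str


def parse_column_name(column: str) -> Optional[ColumnName]:
    """Split a column name into category, data type, and name, or return None."""
    match = COLUMN_PATTERN.match(column)
    if match is None:
        return None
    return ColumnName(match["category"], match["data_type"], match["name"])


def is_feature(column: str) -> bool:
    """Return True when the column's category is "feature"."""
    parsed = parse_column_name(column)
    return parsed is not None and parsed.category == "feature"
```

Under these assumptions, `parse_column_name(":feature.int:age")` yields the category `feature`, the data type `int`, and the name `age`, while a column such as `:prediction.text:response` is parsed but not treated as a feature.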

## Single Value Features

If a column name consists only of the `:feature` category prefix, a data type, and a unique name, then the column is assumed to be a single value column (e.g. a key-value pair). Here is an example of a set of feature columns:

<table>
<thead>
<tr><th>:feature.int:age</th><th>:feature.str:bank</th><th>:feature.int:fico_score</th><th>:feature.bool:is_gold_member</th></tr>
</thead>
<tbody>
<tr><td>25</td><td>Chase</td><td>750</td><td>true</td></tr>
<tr><td>35</td><td>Wells Fargo</td><td>800</td><td>false</td></tr>
<tr><td>45</td><td>Bank of America</td><td>null</td><td>false</td></tr>
</tbody>
</table>

Note that the `:feature.int:fico_score` column has a value of `null` in the third row. This is because the user did not provide their FICO score. This is a valid value for a single value feature column.

## Embedding Features

If a set of columns is meant to be grouped into a "composite", the columns MUST have matching names. For example, if there is a column named `:feature.vector:prompt` and a column named `:feature.text:prompt`, then these two columns are meant to be grouped. This mechanism can be used to associate data with an embedding vector. Here is an example of an embedding for a prompt:

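<table>
<thead>
<tr><th>:feature.[float].embedding:prompt</th><th>:feature.text:prompt</th><th>:prediction.text:response</th></tr>
</thead>
<tbody>
<tr><td>[0.1, 0.2, 0.3]</td><td>What is the weather like today?</td><td>It is rainy</td></tr>
<tr><td>[0.4, 0.5, 0.6]</td><td>What is the weather like tomorrow?</td><td>It is sunny</td></tr>
<tr><td>[0.7, 0.8, 0.9]</td><td>What is the weather like in 5 days?</td><td>It is cloudy</td></tr>
</tbody>
</table>

Here is an example of an embedding for an image: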
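
<table>
<thead>
<tr><th>:feature.[float].embedding:my_image</th><th>:feature.url:my_image</th></tr>
</thead>
<tbody>
<tr><td>[0.1, 0.2, 0.3]</td><td>https://example.com/image1.jpg</td></tr>
<tr><td>[0.4, 0.5, 0.6]</td><td>https://example.com/image2.jpg</td></tr>
<tr><td>[0.7, 0.8, 0.9]</td><td>https://example.com/image3.jpg</td></tr>
</tbody>
</table>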
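
The matching-name rule for composites could be applied roughly as follows. This is a sketch only: it assumes the name is everything after the last `:` in the column name, and the `column_base_name` and `group_composites` helpers are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def column_base_name(column: str) -> str:
    """Return the part after the last ':' (assumed to be the column's name)."""
    return column.rsplit(":", 1)[-1]


def group_composites(columns: List[str]) -> Dict[str, List[str]]:
    """Group feature columns that share a name; keep only groups of two or more."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for column in columns:
        if column.startswith(":feature."):
            groups[column_base_name(column)].append(column)
    return {name: cols for name, cols in groups.items() if len(cols) > 1}


columns = [
    ":feature.[float].embedding:my_image",
    ":feature.url:my_image",
    ":feature.int:age",
]
composites = group_composites(columns)
```

Here `composites` holds a single group keyed by `my_image` that contains the embedding column and the URL column; `:feature.int:age` has no partner and stays a single value feature.
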
### Retrieval Embeddings

If an embedding feature is used to retrieve records from a knowledge base corpus, you can specify the retrieved document IDs as well as the associated scores. The data for an embedding that is used to retrieve knowledge base records would look something like this:

<table>
<thead>
<tr><th>:feature.[float].embedding:prompt</th><th>:feature.text:prompt</th><th>:feature.[id].retrieved_document_ids:prompt</th><th>:feature.[float].retrieved_document_scores:prompt</th><th>:prediction.text:response</th></tr>
</thead>
<tbody>
<tr><td>[0.1, 0.2, 0.3]</td><td>What is the weather like today?</td><td>["doc_id_1", "doc_id_4", "doc_id_6"]</td><td>[0.2, 0.5, 0.7]</td><td>It is rainy</td></tr>
<tr><td>[0.4, 0.5, 0.6]</td><td>What is the weather like tomorrow?</td><td>["doc_id_1", "doc_id_3", "doc_id_6"]</td><td>[0.2, 0.4, 0.9]</td><td>It is sunny</td></tr>
<tr><td>[0.7, 0.8, 0.9]</td><td>What is the weather like in 5 days?</td><td>["doc_id_1", "doc_id_3", "doc_id_6"]</td><td>[0.2, 0.4, 0.9]</td><td>It is cloudy</td></tr>
</tbody>
</table>
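
One way to consume these columns is to pair each retrieved document ID with its score. The sketch below assumes the two lists are aligned by position (the first score belongs to the first ID), which the example data suggests but this section does not state outright; the `pair_retrievals` helper is a hypothetical name.

```python
from typing import Dict, List, Tuple


def pair_retrievals(row: Dict[str, list], name: str = "prompt") -> List[Tuple[str, float]]:
    """Pair retrieved document IDs with scores, assuming positional alignment."""
    ids = row[f":feature.[id].retrieved_document_ids:{name}"]
    scores = row[f":feature.[float].retrieved_document_scores:{name}"]
    if len(ids) != len(scores):
        raise ValueError("retrieved document IDs and scores differ in length")
    return list(zip(ids, scores))


# First row of the example data above.
row = {
    ":feature.text:prompt": "What is the weather like today?",
    ":feature.[id].retrieved_document_ids:prompt": ["doc_id_1", "doc_id_4", "doc_id_6"],
    ":feature.[float].retrieved_document_scores:prompt": [0.2, 0.5, 0.7],
}
pairs = pair_retrievals(row)
```

For this row, `pairs` contains `("doc_id_1", 0.2)`, `("doc_id_4", 0.5)`, and `("doc_id_6", 0.7)`. Raising on a length mismatch is a design choice of the sketch, made so that misaligned data is caught early rather than silently truncated by `zip`.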