Embedding Spans

Embedding spans capture operations that convert text or token IDs into dense float vectors for semantic search, clustering, and similarity comparison.

Span Name

The span name MUST be "CreateEmbeddings" for embedding operations.

Required Attributes

All embedding spans MUST include:

  • openinference.span.kind set to "EMBEDDING", identifying the span as an embedding operation

Common Attributes

Embedding spans typically include:

  • embedding.model_name — the name of the embedding model (e.g. "text-embedding-3-small")
  • embedding.invocation_parameters — the request parameters as a JSON string
  • embedding.embeddings.N.embedding.text — the input text for the N-th embedding (text inputs only)
  • embedding.embeddings.N.embedding.vector — the float vector for the N-th embedding
  • input.value / input.mime_type and output.value / output.mime_type — the raw request and response payloads
  • llm.token_count.prompt and llm.token_count.total — token usage reported by the API

Text Attributes

The embedding.embeddings.N.embedding.text attributes are populated ONLY when the input is already text (strings). These attributes are recorded during the request phase to ensure availability even on errors.

Token IDs (pre-tokenized integer arrays) are NOT decoded to text because:
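
The request-phase, text-only rule above can be sketched as follows, assuming span attributes are collected in a plain dictionary (the helper name is illustrative, not part of the convention):

```python
from typing import Any, Dict


def record_text_attributes(attributes: Dict[str, Any], inputs: Any) -> None:
    """Record embedding.embeddings.N.embedding.text during the request phase,
    but only when every input item is already a string (illustrative helper)."""
    if isinstance(inputs, str):
        inputs = [inputs]
    if not all(isinstance(item, str) for item in inputs):
        return  # pre-tokenized token IDs: text attributes are not recorded
    for i, text in enumerate(inputs):
        attributes[f"embedding.embeddings.{i}.embedding.text"] = text
```

Because this runs before the API call, the text attributes survive on the span even when the request itself fails.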

Vector Attributes

The embedding.embeddings.N.embedding.vector attributes MUST contain float arrays, regardless of the API response format:

  1. Float response format: Store vectors directly as float arrays
  2. Base64 response format: MUST decode base64-encoded strings to float arrays before recording
    • Base64 encoding is roughly 25% more compact in transmission but must be decoded for consistency
    • Example: "AACAPwAAAEA=" → [1.0, 2.0]
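
The decoding step can be sketched as follows, assuming the base64 payload encodes little-endian 32-bit floats as the OpenAI API produces (the helper name is illustrative):

```python
import base64
import struct
from typing import List


def decode_base64_embedding(data: str) -> List[float]:
    """Decode a base64-encoded embedding into a list of little-endian
    32-bit floats, matching the float response format."""
    raw = base64.b64decode(data)
    count = len(raw) // 4  # each float occupies 4 bytes
    return list(struct.unpack(f"<{count}f", raw))


decode_base64_embedding("AACAPwAAAEA=")  # → [1.0, 2.0]
```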

Attributes Not Used in Embedding Spans

The following attributes, which are used in LLM spans, are not applicable to embedding spans:

  • llm.system
  • llm.provider

Rationale

The llm.system attribute is defined as "the AI product as identified by the client or server instrumentation." While this definition has been reserved for API providers in LLM spans (e.g., "openai", "anthropic"), it is ambiguous when applied to embedding operations.

Conceptually, llm.system describes the shape of the API, while llm.provider describes the owner of the physical hardware that runs those APIs. For observability products such as Arize and Phoenix, these conventions are consumed primarily by playground features that re-invoke LLM calls.

For embedding operations:

Therefore, to avoid ambiguity and maintain clear semantic conventions, embedding spans use embedding.model_name rather than llm.system or llm.provider.

Privacy Considerations

When OPENINFERENCE_HIDE_EMBEDDINGS_VECTORS is set to true:

  • embedding.embeddings.N.embedding.vector attributes are omitted from the span

When OPENINFERENCE_HIDE_EMBEDDINGS_TEXT is set to true:

  • embedding.embeddings.N.embedding.text attributes are omitted from the span
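
A sketch of applying these switches to a flat attribute dictionary (a hypothetical helper; real instrumentations wire this into their tracer configuration):

```python
import os
from typing import Any, Dict


def mask_embedding_attributes(attributes: Dict[str, Any]) -> Dict[str, Any]:
    """Drop vector and/or text attributes according to the privacy
    environment variables (illustrative helper)."""
    hide_vectors = os.environ.get("OPENINFERENCE_HIDE_EMBEDDINGS_VECTORS", "").lower() == "true"
    hide_text = os.environ.get("OPENINFERENCE_HIDE_EMBEDDINGS_TEXT", "").lower() == "true"
    result = {}
    for key, value in attributes.items():
        if hide_vectors and key.endswith(".embedding.vector"):
            continue  # vectors hidden from traces
        if hide_text and key.endswith(".embedding.text"):
            continue  # text hidden from traces
        result[key] = value
    return result
```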

Input/Output Structure

The response structure matches the input structure: one embedding is returned per input item, in the same order.

Input formats (text and tokens cannot be mixed in one request):

  • a single string
  • a list of strings
  • a list of token IDs (integers)
  • a list of token-ID lists
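
A minimal sketch of distinguishing these input shapes (the helper name and returned labels are illustrative, not part of the convention):

```python
from typing import Any


def classify_embedding_input(value: Any) -> str:
    """Classify an embeddings request input into one of:
    "text", "text_batch", "tokens", "token_batch" (illustrative helper)."""
    if isinstance(value, str):
        return "text"
    if isinstance(value, list) and value:
        if all(isinstance(item, str) for item in value):
            return "text_batch"
        if all(isinstance(item, int) for item in value):
            return "tokens"
        if all(isinstance(item, list) and all(isinstance(t, int) for t in item)
               for item in value):
            return "token_batch"
    raise ValueError("mixed or empty input is not allowed")
```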

Examples

Text Input (Recorded in Traces)

A span for generating embeddings from text:

{
    "name": "CreateEmbeddings",
    "span_kind": "SPAN_KIND_INTERNAL",
    "attributes": {
        "openinference.span.kind": "EMBEDDING",
        "embedding.model_name": "text-embedding-3-small",
        "embedding.invocation_parameters": "{\"model\": \"text-embedding-3-small\", \"encoding_format\": \"float\"}",
        "input.value": "{\"input\": \"hello world\", \"model\": \"text-embedding-3-small\", \"encoding_format\": \"float\"}",
        "input.mime_type": "application/json",
        "output.value": "{\"data\": [{\"embedding\": [0.1, 0.2, 0.3], \"index\": 0}], \"model\": \"text-embedding-3-small\", \"usage\": {\"prompt_tokens\": 2, \"total_tokens\": 2}}",
        "output.mime_type": "application/json",
        "embedding.embeddings.0.embedding.text": "hello world",
        "embedding.embeddings.0.embedding.vector": [0.1, 0.2, 0.3],
        "llm.token_count.prompt": 2,
        "llm.token_count.total": 2
    }
}

Token Input (No Text Attributes)

When input consists of pre-tokenized integer arrays, text attributes are NOT recorded:

{
    "name": "CreateEmbeddings",
    "span_kind": "SPAN_KIND_INTERNAL",
    "attributes": {
        "openinference.span.kind": "EMBEDDING",
        "embedding.model_name": "text-embedding-3-small",
        "embedding.invocation_parameters": "{\"model\": \"text-embedding-3-small\", \"encoding_format\": \"float\"}",
        "input.value": "{\"input\": [15339, 1917], \"model\": \"text-embedding-3-small\", \"encoding_format\": \"float\"}",
        "input.mime_type": "application/json",
        "output.value": "{\"data\": [{\"embedding\": [0.1, 0.2, 0.3], \"index\": 0}], \"model\": \"text-embedding-3-small\", \"usage\": {\"prompt_tokens\": 2, \"total_tokens\": 2}}",
        "output.mime_type": "application/json",
        "embedding.embeddings.0.embedding.vector": [0.1, 0.2, 0.3],
        "llm.token_count.prompt": 2,
        "llm.token_count.total": 2
    }
}

Batch Text Input (Multiple Embeddings)

A span for generating embeddings from multiple text inputs:

{
    "name": "CreateEmbeddings",
    "span_kind": "SPAN_KIND_INTERNAL",
    "attributes": {
        "openinference.span.kind": "EMBEDDING",
        "embedding.model_name": "text-embedding-ada-002",
        "embedding.invocation_parameters": "{\"model\": \"text-embedding-ada-002\"}",
        "input.value": "[\"hello\", \"world\", \"test\"]",
        "embedding.embeddings.0.embedding.text": "hello",
        "embedding.embeddings.0.embedding.vector": [0.1, 0.2, 0.3],
        "embedding.embeddings.1.embedding.text": "world",
        "embedding.embeddings.1.embedding.vector": [0.4, 0.5, 0.6],
        "embedding.embeddings.2.embedding.text": "test",
        "embedding.embeddings.2.embedding.vector": [0.7, 0.8, 0.9],
        "llm.token_count.prompt": 3,
        "llm.token_count.total": 3
    }
}
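
The indexed attribute layout in the batch example above can be produced by a small helper like the following (illustrative, not part of the spec):

```python
from typing import Any, Dict, List, Sequence


def flatten_embeddings(texts: Sequence[str],
                       vectors: Sequence[List[float]]) -> Dict[str, Any]:
    """Flatten parallel lists of texts and vectors into the indexed
    embedding.embeddings.N.* attribute form (illustrative helper)."""
    attributes: Dict[str, Any] = {}
    for i, (text, vector) in enumerate(zip(texts, vectors)):
        attributes[f"embedding.embeddings.{i}.embedding.text"] = text
        attributes[f"embedding.embeddings.{i}.embedding.vector"] = vector
    return attributes
```

For token-ID inputs, the same helper would simply be called without text, setting only the vector attributes.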