Embedding spans capture operations that convert text or token IDs into dense float vectors for semantic search, clustering, and similarity comparison.
The span name MUST be "CreateEmbeddings" for embedding operations.
All embedding spans MUST include:
- openinference.span.kind: Set to "EMBEDDING"

Embedding spans typically include:

- embedding.model_name: Name of the embedding model (e.g., "text-embedding-3-small")
- embedding.embeddings: Nested structure for embedding objects in batch operations
- embedding.invocation_parameters: JSON string of the parameters sent to the model (excluding the input)
- input.value: The raw input as a JSON string (text strings or token ID arrays)
- input.mime_type: Usually "application/json"
- output.value: The raw output (embedding vectors as JSON or base64-encoded)
- output.mime_type: Usually "application/json"

The embedding.embeddings.N.embedding.text attributes are populated ONLY when the input is already text (strings). These attributes are recorded during the request phase so that they remain available even when the call errors.
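As a sketch of how the request-phase rule might look inside an instrumentor (the helper names here are illustrative, not part of the spec), the input is mirrored into the text attributes only when it consists of strings:

```python
import json
from typing import Any


def _as_batch(embed_input: Any) -> list:
    """Normalize the input into a batch (a list of texts or token ID arrays)."""
    if isinstance(embed_input, str):
        return [embed_input]  # "hello world" is one input
    if embed_input and all(isinstance(x, int) for x in embed_input):
        return [embed_input]  # [15339, 1917] is ONE token ID array, not two inputs
    return list(embed_input)  # already a batch of texts or token arrays


def request_attributes(embed_input: Any, model: str, params: dict) -> dict:
    """Build request-phase attributes for an embedding span.

    Hypothetical helper: text attributes are recorded here, before the
    call, so they survive even if the request errors; token ID arrays
    are never decoded back to text.
    """
    attrs = {
        "openinference.span.kind": "EMBEDDING",
        "embedding.model_name": model,
        "embedding.invocation_parameters": json.dumps({"model": model, **params}),
        "input.value": json.dumps({"input": embed_input, "model": model, **params}),
        "input.mime_type": "application/json",
    }
    for i, item in enumerate(_as_batch(embed_input)):
        if isinstance(item, str):  # token ID arrays get no .text attribute
            attrs[f"embedding.embeddings.{i}.embedding.text"] = item
    return attrs
```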
Token IDs (pre-tokenized integer arrays) are NOT decoded back to text.
The embedding.embeddings.N.embedding.vector attributes MUST contain float arrays regardless of the API response format: base64-encoded vectors are decoded to floats before being recorded.
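For instance, some embedding APIs (e.g., OpenAI's with encoding_format="base64") return vectors as base64-encoded packed little-endian float32. A sketch of normalizing either response shape to a float array:

```python
import base64
import struct


def to_float_vector(embedding) -> list[float]:
    """Return the embedding as a list of floats.

    Float-format responses pass through; base64 strings are decoded as
    packed little-endian float32 (4 bytes per element), the layout used
    by OpenAI's base64 encoding_format.
    """
    if isinstance(embedding, str):
        raw = base64.b64decode(embedding)
        return list(struct.unpack(f"<{len(raw) // 4}f", raw))
    return [float(x) for x in embedding]
```

A round trip makes the behavior concrete: packing `[0.5, -1.0, 2.0]` as float32, base64-encoding it, and passing the string through `to_float_vector` yields the original floats.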
The following attributes used in LLM spans are not applicable to embedding spans:

- llm.system: Not used for embedding spans
- llm.provider: Not used for embedding spans

The llm.system attribute is defined as "the AI product as identified by the client or server instrumentation." While this definition has been reserved for API providers in LLM spans (e.g., "openai", "anthropic"), it is ambiguous when applied to embedding operations.
Conceptually, llm.system describes the shape of the API, while llm.provider describes the owner of the physical hardware that runs it. For observability products like Arize and Phoenix, these conventions are consumed primarily in playground features that re-invoke LLM calls.
For embedding operations:

- The embedding.model_name attribute sufficiently identifies the embedding model being used
- The "EMBEDDING" span kind clearly identifies the operation type

Therefore, to avoid ambiguity and maintain clear semantic conventions, embedding spans use embedding.model_name rather than llm.system or llm.provider.
When OPENINFERENCE_HIDE_EMBEDDINGS_VECTORS is set to true:

- The embedding.embeddings.N.embedding.vector attribute will contain "__REDACTED__"

When OPENINFERENCE_HIDE_EMBEDDINGS_TEXT is set to true:

- The embedding.embeddings.N.embedding.text attribute will contain "__REDACTED__"

The response structure matches the input structure:
- A single input produces data[0] with one embedding
- An array of N inputs produces data[0..N-1] with N embeddings

Input formats (text and token IDs cannot be mixed in one request):

- "hello world" → a single embedding
- ["hello", "world"] → an array of embeddings
- [15339, 1917] → a single embedding (one token ID array)
- [[15339, 1917], [991, 1345]] → an array of embeddings

A span for generating embeddings from text:
```json
{
  "name": "CreateEmbeddings",
  "span_kind": "SPAN_KIND_INTERNAL",
  "attributes": {
    "openinference.span.kind": "EMBEDDING",
    "embedding.model_name": "text-embedding-3-small",
    "embedding.invocation_parameters": "{\"model\": \"text-embedding-3-small\", \"encoding_format\": \"float\"}",
    "input.value": "{\"input\": \"hello world\", \"model\": \"text-embedding-3-small\", \"encoding_format\": \"float\"}",
    "input.mime_type": "application/json",
    "output.value": "{\"data\": [{\"embedding\": [0.1, 0.2, 0.3], \"index\": 0}], \"model\": \"text-embedding-3-small\", \"usage\": {\"prompt_tokens\": 2, \"total_tokens\": 2}}",
    "output.mime_type": "application/json",
    "embedding.embeddings.0.embedding.text": "hello world",
    "embedding.embeddings.0.embedding.vector": [0.1, 0.2, 0.3],
    "llm.token_count.prompt": 2,
    "llm.token_count.total": 2
  }
}
```
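The response side of such a span can be sketched the same way (an illustrative helper, not mandated by the spec): the data array is flattened into indexed vector attributes, and the usage block into token counts:

```python
import json


def response_attributes(output_value: str) -> dict:
    """Flatten an embeddings API response (the recorded output.value JSON
    string) into indexed span attributes. Illustrative helper only."""
    response = json.loads(output_value)
    attrs = {"output.value": output_value, "output.mime_type": "application/json"}
    for item in response.get("data", []):
        i = item["index"]  # preserve the API's own index for the attribute key
        attrs[f"embedding.embeddings.{i}.embedding.vector"] = item["embedding"]
    usage = response.get("usage", {})
    if "prompt_tokens" in usage:
        attrs["llm.token_count.prompt"] = usage["prompt_tokens"]
    if "total_tokens" in usage:
        attrs["llm.token_count.total"] = usage["total_tokens"]
    return attrs
```

Feeding it the output.value string from the example above yields the same vector and token-count attributes shown in the span.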
When input consists of pre-tokenized integer arrays, text attributes are NOT recorded:
```json
{
  "name": "CreateEmbeddings",
  "span_kind": "SPAN_KIND_INTERNAL",
  "attributes": {
    "openinference.span.kind": "EMBEDDING",
    "embedding.model_name": "text-embedding-3-small",
    "embedding.invocation_parameters": "{\"model\": \"text-embedding-3-small\", \"encoding_format\": \"float\"}",
    "input.value": "{\"input\": [15339, 1917], \"model\": \"text-embedding-3-small\", \"encoding_format\": \"float\"}",
    "input.mime_type": "application/json",
    "output.value": "{\"data\": [{\"embedding\": [0.1, 0.2, 0.3], \"index\": 0}], \"model\": \"text-embedding-3-small\", \"usage\": {\"prompt_tokens\": 2, \"total_tokens\": 2}}",
    "output.mime_type": "application/json",
    "embedding.embeddings.0.embedding.vector": [0.1, 0.2, 0.3],
    "llm.token_count.prompt": 2,
    "llm.token_count.total": 2
  }
}
```
A span for generating embeddings from multiple text inputs:
```json
{
  "name": "CreateEmbeddings",
  "span_kind": "SPAN_KIND_INTERNAL",
  "attributes": {
    "openinference.span.kind": "EMBEDDING",
    "embedding.model_name": "text-embedding-ada-002",
    "embedding.invocation_parameters": "{\"model\": \"text-embedding-ada-002\"}",
    "input.value": "[\"hello\", \"world\", \"test\"]",
    "embedding.embeddings.0.embedding.text": "hello",
    "embedding.embeddings.0.embedding.vector": [0.1, 0.2, 0.3],
    "embedding.embeddings.1.embedding.text": "world",
    "embedding.embeddings.1.embedding.vector": [0.4, 0.5, 0.6],
    "embedding.embeddings.2.embedding.text": "test",
    "embedding.embeddings.2.embedding.vector": [0.7, 0.8, 0.9],
    "llm.token_count.prompt": 3,
    "llm.token_count.total": 3
  }
}
```
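Putting the privacy switches together, here is one way the redaction described earlier could be applied to a finished attribute map before export (a sketch under the assumption that the flags are read from the environment as the strings "true"/"false"):

```python
import os

REDACTED = "__REDACTED__"


def apply_redaction(attrs: dict) -> dict:
    """Replace vector/text attribute values with __REDACTED__ when the
    corresponding OPENINFERENCE_HIDE_EMBEDDINGS_* flag is enabled."""
    hide_vectors = os.environ.get("OPENINFERENCE_HIDE_EMBEDDINGS_VECTORS", "").lower() == "true"
    hide_text = os.environ.get("OPENINFERENCE_HIDE_EMBEDDINGS_TEXT", "").lower() == "true"
    redacted = {}
    for key, value in attrs.items():
        if hide_vectors and key.endswith(".embedding.vector"):
            value = REDACTED
        elif hide_text and key.endswith(".embedding.text"):
            value = REDACTED
        redacted[key] = value
    return redacted
```

With OPENINFERENCE_HIDE_EMBEDDINGS_VECTORS=true, the vector attributes in the examples above would export as "__REDACTED__" while the text attributes pass through unchanged.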