Multimodal Attributes

This document describes how message content arrays represent multimodal content (text, images, audio) in OpenInference spans. The same message.contents structure is also used for reasoning and provider-native tool-use parts when item ordering must be preserved.

Message Content Arrays

When a message contains multiple content items (e.g., text and images), each item is recorded under message.contents as a group of flattened span attributes, indexed by its position in the message.

Attribute Pattern

llm.input_messages.<messageIndex>.message.contents.<contentIndex>.message_content.<attribute>

Where:

<messageIndex> is the zero-based index of the message
<contentIndex> is the zero-based index of the content item within the message
<attribute> is the specific content attribute
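As a minimal sketch of the pattern above, the helper below assembles one flattened attribute name from its three parts. The function name content_attribute_key is hypothetical and is not part of any OpenInference library.

```python
def content_attribute_key(message_index: int, content_index: int, attribute: str) -> str:
    """Build the flattened attribute name for one content item of an input message.

    Hypothetical helper for illustration; not an OpenInference API.
    """
    if message_index < 0 or content_index < 0:
        raise ValueError("indices are zero-based and must not be negative")
    return (
        f"llm.input_messages.{message_index}"
        f".message.contents.{content_index}"
        f".message_content.{attribute}"
    )


# Second content item (index 1) of the first message (index 0).
print(content_attribute_key(0, 1, "type"))
```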

Content Type Attributes

Each content item has a type attribute that identifies its kind:

"text" - Text content
"image" - Image content (URL or base64)
"audio" - Audio content (URL or base64)
"reasoning" - Reasoning or thinking content, including visible summaries and Anthropic redacted_thinking
"tool_use" - Provider-native tool-use part when a tool call must remain ordered relative to adjacent content items

Reasoning-specific fields such as message_content.id, message_content.signature, message_content.data, and message_content.encrypted_content are defined in LLM Spans.

Text Content

llm.input_messages.0.message.contents.0.message_content.type = "text"
llm.input_messages.0.message.contents.0.message_content.text = "What is in this image?"

Image Content

llm.input_messages.0.message.contents.1.message_content.type = "image"
llm.input_messages.0.message.contents.1.message_content.image.image.url = "https://example.com/image.jpg"

For base64-encoded images:

llm.input_messages.0.message.contents.1.message_content.type = "image"
llm.input_messages.0.message.contents.1.message_content.image.image.url = "data:image/png;base64,iVBORw0KGgo..."

Audio Content

llm.input_messages.0.message.contents.2.message_content.type = "audio"
llm.input_messages.0.message.contents.2.message_content.audio.audio.url = "https://example.com/audio.mp3"
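The text, image, and audio attributes shown above could be produced from an ordered list of content items roughly as follows. This is a sketch under assumptions: the input shape (dicts with "type" plus "text" or "url") and the function name flatten_contents are hypothetical, and the reasoning and tool_use types are left out because their fields are defined in LLM Spans.

```python
PREFIX = "llm.input_messages"


def flatten_contents(message_index: int, items: list) -> dict:
    """Flatten ordered content items into span attributes.

    Each item is a dict such as {"type": "text", "text": "..."} or
    {"type": "image", "url": "..."}; this input shape is an assumption.
    """
    attrs = {}
    for content_index, item in enumerate(items):
        base = f"{PREFIX}.{message_index}.message.contents.{content_index}.message_content"
        kind = item["type"]
        attrs[f"{base}.type"] = kind
        if kind == "text":
            attrs[f"{base}.text"] = item["text"]
        elif kind == "image":
            # The URL may be an https URL or a base64 data URL.
            attrs[f"{base}.image.image.url"] = item["url"]
        elif kind == "audio":
            attrs[f"{base}.audio.audio.url"] = item["url"]
        else:
            raise ValueError(f"content type not covered by this sketch: {kind!r}")
    return attrs


attributes = flatten_contents(0, [
    {"type": "text", "text": "What is in this image?"},
    {"type": "image", "url": "https://example.com/image.jpg"},
    {"type": "audio", "url": "https://example.com/audio.mp3"},
])
for key, value in attributes.items():
    print(f"{key} = {value!r}")
```

Because the content index comes from the item's position in the list, the original ordering of the items survives flattening.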

Privacy Considerations

Hiding Images

When OPENINFERENCE_HIDE_INPUT_IMAGES is set to true:

Image URLs in input messages will be replaced with "__REDACTED__"
This only applies when input messages are not already completely hidden
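The effect of this setting can be illustrated with a small post-processing sketch over a dictionary of flattened attributes. The function name hide_input_images, the case-insensitive parsing of "true", and the idea of redacting after flattening are all assumptions made for illustration; an actual instrumentation may apply the setting at a different point.

```python
import os

REDACTED = "__REDACTED__"
IMAGE_URL_SUFFIX = ".message_content.image.image.url"


def hide_input_images(attrs: dict, environ=os.environ) -> dict:
    """Return a copy of attrs with input image URLs redacted when the flag is set.

    Hypothetical helper; reading the flag this way is an assumption.
    """
    flag = environ.get("OPENINFERENCE_HIDE_INPUT_IMAGES", "")
    if flag.strip().lower() != "true":
        return dict(attrs)
    return {
        key: REDACTED
        if key.startswith("llm.input_messages.") and key.endswith(IMAGE_URL_SUFFIX)
        else value
        for key, value in attrs.items()
    }


example = {
    "llm.input_messages.0.message.contents.0.message_content.type": "image",
    "llm.input_messages.0.message.contents.0.message_content.image.image.url": "https://example.com/image.jpg",
}
print(hide_input_images(example, environ={"OPENINFERENCE_HIDE_INPUT_IMAGES": "true"}))
```

If input messages are already hidden entirely, there are no image attributes left to redact, so the function has nothing to change.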

Base64 Image Truncation

OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH (default: 32000) limits the length of base64-encoded images:

Base64-encoded images longer than this limit will be truncated
The truncation preserves the data URL prefix (e.g., data:image/png;base64,)
Only the base64 content portion is subject to the length limit
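The truncation rules above can be sketched as follows. The function name truncate_base64_image is hypothetical, and cutting the payload at exactly max_length characters is an assumption about how the limit is applied; the exact behavior of a given instrumentation may differ.

```python
def truncate_base64_image(url: str, max_length: int = 32000) -> str:
    """Truncate the base64 payload of a data URL, keeping the data URL prefix.

    Hypothetical helper; ordinary (non-data) URLs are returned unchanged.
    """
    marker = ";base64,"
    if not url.startswith("data:") or marker not in url:
        return url
    split_at = url.index(marker) + len(marker)
    prefix, payload = url[:split_at], url[split_at:]
    # Only the payload counts against the limit, never the prefix.
    return prefix + payload[:max_length]


data_url = "data:image/png;base64," + "A" * 10
print(truncate_base64_image(data_url, max_length=4))
```

A truncated payload generally no longer decodes to a valid image, so the result is useful for inspection rather than for rendering.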

Hiding Text Content

When OPENINFERENCE_HIDE_INPUT_TEXT is set to true:

Text content in multimodal messages will be replaced with "__REDACTED__"
This only applies when input messages are not already completely hidden

Example: Multimodal Message

A user message with both text and image content:

{
  "llm.input_messages.0.message.role": "user",
  "llm.input_messages.0.message.contents.0.message_content.type": "text",
  "llm.input_messages.0.message.contents.0.message_content.text": "What objects do you see in this image?",
  "llm.input_messages.0.message.contents.1.message_content.type": "image",
  "llm.input_messages.0.message.contents.1.message_content.image.image.url": "https://example.com/photo.jpg"
}
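Going the other way, a consumer of span attributes may want to regroup the flattened keys into ordered content items. The sketch below does this for the example above; the function name collect_contents and the regular expression are illustrative assumptions, not part of the specification.

```python
import re

CONTENT_KEY = re.compile(
    r"^llm\.input_messages\.(\d+)\.message\.contents\.(\d+)\.message_content\.(.+)$"
)


def collect_contents(attrs: dict, message_index: int = 0) -> list:
    """Group flattened content attributes of one message by content index."""
    items = {}
    for key, value in attrs.items():
        match = CONTENT_KEY.match(key)
        if match is None or int(match.group(1)) != message_index:
            continue  # not a content attribute of the requested message
        items.setdefault(int(match.group(2)), {})[match.group(3)] = value
    # Sort numerically so that index 10 follows index 9, not index 1.
    return [items[i] for i in sorted(items)]


span_attributes = {
    "llm.input_messages.0.message.role": "user",
    "llm.input_messages.0.message.contents.0.message_content.type": "text",
    "llm.input_messages.0.message.contents.0.message_content.text": "What objects do you see in this image?",
    "llm.input_messages.0.message.contents.1.message_content.type": "image",
    "llm.input_messages.0.message.contents.1.message_content.image.image.url": "https://example.com/photo.jpg",
}
for item in collect_contents(span_attributes):
    print(item)
```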

Fallback for Simple Messages

When a message contains only text content (no multimodal content), it can use the simpler format:

{
  "llm.input_messages.0.message.role": "user",
  "llm.input_messages.0.message.content": "Hello, how are you?"
}

Use message.content for simple text-only messages, and message.contents when a message carries multimodal content or other content items whose order must be preserved.
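The choice between the two forms can be sketched as a single function that emits message.content for a plain string and message.contents for a list of content items. The name message_attributes and the input shape (each item is a dict of message_content sub-attributes) are assumptions for illustration.

```python
from typing import Union


def message_attributes(index: int, role: str, content: Union[str, list]) -> dict:
    """Emit span attributes for one input message.

    A plain string uses the simple message.content form; a list of items,
    each a dict of message_content sub-attributes, uses message.contents.
    Hypothetical helper, not an OpenInference API.
    """
    base = f"llm.input_messages.{index}.message"
    attrs = {f"{base}.role": role}
    if isinstance(content, str):
        attrs[f"{base}.content"] = content
        return attrs
    for content_index, item in enumerate(content):
        for name, value in item.items():
            attrs[f"{base}.contents.{content_index}.message_content.{name}"] = value
    return attrs


print(message_attributes(0, "user", "Hello, how are you?"))
print(message_attributes(1, "user", [
    {"type": "text", "text": "What is in this image?"},
    {"type": "image", "image.image.url": "https://example.com/image.jpg"},
]))
```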
