• Creates a tool invocation evaluator function.

    This function returns an evaluator that determines whether a tool was invoked correctly: with valid arguments, correct formatting, and safe content.

    The evaluator checks for:

    • Properly structured JSON (if applicable)
    • All required fields/parameters present
    • No hallucinated or nonexistent fields
    • Argument values matching user query and schema expectations
    • No unsafe content (e.g., PII) in arguments
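Two of the checks above (required fields present, no nonexistent fields) can be illustrated with a small deterministic sketch. This is not the evaluator's implementation, which relies on an LLM judgment; `checkToolArguments`, `ToolParameterSchema`, and `ArgumentCheckResult` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical sketch: a deterministic version of two of the listed checks.
// It is NOT how the LLM-based evaluator works internally.

interface ToolParameterSchema {
  type: "object";
  properties: Record<string, { type: string; description?: string }>;
  required?: string[];
}

interface ArgumentCheckResult {
  missing: string[]; // required fields absent from the call
  unexpected: string[]; // fields the schema does not declare
}

function checkToolArguments(
  schema: ToolParameterSchema,
  args: Record<string, unknown>
): ArgumentCheckResult {
  const declared = Object.keys(schema.properties);
  const supplied = Object.keys(args);
  // Required fields that the call did not supply.
  const missing = (schema.required ?? []).filter(
    (name) => !supplied.includes(name)
  );
  // Supplied fields that do not exist in the schema ("hallucinated" fields).
  const unexpected = supplied.filter((name) => !declared.includes(name));
  return { missing, unexpected };
}

const flightSchema: ToolParameterSchema = {
  type: "object",
  properties: {
    origin: { type: "string", description: "Departure city code" },
    destination: { type: "string", description: "Arrival city code" },
    date: { type: "string", description: "Flight date in YYYY-MM-DD" },
  },
  required: ["origin", "destination", "date"],
};

// A call that omits two required fields and invents a "seat" field.
const report = checkToolArguments(flightSchema, { origin: "NYC", seat: "12A" });
console.log(report);
// { missing: [ 'destination', 'date' ], unexpected: [ 'seat' ] }
```

Checks such as whether argument values match the user query, or whether arguments contain unsafe content, are not captured by a structural comparison like this one, which is where an LLM-based judgment becomes useful.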

    Type Parameters

    Parameters

    Returns ClassificationEvaluator<RecordType>

    An evaluator whose evaluate method takes a ToolInvocationEvaluationRecord and returns a classification result labeling the tool invocation as correct or incorrect.

    // Import paths below are assumptions; adjust them to match your installation.
    import { openai } from "@ai-sdk/openai";
    import { createToolInvocationEvaluator } from "@arizeai/phoenix-evals";

    const evaluator = createToolInvocationEvaluator({ model: openai("gpt-4o-mini") });

    // Example with JSON schema format for available tools
    const result = await evaluator.evaluate({
      // The user request the tool call is supposed to satisfy
      input: "User: Book a flight from NYC to LA for tomorrow",
      // The tool definition, serialized to a string
      availableTools: JSON.stringify({
        name: "book_flight",
        description: "Book a flight between two cities",
        parameters: {
          type: "object",
          properties: {
            origin: { type: "string", description: "Departure city code" },
            destination: { type: "string", description: "Arrival city code" },
            date: { type: "string", description: "Flight date in YYYY-MM-DD" }
          },
          required: ["origin", "destination", "date"]
        }
      }),
      // The tool call under evaluation
      toolSelection: 'book_flight(origin="NYC", destination="LA", date="2024-01-15")'
    });
    console.log(result.label); // "correct" or "incorrect"