• A factory function for creating a custom evaluator from any function.

    This function wraps a user-provided function into an evaluator that can be used with Phoenix experiments and evaluations. The function can be synchronous or asynchronous, and can return a number, an EvaluationResult object, or a value that will be automatically converted to an evaluation result.

    The evaluator will automatically:

    • Convert the function's return value to an EvaluationResult
    • Handle both sync and async functions
    • Wrap the function with OpenTelemetry spans if telemetry is enabled
    • Infer the evaluator name from the function name if not provided
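
    The automatic behaviors listed above can be pictured with a small, self-contained sketch. It is not the library's implementation: `sketchCreateEvaluator`, `toEvaluationResult`, the fallback name `"unnamed_evaluator"`, and the simplified `EvaluationResult` shape are all hypothetical stand-ins, and the OpenTelemetry wrapping is left out. Only the number-to-`{ score }` conversion is taken from the documentation; the other details are assumptions.

```typescript
// Hypothetical stand-ins for illustration; NOT the Phoenix library's own code.
type EvaluationResult = { score?: number; label?: string; explanation?: string };

// Normalize a raw return value. Only two cases are covered here:
// a number becomes { score }, and an object is passed through unchanged.
function toEvaluationResult(value: unknown): EvaluationResult {
  if (typeof value === "number") return { score: value };
  if (typeof value === "object" && value !== null) return value as EvaluationResult;
  throw new TypeError(`Unsupported return value: ${String(value)}`);
}

function sketchCreateEvaluator<R extends Record<string, unknown>>(
  fn: (record: R) => unknown,
  options: { name?: string } = {}
) {
  return {
    // Use the explicit name if given, otherwise fall back to the function's own name.
    name: options.name ?? (fn.name || "unnamed_evaluator"),
    // `await` accepts plain values and promises alike, so sync and async
    // functions go through the same code path.
    async evaluate(record: R): Promise<EvaluationResult> {
      return toEvaluationResult(await fn(record));
    },
  };
}

// A synchronous, named function: the evaluator name is inferred as "exactMatch".
function exactMatch({ output, expected }: { output: string; expected: string }): number {
  return output === expected ? 1 : 0;
}
const syncEvaluator = sketchCreateEvaluator(exactMatch);

// An asynchronous function returning a full result object, with an explicit name.
const asyncEvaluator = sketchCreateEvaluator(
  async ({ output }: { output: string }) => ({
    score: output.length > 0 ? 1 : 0,
    label: output.length > 0 ? "non_empty" : "empty",
  }),
  { name: "non_empty" }
);
```

    In this sketch the caller always awaits `evaluate`, regardless of whether the wrapped function was synchronous, which matches how the usage examples in this section call the real evaluator.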

    Type Parameters

    • RecordType extends Record<string, unknown> = Record<string, unknown>

      The type of the input record that the evaluator expects. Must extend Record<string, unknown>.

    • Fn extends AnyFn = AnyFn

      The type of the function being wrapped. Must be a function that accepts the record type and returns a value compatible with EvaluationResult.
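
    To illustrate what the `RecordType` constraint buys at compile time, here is a hedged sketch using a local stand-in rather than the real `createEvaluator`. The names `typedEvaluator` and `ExampleRecord`, and the simplified `EvaluationResult` shape, are hypothetical; the stand-in only accepts number-returning functions to keep the example short.

```typescript
// Hypothetical stand-ins for illustration; not the library's actual declarations.
type EvaluationResult = { score?: number; label?: string; explanation?: string };

// The record shape this particular evaluator expects.
type ExampleRecord = { output: string; expected: string };

// Same constraint and default as the RecordType parameter described above.
function typedEvaluator<
  RecordType extends Record<string, unknown> = Record<string, unknown>
>(
  fn: (record: RecordType) => number
): { evaluate(record: RecordType): Promise<EvaluationResult> } {
  return {
    evaluate: async (record) => ({ score: fn(record) }),
  };
}

// Supplying the type argument makes `output` and `expected` typed as strings
// inside the callback and at every call to `evaluate`.
const lengthMatch = typedEvaluator<ExampleRecord>(({ output, expected }) =>
  output.length === expected.length ? 1 : 0
);

// A call such as lengthMatch.evaluate({ output: 42 }) would be rejected by the
// compiler, because the argument does not match ExampleRecord.
```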

    Parameters

    • fn: Fn

      The function to wrap as an evaluator. Can be synchronous or asynchronous. The function should accept a record of type RecordType and return either:

      • A number (will be converted to { score: number })
      • An EvaluationResult object
      • Any value that can be converted to an evaluation result
    • Optional options: CreateEvaluatorOptions

      Optional configuration for the evaluator. See CreateEvaluatorOptions for details on available options.

    Returns EvaluatorBase<RecordType>

    An evaluator (typed as EvaluatorBase<RecordType>) that can be used with Phoenix experiments and evaluation workflows.
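
    As a rough picture of how a returned evaluator might be consumed outside of a Phoenix experiment, the sketch below runs one evaluator over several records and averages the scores. It assumes only that an evaluator exposes a `name` and an async `evaluate(record)` method, as the usage examples in this section suggest. `EvaluatorLike`, `QARecord`, `meanScore`, and the hand-written `exactMatch` object are hypothetical names introduced for the example, not part of the library.

```typescript
// Hypothetical minimal shapes; the real EvaluatorBase may expose more members.
type EvaluationResult = { score?: number; label?: string; explanation?: string };
interface EvaluatorLike<R> {
  name: string;
  evaluate(record: R): Promise<EvaluationResult>;
}

type QARecord = { output: string; expected: string };

// Hand-written stand-in for an evaluator that createEvaluator would return.
const exactMatch: EvaluatorLike<QARecord> = {
  name: "exact_match",
  evaluate: async ({ output, expected }) => ({ score: output === expected ? 1 : 0 }),
};

// Evaluate every record concurrently and average the numeric scores.
// Results without a numeric score are skipped; an empty set yields NaN.
async function meanScore<R>(evaluator: EvaluatorLike<R>, records: R[]): Promise<number> {
  const results = await Promise.all(records.map((record) => evaluator.evaluate(record)));
  const scores = results
    .map((result) => result.score)
    .filter((score): score is number => typeof score === "number");
  if (scores.length === 0) return NaN;
  return scores.reduce((sum, score) => sum + score, 0) / scores.length;
}
```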

    Basic usage with a simple scoring function:

    const accuracyEvaluator = createEvaluator(
      ({ output, expected }) => {
        return output === expected ? 1 : 0;
      },
      {
        name: "accuracy",
        kind: "CODE",
        optimizationDirection: "MAXIMIZE"
      }
    );

    const result = await accuracyEvaluator.evaluate({
      output: "correct answer",
      expected: "correct answer"
    });
    // result: { score: 1 }

    Returning a full EvaluationResult:

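    const qualityEvaluator = createEvaluator(
      ({ output }) => {
        // calculateQuality is a user-supplied helper (not defined here)
        // that returns a numeric score for the output.
        const score = calculateQuality(output);
        return {
          score,
          label: score > 0.8 ? "high" : "low",
          explanation: `Quality score: ${score}`
        };
      },
      { name: "quality" }
    );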