• Resume incomplete evaluations for an experiment.

    This function identifies which evaluations are incomplete (either missing or failed) and runs the evaluators only for the affected runs. This is useful for:

    • Recovering from transient evaluator failures
    • Adding new evaluators to completed experiments
    • Completing partially evaluated experiments

    The function processes incomplete evaluations in batches using pagination to minimize memory usage.
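
    A minimal sketch of that batching pattern follows. The Run, Page, and Evaluator types, the fetchIncompleteEvaluationsPage helper, and the page size of 50 are all illustrative assumptions about the internals, not part of the public API:

    // Hypothetical types and helper, for illustration only.
    type Run = { example: unknown; missingEvaluations: string[] };
    type Page = { runs: Run[]; nextCursor?: string };
    type Evaluator = {
      name: string;
      evaluate: (example: unknown) => Promise<unknown>;
    };

    declare function fetchIncompleteEvaluationsPage(args: {
      experimentId: string;
      cursor?: string;
      pageSize: number;
    }): Promise<Page>;

    async function resumeInBatches(
      experimentId: string,
      evaluators: Evaluator[]
    ): Promise<void> {
      let cursor: string | undefined;
      do {
        // Fetch one page of runs with missing or failed evaluations,
        // so only a single page is held in memory at a time.
        const page = await fetchIncompleteEvaluationsPage({
          experimentId,
          cursor,
          pageSize: 50,
        });
        for (const run of page.runs) {
          for (const evaluator of evaluators) {
            // Run only evaluators whose name matches a missing evaluation.
            if (run.missingEvaluations.includes(evaluator.name)) {
              await evaluator.evaluate(run.example);
            }
          }
        }
        cursor = page.nextCursor; // undefined once the last page is reached
      } while (cursor !== undefined);
    }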

    Evaluation names are matched to evaluator names. For example, if you pass an evaluator named "accuracy", the function finds every run that is missing an "accuracy" evaluation and runs that evaluator against it.
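
    For instance, a call like the following (the experiment ID and evaluator body are placeholders, mirroring the full example further below) re-evaluates only the runs with no completed "accuracy" evaluation:

    import { resumeEvaluation } from "@arizeai/phoenix-client/experiments";

    // Runs that already have an "accuracy" evaluation are skipped.
    await resumeEvaluation({
      experimentId: "exp_123", // placeholder
      evaluators: [{
        name: "accuracy", // matched against existing evaluation names
        kind: "CODE",
        evaluate: async ({ output, expected }) => ({
          score: output === expected ? 1 : 0,
        }),
      }],
    });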

    Note: Multi-output evaluators (evaluators that return an array of results) are not supported for resume operations. Each evaluator should produce a single evaluation result with a name matching the evaluator's name.
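
    A sketch of the contrast, with illustrative evaluator shapes:

    // Supported: one result per run, named by the evaluator itself.
    const accuracy = {
      name: "accuracy",
      kind: "CODE",
      evaluate: async ({ output, expected }) => ({
        score: output === expected ? 1 : 0,
      }),
    };

    // NOT supported for resume: a single evaluator returning an
    // array of named results.
    const multiMetric = {
      name: "multi_metric",
      kind: "CODE",
      evaluate: async ({ output, expected }) => [
        { name: "exact_match", score: output === expected ? 1 : 0 },
        { name: "nonempty", score: output ? 1 : 0 },
      ],
    };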

    Parameters

    • experimentId: the ID of the experiment whose incomplete evaluations should be resumed
    • evaluators: the evaluators to run for runs missing a matching evaluation
    • stopOnFirstError (optional): when true, abort on the first evaluator failure instead of continuing past it

    Returns Promise<void>

    Throws different error types depending on how the operation fails:

    • "EvaluationFetchError": incomplete evaluations could not be fetched from the server. Always thrown regardless of stopOnFirstError, since it indicates a critical infrastructure failure.
    • "EvaluationAbortedError": thrown when stopOnFirstError is true and an evaluator fails. The original error is preserved in the cause property.
    • Generic Error: any other evaluator execution error or unexpected failure.

    Example

    import { resumeEvaluation } from "@arizeai/phoenix-client/experiments";

    // Standard usage: evaluation name matches evaluator name
    try {
      await resumeEvaluation({
        experimentId: "exp_123",
        evaluators: [{
          name: "correctness",
          kind: "CODE",
          evaluate: async ({ output, expected }) => ({
            score: output === expected ? 1 : 0,
          }),
        }],
      });
    } catch (error) {
      // Handle by error name (no instanceof needed)
      if (error.name === "EvaluationFetchError") {
        console.error("Failed to connect to server:", error.cause);
      } else if (error.name === "EvaluationAbortedError") {
        console.error("Evaluation stopped due to error:", error.cause);
      } else {
        console.error("Unexpected error:", error);
      }
    }

    // Stop on first error (useful for debugging)
    await resumeEvaluation({
      experimentId: "exp_123",
      evaluators: [myEvaluator],
      stopOnFirstError: true, // Exit immediately on first failure
    });