ResponseRelevancy

Assesses how pertinent the generated answer is to the given prompt.

The Response Relevancy metric assesses how pertinent the generated answer is to the given prompt. Lower scores are assigned to answers that are incomplete or contain redundant information, while higher scores indicate better relevancy. The metric is computed from the user_input, the retrieved_contexts and the response.

Response Relevancy is defined as the mean cosine similarity between the embedding of the original user_input and the embeddings of a number of artificial questions that are generated (reverse-engineered) from the response:

\[ \text{answer relevancy} = \frac{1}{N} \sum_{i=1}^{N} \cos(E_{g_i}, E_o) \]
\[ \text{answer relevancy} = \frac{1}{N} \sum_{i=1}^{N} \frac{E_{g_i} \cdot E_o}{\|E_{g_i}\| \|E_o\|} \]

Where:

  • \(E_{g_i}\) is the embedding of the generated question \(i\).
  • \(E_o\) is the embedding of the original question.
  • \(N\) is the number of generated questions (3 by default).
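To make the formula concrete, here is an illustrative sketch in NumPy. This is not the Ragas implementation; the embedding vectors are made-up stand-ins for real model embeddings:

import numpy as np

def cosine_similarity(a, b):
    # cos(E_g_i, E_o) = (E_g_i · E_o) / (||E_g_i|| ||E_o||)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical embeddings: E_o for the original user_input, plus one
# embedding per artificial question generated from the response (N = 3).
E_o = np.array([0.1, 0.7, 0.2])
E_g = [
    np.array([0.1, 0.6, 0.3]),
    np.array([0.2, 0.7, 0.1]),
    np.array([0.0, 0.8, 0.2]),
]

# The answer relevancy score is the mean cosine similarity.
answer_relevancy = np.mean([cosine_similarity(e, E_o) for e in E_g])
print(answer_relevancy)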

Note: This is a reference-free metric, meaning that it does not require a ground_truth answer to compare against. A similar metric that does evaluate the correctness of a generated answer with respect to a ground_truth answer is validmind.model_validation.ragas.AnswerCorrectness.

Configuring Columns

This metric requires the following columns in your dataset:

  • user_input (str): The text query that was input into the model.
  • retrieved_contexts (List[str]): Any contextual information retrieved by the model before generating an answer.
  • response (str): The response generated by the model.
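If your dataset already provides these columns under these names, the test can be run without extra configuration. A minimal sketch, assuming vm_dataset is a hypothetical dataset object previously initialized with vm.init_dataset():

import validmind as vm

# Run the test on a dataset that already has user_input,
# retrieved_contexts and response columns.
result = vm.tests.run_test(
    "validmind.model_validation.ragas.ResponseRelevancy",
    inputs={"dataset": vm_dataset},
)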

If the above data is not in the appropriate column, you can specify different column names for these fields using the parameters user_input_column, response_column, and retrieved_contexts_column.

For example, if your dataset has this data stored in different columns, you can pass the following parameters:

params = {
    "user_input_column": "input_text",
    "response_column": "output_text",
    "retrieved_contexts_column": "context_info",
}
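The params dict is then passed through the params argument of the same run_test call shown above (again assuming the hypothetical vm_dataset):

result = vm.tests.run_test(
    "validmind.model_validation.ragas.ResponseRelevancy",
    inputs={"dataset": vm_dataset},
    params=params,
)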

If the response and contexts are stored as a dictionary in another column, specify the column and key like this:

pred_col = dataset.prediction_column(model)
params = {
    "response_column": f"{pred_col}.generated_answer",
    "retrieved_contexts_column": f"{pred_col}.contexts",
}

For more complex data structures, you can use a function to extract the relevant fields:

pred_col = dataset.prediction_column(model)
params = {
    "response_column": lambda row: "\n\n".join(row[pred_col]["messages"]),
    "retrieved_contexts_column": lambda row: [row[pred_col]["context_message"]],
}