Add agentic metrics from DeepEval as scorers
validmind-library
2.11.1
enhancement
Four new agentic evaluation metrics from DeepEval are now available to assess your LLM agents:
- ArgumentCorrectness: Checks whether agents generate correct arguments for their tool calls.
- PlanAdherence: Measures whether agents follow their own execution plans.
- PlanQuality: Evaluates the logical quality, completeness, and efficiency of agent-generated plans.
- ToolCorrectness: Verifies whether agents invoke the right tools with appropriate arguments.
These metrics broaden evaluation coverage to both tool use (actions) and strategic reasoning (plans). They are available as scorers under validmind.scorers.llm.deepeval and require the validmind[llm] extra to be installed.
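To make the tool-use side concrete, here is a minimal, standard-library-only sketch of the kind of comparison a tool-correctness or argument-correctness check performs. It is an illustration only: it does not call validmind or DeepEval, it is not their scoring formula, and the names ToolCall, make_call, tool_name_score, and argument_score are hypothetical helpers invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of one tool invocation (not a validmind or DeepEval type)."""
    name: str
    arguments: tuple = ()  # sorted (key, value) pairs, kept as a tuple so the record is hashable


def make_call(name, **arguments):
    """Build a ToolCall from keyword arguments (values must be hashable)."""
    return ToolCall(name=name, arguments=tuple(sorted(arguments.items())))


def tool_name_score(called, expected):
    """Fraction of expected tool names that the agent actually invoked."""
    if not expected:
        return 1.0
    called_names = {call.name for call in called}
    hits = sum(1 for call in expected if call.name in called_names)
    return hits / len(expected)


def argument_score(called, expected):
    """Fraction of expected calls reproduced exactly, including arguments."""
    if not expected:
        return 1.0
    called_set = set(called)
    hits = sum(1 for call in expected if call in called_set)
    return hits / len(expected)


# Assumed toy data: the agent picks the right tools but passes one wrong argument.
expected = [
    make_call("search", query="rates"),
    make_call("calculator", expression="2+2"),
]
called = [
    make_call("search", query="rates"),
    make_call("calculator", expression="2+3"),
]

name_score = tool_name_score(called, expected)
arg_score = argument_score(called, expected)
print(name_score, arg_score)
```

In this toy case the agent selects both expected tools but gets one argument wrong, so the two scores differ. That gap is why tool selection and argument generation are worth scoring separately. The real scorers may weigh these aspects differently.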