Agent Evaluation Framework Builder
Designs an evaluation suite for an LLM agent or pipeline, covering success metrics, trajectory scoring, LLM-as-judge setup, and regression test cases.
Tags: llm-eval, testing, llm-as-judge, agent-testing, quality
Category: coding
Author: simplyutils