Install
Quick Example
Evaluating a simple sentiment analyzer against a ground truth dataset:
Use @parametrize to apply an eval to a dataset:
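The pattern can be sketched in plain Python. The block below is not the EZVals API: `analyze_sentiment`, `GROUND_TRUTH`, `for_each_case`, and `eval_sentiment` are hypothetical names made up for illustration, the keyword-based analyzer stands in for a real model, and `for_each_case` is only a rough stand-in for a decorator that applies one eval function to every case in a dataset.

```python
# Illustrative sketch only: plain Python, NOT the EZVals API.
# Every name in this block is hypothetical.
from functools import wraps

POSITIVE_WORDS = {"love", "great", "excellent", "good"}
NEGATIVE_WORDS = {"hate", "terrible", "awful", "bad"}


def analyze_sentiment(text: str) -> str:
    """Toy keyword-based analyzer standing in for a real model."""
    words = set(text.lower().replace("!", "").replace(".", "").split())
    if words & POSITIVE_WORDS:
        return "positive"
    if words & NEGATIVE_WORDS:
        return "negative"
    return "neutral"


# Ground truth dataset: (input text, expected label) pairs.
GROUND_TRUTH = [
    ("I love this product!", "positive"),
    ("This is terrible.", "negative"),
    ("The package arrived on Tuesday.", "neutral"),
    ("Not bad at all.", "positive"),  # negation trips up the toy analyzer
]


def for_each_case(dataset):
    """Hypothetical decorator: turn a per-case eval into a whole-dataset eval."""
    def decorator(eval_fn):
        @wraps(eval_fn)
        def run_all():
            return [eval_fn(text, expected) for text, expected in dataset]
        return run_all
    return decorator


@for_each_case(GROUND_TRUTH)
def eval_sentiment(text, expected):
    """Score a single case: compare the analyzer's output to the expected label."""
    output = analyze_sentiment(text)
    return {
        "input": text,
        "expected": expected,
        "output": output,
        "passed": output == expected,
    }


results = eval_sentiment()
accuracy = sum(r["passed"] for r in results) / len(results)
print(f"accuracy: {accuracy:.2f}")
for r in results:
    if not r["passed"]:
        print("failed:", r["input"], "->", r["output"], "(expected", r["expected"] + ")")
```

Keeping the dataset next to the eval function, as in this sketch, means both can be versioned together in the same repository.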
Web UI
EZVals spins up a local Web UI that makes it easy to filter, run, and rerun evals. Analyze the results in depth there, or use the results .json file for further analysis.
Agent Mode
The CLI and SDK make it easy for your coding agent to run, analyze, and iterate on the evals!
Can you run test_sentiment_batch evals and tell me why the scores are so low?
Your coding agent would run:
Existing eval frameworks are frustrating:
Too Opinionated
One function per dataset, rigid patterns. No way to run different logic per
test case.
Cloud-Based
Datasets in the cloud. No version control. Code and data live in different
places.
UI-Based
Your coding agent can’t run evals, analyze results, or iterate on datasets.
Ready to start?
Follow our quickstart guide to set up EZVals in under 5 minutes.


