Agent Skill

EZVals includes a skill that teaches AI coding agents how to write and analyze evaluations.

What’s a Skill?

Skills are Markdown files that provide context and instructions to AI coding agents such as Claude Code and Cursor. When you ask your agent to help with evals, it can consult the skill to understand:

EZVals API and patterns
Best practices for eval design
Grading strategies (code vs model vs human)
Patterns for different agent types

Installation

From Package (Version-Matched)

Install the skill that matches your installed EZVals version:

pip install ezvals
ezvals skills add --claude

This installs the skill into your current project directory.

Global Installation

Install globally for all projects:

ezvals skills add --global --claude

Target Flags

ezvals skills add requires at least one explicit target flag.

--agents installs canonical source to .agents/skills/evals/
--claude, --codex, --cursor, --windsurf, --kiro, --roo install the skill into (or link) the corresponding agent directories
You can combine flags, like ezvals skills add --claude --codex
If --agents is included with other targets, .agents is canonical and selected agent targets link to it
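
The flag rules above can be read as a small decision procedure. The Python sketch below models them for illustration only. It is not EZVals source code, plan_install is a hypothetical helper, and it covers only the targets whose directories this page names (.agents, .claude, .cursor, .codex). It also assumes that, without --agents, each selected target receives its own installed copy.

```python
# Illustrative model of the target-flag rules; not EZVals source code.
# Only directories that this page names explicitly are included.
TARGET_DIRS = {
    "agents": ".agents/skills/evals/",
    "claude": ".claude/skills/evals/",
    "cursor": ".cursor/skills/evals/",
    "codex": ".codex/skills/evals/",
}


def plan_install(flags):
    """Return {directory: action} for the selected flags (hypothetical helper)."""
    selected = [flag.lstrip("-") for flag in flags]
    if not selected:
        # Mirrors the rule that at least one explicit target flag is required.
        raise ValueError("at least one explicit target flag is required")
    unknown = [name for name in selected if name not in TARGET_DIRS]
    if unknown:
        raise ValueError(f"targets not modeled in this sketch: {unknown}")

    canonical = TARGET_DIRS["agents"]
    plan = {}
    for name in selected:
        if name == "agents":
            plan[canonical] = "install (canonical source)"
        elif "agents" in selected:
            # With --agents present, other targets link to the canonical copy.
            plan[TARGET_DIRS[name]] = f"link to {canonical}"
        else:
            # Assumption: without --agents, the target gets its own copy.
            plan[TARGET_DIRS[name]] = "install"
    return plan


if __name__ == "__main__":
    plan = plan_install(["--agents", "--claude", "--codex"])
    for directory, action in plan.items():
        print(f"{directory:<26} {action}")
```

Calling plan_install(["--claude"]) on its own yields a single direct install, while adding "--agents" turns every other selected target into a link.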

From Marketplace (Latest)

Install the latest version from the skills marketplace:

npx skills add camronh/evals-skill

Usage

After installation, invoke the skill in your AI coding agent:

/evals

Or just ask about evals - agents will automatically detect and use the skill based on context.

What’s Included

The skill provides comprehensive guidance across multiple reference files:

File	Purpose
SKILL.md	Overview and navigation hub
EZVALS_REFERENCE.md	Complete API reference for @eval, EvalContext, @parametrize
BEST_PRACTICES.md	Eval design principles from Anthropic’s research
GRADERS.md	Choosing between code, model, and human graders
AGENT_EVALS.md	Patterns for coding, conversational, research agents
ROADMAP.md	Zero-to-one guide for building evals

Example Prompts

Try these prompts with your AI coding agent:

Getting Started

“Help me write my first eval for my customer support agent”
“What’s the best way to evaluate my RAG pipeline?”
“Set up an eval suite for my coding assistant”

Writing Evals

“Create an eval that tests my agent’s ability to handle refund requests”
“Write a parametrized eval for testing sentiment analysis across edge cases”
“How do I test multi-turn conversations?”

Improving Evals

“My eval is flaky - help me make it more deterministic”
“Should I use code-based or model-based grading for this eval?”
“Review my evals and suggest improvements”

Analysis

“Analyze my eval results and suggest improvements”
“Help me understand why my agent is failing this eval”
“What patterns do you see in these failures?”

Managing the Skill

Check Installation

ezvals skills doctor

Shows installation status, version, and symlink health.
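
To make "symlink health" concrete, here is a minimal Python sketch of one plausible check: classify a skill directory as a real directory, a working symlink, a broken symlink, or missing. This is an assumption about what such a check could look like, not a description of what ezvals skills doctor actually does. link_status is a hypothetical helper, and the demo runs in a temporary directory rather than a real project.

```python
import tempfile
from pathlib import Path


def link_status(path: Path) -> str:
    """Classify a skill directory (hypothetical helper, not part of EZVals)."""
    if path.is_symlink():
        # exists() follows the link, so it is False when the target is gone.
        return "symlink ok" if path.exists() else "broken symlink"
    if path.is_dir():
        return "directory"
    return "missing"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)

        # A real canonical directory.
        canonical = root / ".agents" / "skills" / "evals"
        canonical.mkdir(parents=True)

        # A healthy link pointing at the canonical directory.
        claude = root / ".claude" / "skills" / "evals"
        claude.parent.mkdir(parents=True)
        claude.symlink_to(canonical, target_is_directory=True)

        # A broken link whose target does not exist.
        codex = root / ".codex" / "skills" / "evals"
        codex.parent.mkdir(parents=True)
        codex.symlink_to(root / "does-not-exist", target_is_directory=True)

        # A target that was never installed.
        cursor = root / ".cursor" / "skills" / "evals"

        for path in (canonical, claude, codex, cursor):
            print(f"{str(path.relative_to(root)):<24} {link_status(path)}")
```

Creating symlinks may require elevated privileges on some Windows configurations, so the demo is easiest to run on macOS or Linux.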

Remove Skill

ezvals skills remove

Reinstall

ezvals skills add --claude

Overwrites existing installation with fresh files.

How It Works

The skill is installed only to the selected target directories:

.claude/skills/evals/     # Selected with --claude
.cursor/skills/evals/     # Selected with --cursor
.codex/skills/evals/      # Selected with --codex
...

If --agents is selected, .agents/skills/evals/ is the canonical source and other selected targets symlink to it.
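
The practical effect of a canonical source with symlinked targets is that every agent directory reads the same files. The Python sketch below reproduces that layout in a temporary directory. It is an illustration under stated assumptions, not the EZVals installer: install_linked is a hypothetical helper, the file content is a placeholder, and only directory names shown on this page are used.

```python
import tempfile
from pathlib import Path


def install_linked(root: Path, targets: list[str]) -> Path:
    """Create a canonical skill dir and link each target to it (hypothetical)."""
    canonical = root / ".agents" / "skills" / "evals"
    canonical.mkdir(parents=True, exist_ok=True)
    # Placeholder content; the real skill ships several reference files.
    (canonical / "SKILL.md").write_text("placeholder skill content\n")

    for target in targets:
        link = root / f".{target}" / "skills" / "evals"
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink():
            link.unlink()  # replace a stale link from an earlier run
        link.symlink_to(canonical, target_is_directory=True)
    return canonical


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        canonical = install_linked(root, ["claude", "cursor"])

        # Updating the canonical copy is visible through every link.
        (canonical / "SKILL.md").write_text("updated skill content\n")
        for target in ("claude", "cursor"):
            skill_file = root / f".{target}" / "skills" / "evals" / "SKILL.md"
            print(target, "->", skill_file.read_text().strip())
```

Because the targets are links rather than copies, refreshing the canonical directory is enough to update every linked agent in this model.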

Keeping Updated

When you upgrade EZVals, re-run ezvals skills add with your chosen target flags (for example, --claude) to get the latest skill content:

pip install --upgrade ezvals
ezvals skills add --claude
