
Agent Skill

EZVals includes a skill that teaches AI coding agents how to write and analyze evaluations.

What’s a Skill?

Skills are markdown files that provide context and instructions to AI coding agents such as Claude Code and Cursor. When you ask your agent to help with evals, it can reference the skill to understand:
  • EZVals API and patterns
  • Best practices for eval design
  • Grading strategies (code vs model vs human)
  • Patterns for different agent types

Installation

From Package (Version-Matched)

Install the skill that matches your installed EZVals version:
pip install ezvals
ezvals skills add --claude
This installs the skill into your current project directory.

Global Installation

Install globally for all projects:
ezvals skills add --global --claude

Target Flags

ezvals skills add requires at least one explicit target flag.
  • --agents installs canonical source to .agents/skills/evals/
  • --claude, --codex, --cursor, --windsurf, --kiro, --roo each install or link the skill in that agent’s directory
  • You can combine flags, like ezvals skills add --claude --codex
  • If --agents is combined with other targets, .agents/skills/evals/ is the canonical copy and the other selected targets link to it
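If you script installation (for example in a project bootstrap step), the flag rules above can be captured in a small helper. The following is a minimal sketch, not part of EZVals: build_skills_add_command is a hypothetical name, and it only assembles the argument list from the flags listed on this page. It assumes --global can be combined with any of the target flags, which this page shows only for --claude.

```python
from typing import Iterable, List

# Target flags named on this page; no other flags are assumed to exist.
KNOWN_TARGETS = ("agents", "claude", "codex", "cursor", "windsurf", "kiro", "roo")


def build_skills_add_command(targets: Iterable[str], global_install: bool = False) -> List[str]:
    """Assemble the argv for `ezvals skills add` (hypothetical helper)."""
    selected = list(dict.fromkeys(targets))  # drop duplicates, keep order
    if not selected:
        # Mirrors the documented rule: at least one explicit target flag.
        raise ValueError("ezvals skills add requires at least one target flag")
    unknown = [t for t in selected if t not in KNOWN_TARGETS]
    if unknown:
        raise ValueError("unknown target(s): " + ", ".join(unknown))
    command = ["ezvals", "skills", "add"]
    if global_install:
        # Assumption: --global combines with any target flag.
        command.append("--global")
    command.extend("--" + t for t in selected)
    return command


if __name__ == "__main__":
    # Prints the command line; pass the list to subprocess.run to execute it.
    print(" ".join(build_skills_add_command(["claude", "codex"])))
```

Returning a list rather than a string keeps the result safe to hand to subprocess.run without shell quoting.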

From Marketplace (Latest)

Install the latest version from the skills marketplace:
npx skills add camronh/evals-skill

Usage

After installation, invoke the skill in your AI coding agent:
/evals
Or just ask about evals - agents will automatically detect and use the skill based on context.

What’s Included

The skill provides comprehensive guidance across multiple reference files:
File | Purpose
SKILL.md | Overview and navigation hub
EZVALS_REFERENCE.md | Complete API reference for @eval, EvalContext, @parametrize
BEST_PRACTICES.md | Eval design principles from Anthropic’s research
GRADERS.md | Choosing between code, model, and human graders
AGENT_EVALS.md | Patterns for coding, conversational, research agents
ROADMAP.md | Zero-to-one guide for building evals
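To confirm that an installed skill directory contains the files listed above, a short check like the following may help. It is a sketch under one assumption that this page does not confirm: that all six files sit directly in the skill directory rather than in a subfolder. The function name missing_reference_files is hypothetical.

```python
from pathlib import Path
from typing import List

# File names taken from the table on this page.
REFERENCE_FILES = (
    "SKILL.md",
    "EZVALS_REFERENCE.md",
    "BEST_PRACTICES.md",
    "GRADERS.md",
    "AGENT_EVALS.md",
    "ROADMAP.md",
)


def missing_reference_files(skill_dir: Path) -> List[str]:
    """Return the listed files that are not present in skill_dir.

    Assumption: files live at the top level of the skill directory.
    """
    return [name for name in REFERENCE_FILES if not (skill_dir / name).is_file()]


if __name__ == "__main__":
    # Example path for a project-level Claude install.
    missing = missing_reference_files(Path(".claude/skills/evals"))
    print("missing:", ", ".join(missing) if missing else "none")
```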

Example Prompts

Try these prompts with your AI coding agent:

Getting Started

  • “Help me write my first eval for my customer support agent”
  • “What’s the best way to evaluate my RAG pipeline?”
  • “Set up an eval suite for my coding assistant”

Writing Evals

  • “Create an eval that tests my agent’s ability to handle refund requests”
  • “Write a parametrized eval for testing sentiment analysis across edge cases”
  • “How do I test multi-turn conversations?”

Improving Evals

  • “My eval is flaky - help me make it more deterministic”
  • “Should I use code-based or model-based grading for this eval?”
  • “Review my evals and suggest improvements”

Analysis

  • “Analyze my eval results and suggest improvements”
  • “Help me understand why my agent is failing this eval”
  • “What patterns do you see in these failures?”

Managing the Skill

Check Installation

ezvals skills doctor
Shows installation status, version, and symlink health.

Remove Skill

ezvals skills remove

Reinstall

ezvals skills add --claude
Overwrites existing installation with fresh files.

How It Works

The skill is installed only to the selected target directories:
.claude/skills/evals/     # Selected with --claude
.cursor/skills/evals/     # Selected with --cursor
.codex/skills/evals/      # Selected with --codex
...
If --agents is selected, .agents/skills/evals/ is the canonical source and other selected targets symlink to it.
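The layout described above can be inspected with a few lines of Python. This is an illustrative sketch only and not how ezvals skills doctor is implemented. It assumes the evals directory itself is the symlink when a target links to .agents, and it covers just the three target paths shown above; describe_targets is a hypothetical name.

```python
from pathlib import Path
from typing import Dict

CANONICAL = Path(".agents/skills/evals")

# Only the target paths shown on this page are included.
TARGET_DIRS = {
    "claude": Path(".claude/skills/evals"),
    "cursor": Path(".cursor/skills/evals"),
    "codex": Path(".codex/skills/evals"),
}


def describe_targets(project_root: Path) -> Dict[str, str]:
    """Report, per target, whether the skill is a copy, a link, or absent."""
    canonical = (project_root / CANONICAL).resolve()
    report = {}
    for name, relative in TARGET_DIRS.items():
        path = project_root / relative
        if path.is_symlink():
            # Assumption: the evals directory itself is the symlink.
            linked = path.resolve() == canonical
            report[name] = "linked to .agents" if linked else "symlink (other target)"
        elif path.is_dir():
            report[name] = "installed (own copy)"
        else:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for target, status in describe_targets(Path.cwd()).items():
        print(f"{target}: {status}")
```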

Keeping Updated

When you upgrade EZVals, run ezvals skills add --claude again (or your chosen target flags) to get the latest skill content:
pip install --upgrade ezvals
ezvals skills add --claude