# Commands

EZVals has three main commands:

- `ezvals serve` - Start the web UI to browse and run evaluations interactively
- `ezvals run` - Run evaluations headlessly (for CI/CD pipelines)
- `ezvals export` - Export a run to various formats (JSON, CSV, Markdown)
## Programmatic Invocation

You can call the SDK equivalent of `ezvals run` from Python:
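A minimal sketch of what that might look like; the `run_evals` entry point and its keyword arguments are assumptions for illustration, not confirmed SDK names:

```python
# Hypothetical sketch -- `run_evals` and its keywords are assumed, not
# confirmed SDK names; check the EZVals API reference for the real ones.
from ezvals import run_evals

results = run_evals(
    "evals/",        # same PATH forms the CLI accepts (dir, file, file::function)
    concurrency=4,   # mirrors the CLI concurrency option
    timeout=60.0,    # mirrors the CLI timeout option
)

for result in results:
    print(result)
```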
## ezvals serve

Start the web UI to discover and run evaluations interactively. PATH can be:

- A directory: `ezvals serve evals/`
- A file: `ezvals serve evals/customer_service.py`
- A specific function: `ezvals serve evals.py::test_refund`
- A run JSON file: `ezvals serve .ezvals/sessions/default/run_123.json`
Use `--no-open` to disable the browser launch. Evaluations are discovered and displayed but not run until you click the Run button.
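For instance, to serve a directory without launching a browser tab:

```bash
# Discovers and displays evals; nothing runs until you click Run
ezvals serve evals/ --no-open
```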
### Loading Previous Runs

You can load a previous run by passing the JSON file path directly:
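```bash
# Reopen a saved run in the web UI
ezvals serve .ezvals/sessions/default/run_123.json
```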
### Options

- Filter evaluations by dataset.
- Filter evaluations by label. Can be specified multiple times.
- Directory for JSON results storage.
- Port for web UI server.
- Name for this evaluation session. Groups related runs together.
- Open an existing run in the current session by run name. If not found, this becomes the pending run name for the next run.
- Comma-separated run names (2-4) to pre-load comparison mode at startup. These are resolved to `compare_run_id` query params in the opened URL.
- Initial search text applied at startup.
- Initial error filter for startup state.
- Initial trace URL filter for startup state.
- Initial trace messages filter for startup state.
- Initial annotation filter (`any`, `yes`, or `no`).
- Automatically run all evaluations on startup. Same as clicking the Run button immediately.
- Control whether `ezvals serve` automatically opens your browser (see `--no-open` above).
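For instance, a pre-filtered serve session might look like the following; the `--dataset`, `--label`, and `--port` spellings are assumptions based on the descriptions above, not confirmed flag names:

```bash
# Assumed flag spellings -- check `ezvals serve --help` for the real names
ezvals serve evals/ --dataset billing --label smoke --port 8080
```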
## ezvals run

Run evaluations headlessly. Outputs minimal text by default (optimized for LLM agents). Use `--visual` for rich table output.
PATH can be:

- A directory: `ezvals run evals/`
- A file: `ezvals run evals/customer_service.py`
- A specific function: `ezvals run evals.py::test_refund`
- A case variant: `ezvals run evals.py::test_math[2-3-5]`
### Filtering Options

- Filter evaluations by dataset. Can be specified multiple times.
- Filter evaluations by label. Can be specified multiple times.
- Limit the number of evaluations to run.
### Execution Options

- Number of concurrent evaluations. A value of 1 means sequential execution.
- Global timeout in seconds for all evaluations.
### Output Options

- Show stdout from eval functions (print statements, logs).
- `--visual` - Show rich progress dots, results table, and summary. Without this flag, output is minimal.
- `-o` - Override the default results path. When specified, results are saved only to this path (not to `.ezvals/runs/`).
- Skip saving results to file. Outputs JSON to stdout instead.
### Session Options

- Name for this evaluation session. Groups related runs together.
- Name for this specific run. Used as file prefix.
## ezvals export

Export a run file to various formats. Useful for sharing results, generating reports, or integrating with other tools. RUN_PATH is the path to a run JSON file (e.g., `.ezvals/sessions/default/run_123.json`).
### Options

- Export format: `json`, `csv`, or `md`.
- Output file path. Defaults to `{run_name}.{format}`.

### Export Formats
| Format | Description |
|---|---|
| `json` | Copy the raw JSON file |
| `csv` | Flat CSV with all results |
| `md` | Markdown with ASCII bar charts and results table |
## Examples
### Start the Web UI
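Point the UI at a directory of evals:

```bash
# Discovers evaluations and opens the browser UI; nothing runs until you click Run
ezvals serve evals/
```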
### Run All Evaluations
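Run every discovered evaluation headlessly:

```bash
ezvals run evals/
```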
### Run Specific File
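Restrict the run to a single file:

```bash
ezvals run evals/customer_service.py
```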
### Run Specific Function
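Target one function with the `file::function` syntax:

```bash
ezvals run evals.py::test_refund
```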
### Run Parametrized Variant
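Run a single case variant of a parametrized eval:

```bash
ezvals run evals.py::test_math[2-3-5]
```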
### Filter by Dataset and Label
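A sketch of combined filtering; the `--dataset` and `--label` spellings are assumed from the option descriptions above, not confirmed flag names:

```bash
# Assumed flag spellings -- check `ezvals run --help` for the real names
ezvals run evals/ --dataset billing --label smoke --label regression
```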
### Run with Concurrency and Timeout
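A sketch using flags assumed to mirror the `concurrency` and `timeout` config keys:

```bash
# Flag spellings assumed from the config keys of the same name
ezvals run evals/ --concurrency 8 --timeout 120
```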
### Export Results
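A sketch of exporting a run to Markdown; the format and output flag spellings are assumed:

```bash
# Flag spellings assumed; the run path form comes from the export section above
ezvals export .ezvals/sessions/default/run_123.json --format md --output report.md
```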
### Verbose Debug Run
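A sketch of a debug run; `--visual` is documented above, while `--verbose` is assumed from the `verbose` config key:

```bash
ezvals run evals.py::test_refund --verbose --visual
```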
### Production CI Pipeline
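One way to wire this into CI using the documented environment variables; remember that the exit code only reflects execution errors, so gate on the saved JSON results for pass/fail:

```bash
# Headless run with defaults supplied via documented environment variables
EZVALS_CONCURRENCY=8 EZVALS_TIMEOUT=120 ezvals run evals/
```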
### Session Tracking
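A sketch of grouping runs under a named session; the session and run-name flag spellings are assumed from the Session Options descriptions:

```bash
# Flag spellings assumed from the Session Options descriptions above
ezvals run evals/ --session nightly --run-name 2024-06-01
```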
## Configuration File

EZVals supports an `ezvals.json` config file for persisting default CLI options. The file is auto-generated in your project root on first run.

### Default Config
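A sketch of what the generated file might look like; the keys are the documented options below, but the values here are illustrative placeholders rather than confirmed defaults:

```json
{
  "concurrency": 1,
  "timeout": 60.0,
  "verbose": false,
  "results_dir": ".ezvals/runs",
  "port": 8000
}
```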
### Supported Options

| Option | Type | Description | Used by |
|---|---|---|---|
| `concurrency` | integer | Number of concurrent evaluations | `run` |
| `timeout` | float | Global timeout in seconds | `run` |
| `verbose` | boolean | Show stdout from eval functions | `run` |
| `results_dir` | string | Directory for results storage | `serve` |
| `port` | integer | Web UI server port | `serve` |
### Precedence

CLI flags always override config values:
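For example, assuming a `--concurrency` flag that mirrors the config key:

```bash
# ezvals.json may set "concurrency": 2, but the CLI flag (assumed spelling) wins
ezvals run evals/ --concurrency 8
```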
### Editing via UI

Click the settings icon in the web UI header to view and edit config values. Changes are saved to `ezvals.json`.
## Exit Codes
| Code | Meaning |
|---|---|
| 0 | Evaluations completed (regardless of pass/fail) |
| Non-zero | Error during execution (bad path, exceptions, etc.) |
The CLI does not currently set non-zero exit codes for failed evaluations—only for execution errors. Check the JSON output or summary for pass/fail status.
## Environment Variables

| Variable | Description |
|---|---|
| `EZVALS_CONCURRENCY` | Default concurrency level |
| `EZVALS_TIMEOUT` | Default timeout in seconds |
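For example, to set defaults for a shell session or CI job:

```bash
# Values are illustrative
export EZVALS_CONCURRENCY=4
export EZVALS_TIMEOUT=90
ezvals run evals/
```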
## Output Format

### Minimal Output (Default)

By default, `ezvals run` outputs minimal text optimized for LLM agents and CI pipelines.
### Visual Output (`--visual`)

Use `--visual` for rich progress dots, results table, and summary.
### JSON File Output

Results are always saved as JSON to `.ezvals/runs/` (or a custom path via `-o`).

