Create a custom study
A study is a JSON file in config/studies/ that defines which formats,
quality levels, and encoder parameters to test. Creating a new study
requires no code changes.
Create a study file
Create a new file in config/studies/. The filename (without .json)
becomes the study ID used in all CLI commands:

```
# config/studies/my-study.json → study ID is "my-study"
```

Minimal example
Compare AVIF and WebP at a few quality levels:

```json
{
  "$schema": "../study.schema.json",
  "id": "my-study",
  "name": "My Custom Study",
  "description": "Compare AVIF and WebP at representative quality levels.",
  "dataset": { "id": "div2k-valid", "max_images": 10 },
  "encoders": [
    { "format": "avif", "quality": [50, 65, 80], "speed": 4 },
    { "format": "webp", "quality": [50, 65, 80], "method": 4 }
  ]
}
```

Parameter sweep example
Sweep quality as a range and test multiple speed settings:

```json
{
  "$schema": "../study.schema.json",
  "id": "avif-deep-sweep",
  "name": "AVIF Deep Quality Sweep",
  "description": "Fine-grained quality sweep with speed variants.",
  "dataset": { "id": "div2k-valid", "max_images": 20 },
  "encoders": [
    {
      "format": "avif",
      "quality": { "start": 30, "stop": 90, "step": 5 },
      "speed": [2, 4, 6]
    }
  ]
}
```

The range {"start": 30, "stop": 90, "step": 5} expands to
[30, 35, 40, 45, ..., 85, 90].
Multi-resolution example
Test how resolution affects encoding efficiency:

```json
{
  "$schema": "../study.schema.json",
  "id": "resolution-test",
  "name": "Resolution Impact Test",
  "description": "Compare encoding at different target resolutions.",
  "dataset": { "id": "div2k-valid", "max_images": 10 },
  "encoders": [
    {
      "format": "avif",
      "quality": [50, 65, 80],
      "speed": 4,
      "resolution": [1920, 1280, 640]
    }
  ]
}
```

Crop-impact example
Study how encoded image area affects one format without changing the measured content or fragment resolution:

```json
{
  "$schema": "../study.schema.json",
  "id": "webp-crop-impact-custom",
  "name": "WebP Crop Impact Custom",
  "description": "Measure WebP on different crop sizes around a fixed fragment.",
  "dataset": { "id": "div2k-valid", "max_images": 20 },
  "analysis_fragment_size": 200,
  "crop_too_small_strategy": "skip_image",
  "analysis": { "x_axis": "crop", "group_by": "quality" },
  "comparison": {
    "tile_parameter": "crop",
    "targets": [
      { "metric": "ssimulacra2", "values": [70, 80, 90] },
      { "metric": "bits_per_pixel", "values": [0.4, 0.8, 1.6] }
    ]
  },
  "encoders": [
    {
      "format": "webp",
      "quality": [40, 60, 80, 100],
      "method": 6,
      "crop": [2048, 1600, 1200, 800, 400]
    }
  ]
}
```

Unlike a resolution study, this keeps the analysis fragment at full resolution
and only changes how much surrounding context is encoded.
Study configuration fields
Top-level fields

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Must match the filename (without .json). Pattern: ^[a-z0-9][a-z0-9_-]*$ |
| name | string | Yes | Human-readable name. |
| description | string | No | Purpose of the study. |
| time_budget | number | No | Default time budget in seconds. Overridden by CLI --time-budget. |
| dataset.id | string | Yes | Dataset identifier from config/datasets.json. |
| dataset.max_images | integer | No | Limit images used from the dataset. |
| analysis_fragment_size | integer | No | Square fragment size measured in crop-impact studies. Defaults to 200. |
| crop_too_small_strategy | string | No | What to do if a crop level cannot contain the fragment: skip_image, skip_measurement, or adjust_aspect_ratio. |
| analysis | object | No | Plot configuration overrides such as x_axis and group_by. |
| comparison | object | No | Comparison-figure configuration such as tile_parameter, targets, and exclude_images. |
| encoders | array | Yes | List of encoder configurations (at least one). |
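The id rule in the table above can be checked before running anything. The following is a minimal sketch, not part of the project: check_study_id is a hypothetical helper, and it assumes the study file has already been parsed into a dict (for example with json.load).

```python
import re
from pathlib import Path

# Pattern copied from the `id` row of the top-level fields table.
STUDY_ID_PATTERN = re.compile(r"^[a-z0-9][a-z0-9_-]*$")


def check_study_id(path: str, study: dict) -> list[str]:
    """Hypothetical helper: return problems with the study's `id` (empty if none)."""
    problems = []
    study_id = study.get("id")
    expected = Path(path).stem  # filename without the .json extension

    if not isinstance(study_id, str) or not STUDY_ID_PATTERN.fullmatch(study_id):
        problems.append(f"id {study_id!r} does not match {STUDY_ID_PATTERN.pattern}")
    if study_id != expected:
        problems.append(f"id {study_id!r} does not match filename stem {expected!r}")
    return problems


if __name__ == "__main__":
    print(check_study_id("config/studies/my-study.json", {"id": "my-study"}))
```

An empty list means the id is consistent with both the filename and the pattern.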
Encoder fields
| Field | Type | Required | Description |
|---|---|---|---|
| format | string | Yes | "jpeg", "webp", "avif", or "jxl". |
| quality | int/list/range | Yes | Quality settings (0–100). See Quality specification formats below. |
| speed | int/int[] | No | AVIF speed (0–10). 0 = slowest/best, 10 = fastest. |
| effort | int/int[] | No | JXL effort (1–10). Higher = slower/better. |
| method | int/int[] | No | WebP method (0–6). Higher = slower/better. |
| chroma_subsampling | string[] | No | Modes: "444", "422", "420", "400". |
| resolution | int/int[] | No | Target resolution(s) in pixels (longest edge). |
| crop | int/int[] | No | Target crop longest-edge values in pixels for crop-impact studies. |
| extra_args | object | No | Additional CLI arguments as key-value pairs. |
Do not combine resolution and crop on the same encoder entry. They model two
different study types.
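A quick pre-flight check for this rule might look like the sketch below. find_conflicting_encoders is a hypothetical helper, not a function from the project, and whether the pipeline or schema already rejects such entries is not stated here.

```python
def find_conflicting_encoders(study: dict) -> list[int]:
    """Hypothetical helper: indexes of encoder entries that set both
    `resolution` and `crop`."""
    return [
        index
        for index, encoder in enumerate(study.get("encoders", []))
        if "resolution" in encoder and "crop" in encoder
    ]


if __name__ == "__main__":
    study = {
        "encoders": [
            {"format": "avif", "quality": [50], "resolution": [1920, 1280]},
            {"format": "webp", "quality": [50], "resolution": 640, "crop": 800},
        ]
    }
    print(find_conflicting_encoders(study))
```

Splitting a conflicting entry into two encoder entries, or into two separate studies, keeps each entry tied to a single study type.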
analysis section
Use this to override the automatically chosen plot axes:

```json
{
  "analysis": { "x_axis": "crop", "group_by": "quality" }
}
```

comparison section
Use this to control comparison figure generation directly from the study config:

```json
{
  "comparison": {
    "tile_parameter": "crop",
    "targets": [
      { "metric": "ssimulacra2", "values": [70, 80, 90] },
      { "metric": "bits_per_pixel", "values": [0.4, 0.8, 1.6] }
    ],
    "exclude_images": ["0828.png"]
  }
}
```

Quality specification formats
| Format | Example | Expands to |
|---|---|---|
| Single integer | 75 | [75] |
| Explicit list | [60, 75, 90] | [60, 75, 90] |
| Range object | {"start": 30, "stop": 90, "step": 10} | [30, 40, 50, 60, 70, 80, 90] |
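The three forms in the table can be normalized to a plain list with a few lines of Python. This is a sketch of the expansion rule as the table describes it, not the project's own code: expand_quality is a hypothetical name, and treating stop as inclusive is inferred from the examples, where 90 appears in the result.

```python
def expand_quality(spec) -> list[int]:
    """Hypothetical helper: expand a quality spec into a list of integers."""
    if isinstance(spec, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("quality must be an int, a list, or a range object")
    if isinstance(spec, int):
        return [spec]
    if isinstance(spec, list):
        return list(spec)
    if isinstance(spec, dict):
        start, stop, step = spec["start"], spec["stop"], spec["step"]
        if step <= 0:
            raise ValueError("step must be positive")
        # Assumption: `stop` is inclusive, matching the table's example.
        return list(range(start, stop + 1, step))
    raise TypeError("quality must be an int, a list, or a range object")


if __name__ == "__main__":
    print(expand_quality(75))
    print(expand_quality([60, 75, 90]))
    print(expand_quality({"start": 30, "stop": 90, "step": 10}))
```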
Using extra_args
The extra_args field is intended for additional CLI arguments to the
encoder tool. Keys are argument names (without leading dashes) and values
are argument values:

```json
{
  "format": "avif",
  "quality": [60, 75],
  "speed": 4,
  "extra_args": { "sharpness": 2, "ignore-exif": true }
}
```

Note that extra_args values are recorded in the quality results for
traceability but are not currently passed to the encoder CLI automatically.
To use them, update the encoder method in src/encoder.py.
See Extend formats and metrics for guidance
on exposing additional encoder parameters.
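If you do wire extra_args through yourself, one possible mapping from the key-value pairs to command-line tokens is sketched below. Everything here is an assumption for illustration: extra_args_to_argv is a hypothetical helper that does not exist in src/encoder.py, the "--key value" convention may not match what a given encoder expects, and treating true as a bare flag is a design choice rather than documented behaviour.

```python
def extra_args_to_argv(extra_args: dict) -> list[str]:
    """Hypothetical helper: turn an extra_args mapping into CLI tokens.

    Assumed convention: `true` becomes a bare `--flag`, `false`/`null`
    are dropped, and anything else becomes `--key value`.
    """
    argv: list[str] = []
    for key, value in extra_args.items():
        flag = f"--{key}"
        if value is True:
            argv.append(flag)
        elif value is False or value is None:
            continue
        else:
            argv.extend([flag, str(value)])
    return argv


if __name__ == "__main__":
    print(extra_args_to_argv({"sharpness": 2, "ignore-exif": True}))
```

Check each encoder's own help output before relying on a mapping like this, since option names and syntax differ between tools.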
Run the study
1. Ensure the dataset is available:

   ```
   just fetch div2k-valid
   ```

2. Run the pipeline with a time budget:

   ```
   just pipeline my-study 30m
   ```

3. Analyze results:

   ```
   just analyze my-study
   ```

4. Generate visual comparisons:

   ```
   just compare my-study
   ```

5. Generate a report (includes all studies with results):

   ```
   just report
   ```
Schema validation
Adding "$schema": "../study.schema.json" to your file enables in-editor
validation and autocompletion in VS Code. The schema enforces valid
format names, quality ranges, and parameter bounds.
- Start small: Use max_images: 5 and a short time budget (5m) to verify your config before running a full study.
- Dry run: Preview what would run without executing:

  ```
  python3 scripts/run_pipeline.py my-study --time-budget 5m --dry-run
  ```

- Parameter count: The pipeline runs every combination of parameters. For example, 10 quality levels × 4 speed settings × 2 chroma modes yields 80 variants per image, so plan time budgets accordingly.
- Clean up: Remove a study's data with just clean-study my-study.
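To estimate the size of a study before running it, the parameter-count arithmetic can be reproduced with itertools.product. This is only an illustrative sketch: encoder_variants is a hypothetical helper, it expects every parameter to already be a list, and the way the pipeline actually enumerates combinations may differ.

```python
from itertools import product


def encoder_variants(parameters: dict[str, list]) -> list[dict]:
    """Hypothetical helper: every combination of the given parameter lists."""
    names = list(parameters)
    return [
        dict(zip(names, combo))
        for combo in product(*(parameters[name] for name in names))
    ]


if __name__ == "__main__":
    variants = encoder_variants(
        {
            "quality": list(range(10, 101, 10)),   # 10 quality levels
            "speed": [2, 4, 6, 8],                 # 4 speed settings
            "chroma_subsampling": ["444", "420"],  # 2 chroma modes
        }
    )
    print(len(variants))  # 10 * 4 * 2 = 80 variants per image
```

Multiplying the result by max_images gives a rough count of encodes for one encoder entry.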
See also
- Run the pipeline: time budgets, advanced options, output structure
- Analyze results: understand CSV and plot outputs
- Configuration reference: full schema details
- Extend formats and metrics: expose additional encoder parameters