Configuration reference

`config/datasets.json`

Defines all datasets available for fetching. The file is validated against `config/datasets.schema.json`.

Fields

Field	Type	Required	Description
`id`	string	Yes	Unique identifier (`^[a-z0-9-]+$`). Used in CLI commands.
`name`	string	Yes	Human-readable name.
`description`	string	Yes	Brief description.
`type`	string	Yes	Archive format: `zip`, `tar`, `tar.gz`, `tgz`, or `folder` (Google Drive).
`url`	string	Yes	Download URL (HTTP/HTTPS, Google Drive, Dropbox).
`size_mb`	number	Yes	Approximate download size in MB.
`image_count`	number	Yes	Number of images.
`resolution`	string	Yes	Resolution description (e.g., “2K”, “4K”).
`format`	string	Yes	Image format (e.g., “PNG”, “JPEG”).
`storage_type`	string	No	Storage provider: `direct` (default), `google_drive`, or `dropbox`.
`folder_id`	string	No	Google Drive folder ID (for folder-type downloads).
`post_process`	string	No	Post-processing action: `extract_multipart_zips` (for LIU4K v2).
`extracted_folder`	string	No	Folder name after extraction.
`rename_to`	string	No	Rename extracted folder to this name.
`license`	string	No	License information.
`source`	string	No	Organization providing the dataset.

Example

{
  "datasets": [
    {
      "id": "div2k-valid",
      "name": "DIV2K Validation",
      "description": "DIV2K validation set with 100 high-quality 2K images",
      "type": "zip",
      "url": "http://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_valid_HR.zip",
      "size_mb": 449,
      "image_count": 100,
      "resolution": "2K",
      "format": "PNG",
      "extracted_folder": "DIV2K_valid_HR",
      "rename_to": "DIV2K_valid",
      "license": "Unknown - check DIV2K website",
      "source": "ETH Zurich Computer Vision Lab"
    }
  ]
}
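
As a rough illustration of the constraints in the Fields table, the following Python sketch checks a single dataset entry for the required fields, the `id` pattern, and the listed `type` values. The function name `check_dataset_entry` is hypothetical and is not part of the project; the actual validation is done by the JSON schema, which may enforce more than this.

```python
import re

# Required fields, id pattern, and archive types as listed in the Fields table.
REQUIRED_FIELDS = ("id", "name", "description", "type", "url",
                   "size_mb", "image_count", "resolution", "format")
ID_PATTERN = re.compile(r"^[a-z0-9-]+$")
ARCHIVE_TYPES = {"zip", "tar", "tar.gz", "tgz", "folder"}


def check_dataset_entry(entry: dict) -> list:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in entry]
    dataset_id = entry.get("id")
    # fullmatch rejects a trailing newline that a bare `$` would tolerate.
    if isinstance(dataset_id, str) and not ID_PATTERN.fullmatch(dataset_id):
        problems.append(f"id does not match ^[a-z0-9-]+$: {dataset_id!r}")
    if "type" in entry and entry["type"] not in ARCHIVE_TYPES:
        problems.append(f"unknown type: {entry['type']!r}")
    return problems


example = {
    "id": "div2k-valid",
    "name": "DIV2K Validation",
    "description": "DIV2K validation set with 100 high-quality 2K images",
    "type": "zip",
    "url": "http://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_valid_HR.zip",
    "size_mb": 449,
    "image_count": 100,
    "resolution": "2K",
    "format": "PNG",
}
print(check_dataset_entry(example))
```

An entry such as `{"id": "DIV2K Valid"}` would be reported both for the missing fields and for an `id` that contains uppercase letters and a space.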

Study configuration files

Study configs live in `config/studies/` and define encoding experiments. Each file is validated against `config/study.schema.json`.

Available studies

File	Study ID	Description
`format-comparison.json`	`format-comparison`	Compare JPEG, WebP, AVIF, JPEG XL
`avif-speed-sweep.json`	`avif-speed-sweep`	AVIF speed parameter sweep
`avif-chroma-subsampling.json`	`avif-chroma-subsampling`	AVIF chroma subsampling comparison
`jxl-effort-sweep.json`	`jxl-effort-sweep`	JPEG XL effort level comparison
`webp-method-sweep.json`	`webp-method-sweep`	WebP method parameter sweep
`resolution-impact.json`	`resolution-impact`	Impact of resolution on quality
`jpeg-crop-impact.json`	`jpeg-crop-impact`	JPEG crop-impact study
`webp-crop-impact.json`	`webp-crop-impact`	WebP crop-impact study
`avif-crop-impact.json`	`avif-crop-impact`	AVIF crop-impact study
`jxl-crop-impact.json`	`jxl-crop-impact`	JPEG XL crop-impact study

Fields

Field	Type	Required	Description
`id`	string	Yes	Unique study identifier.
`name`	string	No	Human-readable name. Defaults to `id`.
`description`	string	No	Purpose of the study.
`time_budget`	number	No	Default time budget in seconds. Overridden by CLI.
`dataset.id`	string	Yes	Dataset identifier from `datasets.json`.
`dataset.max_images`	integer	No	Limit images from dataset.
`analysis.x_axis`	string	No	Override the primary plot x-axis.
`analysis.group_by`	string	No	Override the line-grouping parameter for plots.
`comparison.tile_parameter`	string	No	Parameter that varies within each comparison figure.
`comparison.targets`	array	No	Comparison target groups, each with `metric` and `values`.
`comparison.exclude_images`	string[]	No	Basenames to exclude from automatic comparison source-image selection.
`analysis_fragment_size`	integer	No	Side length of the measured fragment for crop-impact studies. Defaults to `200`.
`crop_too_small_strategy`	string	No	Handling strategy when a crop level is too small to contain the fragment (e.g., `skip_image`).
`encoders`	array	Yes	List of encoder configurations.

Encoder fields

Field	Type	Required	Description
`format`	string	Yes	One of: `jpeg`, `webp`, `avif`, `jxl`.
`quality`	int/list/range	Yes	Quality settings to sweep (0–100).
`chroma_subsampling`	string[]	No	Subsampling modes: `444`, `422`, `420`, `400`.
`speed`	int/int[]	No	AVIF speed setting(s) (0–10).
`effort`	int/int[]	No	JXL effort setting(s) (1–10).
`method`	int/int[]	No	WebP method setting(s) (0–6).
`resolution`	int/int[]	No	Target resolution(s) in pixels (longest edge).
`crop`	int/int[]	No	Target crop longest-edge value(s) in pixels for crop-impact studies.
`extra_args`	object	No	Encoder-specific CLI arguments.

The `crop` and `resolution` parameters model different preprocessing modes and should not be combined in the same encoder entry.
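
The rule above could be enforced before a study runs with a check along these lines. This is a minimal sketch: `check_preprocessing` is a hypothetical name, and whether the project itself rejects such entries (in the schema or elsewhere) is not stated here.

```python
def check_preprocessing(encoder: dict) -> None:
    """Hypothetical guard: reject an encoder entry that sets both modes."""
    if "crop" in encoder and "resolution" in encoder:
        raise ValueError(
            f"encoder {encoder.get('format')!r} sets both 'crop' and "
            "'resolution'; use one preprocessing mode per entry"
        )


# A crop-only entry passes silently.
check_preprocessing({"format": "avif", "quality": 60, "crop": [800, 400]})
```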

Quality specification formats

Format	Example	Expanded
Single integer	`75`	`[75]`
Explicit list	`[60, 75, 90]`	`[60, 75, 90]`
Range object	`{"start": 30, "stop": 90, "step": 10}`	`[30, 40, 50, 60, 70, 80, 90]`
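
The expansion in the table can be sketched in Python as follows. The function name `expand_quality` is hypothetical. The sketch treats `stop` as inclusive, which matches the range row above (30 to 90 in steps of 10 includes 90); how the project handles a `stop` that does not fall on a step boundary is an assumption.

```python
def expand_quality(spec) -> list:
    """Hypothetical helper: expand a quality spec into a list of integers."""
    if isinstance(spec, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("quality must be an integer, list, or range object")
    if isinstance(spec, int):
        return [spec]
    if isinstance(spec, list):
        return list(spec)
    if isinstance(spec, dict):
        start, stop, step = spec["start"], spec["stop"], spec["step"]
        if step <= 0:
            raise ValueError("step must be positive")
        # Assumption: stop is inclusive, as in the table's range example.
        return list(range(start, stop + 1, step))
    raise TypeError("quality must be an integer, list, or range object")


print(expand_quality(75))
print(expand_quality([60, 75, 90]))
print(expand_quality({"start": 30, "stop": 90, "step": 10}))
```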

Example: format comparison

{
  "id": "format-comparison",
  "name": "Format Comparison",
  "time_budget": 1800,
  "dataset": { "id": "div2k-valid", "max_images": 10 },
  "encoders": [
    { "format": "jpeg", "quality": [60, 75, 85, 95] },
    { "format": "webp", "quality": [60, 75, 85, 95] },
    { "format": "avif", "quality": [60, 75, 85, 95], "speed": 4 },
    { "format": "jxl",  "quality": [60, 75, 85, 95] }
  ]
}

Example: parameter sweep with range

{
  "id": "avif-speed-sweep",
  "name": "AVIF Speed Sweep",
  "time_budget": 3600,
  "dataset": { "id": "div2k-valid", "max_images": 10 },
  "encoders": [
    {
      "format": "avif",
      "quality": { "start": 30, "stop": 90, "step": 10 },
      "speed": [2, 4, 6, 8]
    }
  ]
}
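
To estimate how much work a sweep like this implies, the sketch below enumerates parameter combinations for one encoder entry. It assumes that list-valued parameters are combined as a Cartesian product, which is not confirmed here; `encoder_combinations`, `as_list`, and `SWEEP_KEYS` are hypothetical names.

```python
from itertools import product

# Sweepable encoder fields from the Encoder fields table.
SWEEP_KEYS = ("quality", "chroma_subsampling", "speed", "effort",
              "method", "resolution", "crop")


def as_list(value) -> list:
    """Normalize a scalar, list, or range object to a list of values."""
    if isinstance(value, dict):
        # Assumption: the range object's stop is inclusive.
        return list(range(value["start"], value["stop"] + 1, value["step"]))
    return list(value) if isinstance(value, list) else [value]


def encoder_combinations(encoder: dict) -> list:
    """Assumed behavior: one setting per point in the Cartesian product."""
    keys = [key for key in SWEEP_KEYS if key in encoder]
    axes = [as_list(encoder[key]) for key in keys]
    return [dict(zip(keys, combo), format=encoder["format"])
            for combo in product(*axes)]


avif = {
    "format": "avif",
    "quality": {"start": 30, "stop": 90, "step": 10},
    "speed": [2, 4, 6, 8],
}
combos = encoder_combinations(avif)
print(len(combos))  # 7 quality values x 4 speeds = 28
```

Under that assumption, the study above would produce 28 settings per image, or 280 encodes for `max_images: 10`, unless the time budget ends the run earlier.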

Example: crop-impact study

{
  "id": "avif-crop-impact",
  "name": "AVIF Crop Impact Study",
  "dataset": { "id": "div2k-valid", "max_images": 100 },
  "analysis_fragment_size": 200,
  "crop_too_small_strategy": "skip_image",
  "analysis": {
    "x_axis": "crop",
    "group_by": "quality"
  },
  "comparison": {
    "tile_parameter": "crop",
    "targets": [
      { "metric": "ssimulacra2", "values": [65, 70, 75, 80, 85, 90] },
      { "metric": "bits_per_pixel", "values": [0.3, 0.4, 0.8, 1.6, 3.2] }
    ]
  },
  "encoders": [
    {
      "format": "avif",
      "quality": [30, 40, 50, 60, 70, 80, 90, 100],
      "speed": 4,
      "crop": [2048, 1600, 1200, 800, 400]
    }
  ]
}

Quality results schema

Pipeline output is saved as `data/metrics/<study-id>/quality.json` and validated against `config/quality-results.schema.json`.

Key fields per measurement:

Field	Description
`source_image`	Path to preprocessed source image
`original_image`	Path to original dataset image
`format`	Encoding format
`quality`	Quality setting used
`resolution`	Resolution sweep value
`crop`	Crop longest-edge sweep value
`analysis_fragment`	Measured fragment rectangle in source-image coordinates
`crop_region`	Crop window rectangle in original-image coordinates
`file_size`	Encoded file size in bytes
`width`, `height`	Image dimensions
`ssimulacra2`	SSIMULACRA2 score (or null on error)
`psnr`	PSNR in dB
`ssim`	SSIM score
`butteraugli`	Butteraugli distance
`encoding_time`	Time taken to encode in seconds
`measurement_error`	Error message (null if successful)
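
As an example of working with these fields, the sketch below derives bits per pixel for one measurement and averages SSIMULACRA2 per quality setting. It operates on a plain list of measurement dictionaries, because the top-level layout of `quality.json` is not described here. The formula `file_size * 8 / (width * height)` is the conventional definition and is assumed, not confirmed, to match the pipeline's `bits_per_pixel` metric. Both function names are hypothetical.

```python
from collections import defaultdict
from typing import Optional


def bits_per_pixel(measurement: dict) -> Optional[float]:
    """Assumed definition: encoded bits divided by pixel count."""
    if measurement.get("measurement_error") is not None:
        return None  # skip failed measurements
    pixels = measurement["width"] * measurement["height"]
    return measurement["file_size"] * 8 / pixels


def mean_ssimulacra2_by_quality(measurements: list) -> dict:
    """Average SSIMULACRA2 per quality setting, ignoring null scores."""
    scores = defaultdict(list)
    for m in measurements:
        if m.get("ssimulacra2") is not None:
            scores[m["quality"]].append(m["ssimulacra2"])
    return {quality: sum(vals) / len(vals) for quality, vals in scores.items()}


# Illustrative values only, not real pipeline output.
sample = [
    {"quality": 60, "file_size": 1000, "width": 100, "height": 80,
     "ssimulacra2": 70.0, "measurement_error": None},
    {"quality": 60, "file_size": 1200, "width": 100, "height": 80,
     "ssimulacra2": 74.0, "measurement_error": None},
    {"quality": 90, "file_size": 3000, "width": 100, "height": 80,
     "ssimulacra2": None, "measurement_error": "metric tool failed"},
]
print(bits_per_pixel(sample[0]))
print(mean_ssimulacra2_by_quality(sample))
```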

See also

Add a custom dataset: register new datasets in `datasets.json`
Create a custom study: define new study configurations
Extend formats and metrics: add new encoder parameters, formats, or metrics