Run the pipeline
Run a study
The pipeline encodes images and measures quality metrics in a single pass. Both a study ID and a time budget are required:

    just pipeline format-comparison 30m

This processes images one at a time, encoding all configured format variants and measuring quality metrics for each, until the time budget runs out.
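The budgeted loop described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: `run_with_budget`, `process`, and the injectable `clock` are hypothetical names, and checking the budget only between images (so an image that has been started is always finished) is an assumption.

```python
import time
from typing import Callable, Iterable, List


def run_with_budget(
    images: Iterable[str],
    process: Callable[[str], dict],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[dict]:
    """Process images one at a time until the time budget is spent.

    Hypothetical helper: `process` stands in for "encode every configured
    variant and measure its quality" for a single image.
    """
    deadline = clock() + budget_seconds
    results = []
    for image in images:
        # Assumption: the budget is checked before each image starts, so
        # work on an image is never abandoned halfway through.
        if clock() >= deadline:
            break
        results.append(process(image))
    return results
```

A monotonic clock is used here so that wall-clock adjustments during a long run would not shorten or extend the budget.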
For crop-impact studies, the pipeline first selects a fixed worst-case analysis fragment in the original image, then generates crop windows of different sizes around that fragment and measures quality only inside the fragment. This keeps the compared content identical across crop levels.
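One way to picture the crop-window generation is the sketch below. It assumes square windows centred on the fragment and shifted back inside the image when they would overflow an edge; the `Box` type, the function names, and the square-window choice are all hypothetical and may differ from what the pipeline does.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical type)."""
    left: int
    top: int
    width: int
    height: int


def crop_window(image_w: int, image_h: int, fragment: Box, size: int) -> Box:
    """Return a size x size window containing `fragment`, kept inside the image."""
    if size > min(image_w, image_h):
        raise ValueError("crop size exceeds the image dimensions")
    if size < max(fragment.width, fragment.height):
        raise ValueError("crop size is smaller than the fragment")
    # Centre the window on the fragment, then clamp it to the image bounds.
    cx = fragment.left + fragment.width // 2
    cy = fragment.top + fragment.height // 2
    left = min(max(cx - size // 2, 0), image_w - size)
    top = min(max(cy - size // 2, 0), image_h - size)
    return Box(left, top, size, size)


def crop_windows(image_w: int, image_h: int, fragment: Box,
                 sizes: Iterable[int]) -> List[Box]:
    """Build one window per requested size around the same fixed fragment."""
    return [crop_window(image_w, image_h, fragment, s) for s in sizes]
```

Because the fragment itself never moves, every window in the returned list covers the same pixels of interest, which is what keeps the compared content identical across crop levels.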
Results are saved to data/metrics/<study-id>/quality.json.
Time budget format
| Example | Meaning |
|---|---|
| 30m | 30 minutes |
| 2h | 2 hours |
| 1h30m | 1 hour 30 minutes |
| 3600 | 3600 seconds |
| 90s | 90 seconds |
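The formats in the table can be parsed with a short routine like the one below. It is a sketch under assumptions: `parse_time_budget` is a hypothetical name, units are taken to appear in the order hours, minutes, seconds, and a bare number is read as seconds, as the table indicates. How the real script treats other inputs is not documented here.

```python
import re

# Optional hours, minutes, and seconds components, in that order.
_BUDGET_RE = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def parse_time_budget(text: str) -> int:
    """Convert a budget such as '30m', '1h30m', '90s', or '3600' to seconds."""
    text = text.strip().lower()
    if text.isdigit():
        # A bare number is interpreted as seconds.
        return int(text)
    match = _BUDGET_RE.fullmatch(text)
    # The pattern also matches the empty string, so reject that explicitly.
    if not text or match is None:
        raise ValueError(f"invalid time budget: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds
```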
Available studies
These study configurations ship in config/studies/:

| Study ID | Description |
|---|---|
| format-comparison | Compare JPEG, WebP, AVIF, JPEG XL |
| avif-speed-sweep | AVIF speed parameter sweep |
| avif-chroma-subsampling | AVIF chroma subsampling comparison |
| jxl-effort-sweep | JPEG XL effort level comparison |
| webp-method-sweep | WebP method parameter sweep |
| resolution-impact | Impact of image resolution on quality |
| jpeg-crop-impact | JPEG crop-impact study |
| webp-crop-impact | WebP crop-impact study |
| avif-crop-impact | AVIF crop-impact study |
| jxl-crop-impact | JPEG XL crop-impact study |
Prerequisites
Ensure the study’s dataset is downloaded first:

    just fetch div2k-valid

Advanced options via the script
For additional control, use the script directly:

    # Dry run: preview what would run without executing
    python3 scripts/run_pipeline.py format-comparison --time-budget 30m --dry-run

    # Save encoded artifacts to disk (normally discarded after measurement)
    python3 scripts/run_pipeline.py format-comparison --time-budget 1h --save-artifacts

    # Control parallelism
    python3 scripts/run_pipeline.py format-comparison --time-budget 30m --workers 8

Resolution vs crop studies
- resolution sweeps downscale the entire image before encoding. They are useful when you want to study true lower-resolution inputs, but the measured content and the behaviour of quality metrics both change with resolution.
- crop sweeps keep the selected analysis fragment at full resolution and only reduce the surrounding image area. They are useful when you want to isolate the impact of image area and bit allocation without introducing resampling effects.
- Crop-impact studies store two extra metadata fields per measurement: analysis_fragment for the measured fragment coordinates inside the encoded source, and crop_region for the crop window in original-image coordinates.
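The relationship between the two metadata fields can be illustrated with a small coordinate translation. The `(left, top, width, height)` tuple layout and the `fragment_in_crop` helper are assumptions made for this sketch; the actual field format in quality.json may differ.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (left, top, width, height), assumed layout


def fragment_in_crop(fragment: Rect, crop_region: Rect) -> Rect:
    """Translate a fragment from original-image coordinates to crop coordinates."""
    fx, fy, fw, fh = fragment
    cx, cy, cw, ch = crop_region
    inside = (cx <= fx and cy <= fy
              and fx + fw <= cx + cw and fy + fh <= cy + ch)
    if not inside:
        raise ValueError("fragment is not fully inside the crop region")
    # Only the origin shifts; the fragment keeps its size because crop
    # sweeps do not resample the image.
    return (fx - cx, fy - cy, fw, fh)


# Illustrative values only, not taken from a real measurement.
crop_region = (1744, 4, 256, 256)   # crop window in the original image
fragment = (1900, 100, 64, 64)      # fixed fragment in the original image
metadata = {
    "crop_region": crop_region,
    "analysis_fragment": fragment_in_crop(fragment, crop_region),
}
```

With these example values, `metadata["analysis_fragment"]` is `(156, 96, 64, 64)`: the same 64 x 64 pixels, addressed relative to the top-left corner of the crop.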
Output structure
    data/metrics/<study-id>/
    └── quality.json    # All quality measurements

When the study uses preprocessing, the pipeline also records source-image labels such as data/preprocessed/<study-id>/r1280/... for resized inputs or data/preprocessed/<study-id>/c800/... for cropped inputs.

With --save-artifacts, encoded images are also saved:

    data/encoded/<study-id>/
    ├── jpeg/    # Encoded JPEG files
    ├── webp/    # Encoded WebP files
    ├── avif/    # Encoded AVIF files
    └── jxl/     # Encoded JXL files

Clean study data
Remove all generated data for a specific study:

    just clean-study format-comparison

Remove all study data (preserves datasets):

    just clean-studies

Typical workflow

    just fetch div2k-valid                 # 1. Get images
    just pipeline format-comparison 30m    # 2. Encode + measure
    just analyze format-comparison         # 3. Generate plots
    just compare format-comparison         # 4. Visual comparisons
    just report                            # 5. Interactive report
    just serve-report                      # 6. View in browser

For a crop-impact study, swap in one of the *-crop-impact study IDs.
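If you prefer to drive the workflow from a script, the same sequence of recipes can be built programmatically. This is only a convenience sketch: `workflow_commands` and `run_workflow` are hypothetical helpers, not part of the project, and the recipe names and arguments are copied from the steps above.

```python
import subprocess
from typing import List


def workflow_commands(study_id: str, dataset: str, budget: str) -> List[List[str]]:
    """Return the `just` invocations for one study, in execution order."""
    return [
        ["just", "fetch", dataset],
        ["just", "pipeline", study_id, budget],
        ["just", "analyze", study_id],
        ["just", "compare", study_id],
        ["just", "report"],
        ["just", "serve-report"],
    ]


def run_workflow(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the workflow at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default: the commands are printed, nothing is executed.
    run_workflow(workflow_commands("format-comparison", "div2k-valid", "30m"))
```

Passing a different study ID, such as one of the crop-impact studies, changes only the arguments of the pipeline, analyze, and compare steps.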
See also
- Fetch datasets: download source images
- Analyze results: generate statistics and plots
- Generate comparisons: visual comparison images
- Configuration reference: study config schema
- Architecture: pipeline design rationale