
Welcome

CI codecov Python 3.13 Code style: ruff Type checked: mypy

Research project for determining optimal modern image formats for the web, with a focus on AVIF and comparative analysis against JPEG, WebP, and JPEG XL.

The project now supports both classic resolution-impact studies and crop-impact studies. Crop-impact studies keep the measured analysis fragment at full resolution while varying the surrounding image area, which avoids the metric and comparability problems introduced by downscaling.

  • Identify optimal quality settings for AVIF encoding
  • Determine when to use chroma subsampling based on quality targets
  • Compare AVIF effectiveness against JPEG, WebP, and JPEG XL
  • Analyze compression efficiency using perceptual quality metrics
  • Measure how input area affects encoder behaviour without resampling confounds
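The last goal is the core of a crop-impact study: the analysis fragment stays at native resolution while only the surrounding context grows or shrinks. A minimal sketch of the crop geometry (the names `crop_box`, `fragment`, and `margin` are illustrative; the project's actual logic lives in `src/preprocessing.py`):

```python
def crop_box(width: int, height: int, fragment: int, margin: int) -> tuple[int, int, int, int]:
    """Compute a centered crop box (left, top, right, bottom) that keeps a
    fragment-sized analysis region at full resolution and adds `margin`
    pixels of surrounding context per side -- no resampling involved."""
    side = fragment + 2 * margin
    if side > min(width, height):
        raise ValueError("crop exceeds image bounds")
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)
```

Because every variant contains the same unresampled fragment, quality metrics measured on that fragment remain directly comparable across variants.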

SSIMULACRA2 — AVIF and JPEG XL deliver comparable perceptual quality at significantly lower file sizes than WebP and JPEG:

Format comparison — SSIMULACRA2 vs bits per pixel

Butteraugli — under this distortion model JPEG XL consistently edges ahead of AVIF, highlighting how metric choice affects codec rankings:

Format comparison — Butteraugli vs bits per pixel

See the full interactive report for all studies and metrics.
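The x-axis in these plots is bits per pixel (bpp), i.e. encoded file size in bits divided by the source image's pixel count. A minimal helper (hypothetical signature, not the project's API):

```python
def bits_per_pixel(size_bytes: int, width: int, height: int) -> float:
    """Encoded size in bits divided by the source image's pixel count."""
    return size_bytes * 8 / (width * height)
```

For example, a 96 KiB encode of a 1024x768 image comes out at exactly 1.0 bpp.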

  1. Open this repository in VS Code

  2. Reopen in the dev container (VS Code will prompt)

  3. Verify everything works:

```shell
just verify-tools
just check  # Runs lint, typecheck, and tests
```

See the Getting Started tutorial for detailed setup instructions.

├── src/ # Source code modules
│ ├── study.py # Study configuration loading
│ ├── dataset.py # Dataset fetching and management
│ ├── preprocessing.py # Image preprocessing (resize, crop, convert)
│ ├── encoder.py # Format encoding (JPEG, WebP, AVIF, JXL)
│ ├── quality.py # Quality measurement (SSIMULACRA2, PSNR, SSIM, Butteraugli)
│ ├── pipeline.py # Unified pipeline with time-budget control
│ ├── analysis.py # Statistical analysis and static plotting
│ ├── interactive.py # Interactive Plotly visualizations
│ ├── comparison.py # Visual comparison image generation
│ └── report_images.py # Responsive image optimization for reports
├── scripts/ # CLI entry points for each workflow step
├── config/ # Configuration files (datasets, studies)
│ └── studies/ # Per-study JSON configurations
├── data/ # All research data (git-ignored)
│ ├── datasets/ # Raw image datasets
│ ├── preprocessed/ # Preprocessed images (resized or cropped)
│ ├── encoded/ # Encoded images (JPEG, WebP, AVIF, JXL)
│ ├── metrics/ # Quality measurements (JSON)
│ ├── analysis/ # Analysis outputs (CSV, SVG plots)
│ └── report/ # Generated HTML reports
├── docs/ # Documentation (Diátaxis framework)
├── .devcontainer/ # Dev container configuration
├── .github/workflows/ # CI pipeline
├── pyproject.toml # Project configuration and dependencies
└── justfile # Development task runner
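Per-study configurations are plain JSON files under `config/studies/`, keyed by study id. A sketch of how such a file might be loaded (the schema shown is illustrative; the real loader is `src/study.py`):

```python
import json
from pathlib import Path


def load_study(study_id: str, root: Path = Path("config/studies")) -> dict:
    """Load a per-study JSON configuration by id, e.g. 'avif-crop-impact'."""
    path = root / f"{study_id}.json"
    with path.open() as f:
        return json.load(f)
```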
```shell
# Development
just install-dev               # Install all dependencies (dev + production)
just check                     # Run all quality checks (lint + typecheck + test)
just test                      # Run tests
just lint                      # Check code style
just lint-fix                  # Fix auto-fixable lint issues
just format                    # Format code and markdown
just typecheck                 # Run type checking
just verify-tools              # Verify encoding and measurement tools

# Study Workflow
just fetch <dataset-id>        # Fetch a dataset (e.g., div2k-valid, liu4k-v1-valid)
just pipeline <study-id> <time> # Run unified encode+measure pipeline (e.g., 30m, 1h)
just analyze <study-id>        # Analyze results and generate plots
just compare <study-id>        # Generate visual comparison images
just report                    # Generate interactive HTML report for all studies
just serve-report [port]       # Serve report locally (default: http://localhost:8000)

# Example crop-impact workflow
just pipeline avif-crop-impact 30m
just analyze avif-crop-impact
just compare avif-crop-impact

# Release
just release-notes             # Generate release notes from study results
just release-assets            # Prepare release assets (zip + CSV files)

# Cleanup
just clean                     # Clean Python cache and build artifacts
just clean-study <study-id>    # Remove all data for a specific study
just clean-studies             # Remove all study data (preserves datasets)

# Documentation
just docs-generate             # Generate docs from source files and Python docstrings
just docs-install              # Install documentation site dependencies
just docs-dev                  # Start documentation dev server (http://localhost:4321)
just docs-build                # Build optimized documentation site
just docs-preview              # Preview built documentation
```
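The `<time>` argument to `just pipeline` is a wall-clock budget: the pipeline keeps encoding and measuring until the budget is spent, then stops cleanly. The control loop can be sketched as follows (`run_with_budget` and its signature are assumptions for illustration; the real implementation is `src/pipeline.py`):

```python
import time
from collections.abc import Callable, Iterable


def run_with_budget(tasks: Iterable[Callable[[], None]], budget_seconds: float) -> int:
    """Run encode/measure tasks until the wall-clock budget is spent.

    Checks the deadline before starting each task, so a task that is
    already running is allowed to finish. Returns the number of
    completed tasks.
    """
    deadline = time.monotonic() + budget_seconds
    done = 0
    for task in tasks:
        if time.monotonic() >= deadline:
            break
        task()
        done += 1
    return done
```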

This project's documentation follows the Diátaxis framework (tutorials, how-to guides, reference, and explanation).

The documentation site is built with Astro Starlight:

```shell
# Generate and preview locally
just docs-dev

# Build for production
just docs-build
```

The documentation includes automatically generated API reference from Python docstrings.

See LICENSE.