comparison

Visual comparison module for identifying and visualizing encoding artifacts.

This module generates side-by-side comparison figures that show how different encoding variants render the same image region at matched quality or file-size levels. The encoder settings that reach each matched level are found by interpolation.

The workflow is:

  1. Load pipeline quality measurements from quality.json and comparison configuration from the study configuration file.
  2. For each target group (e.g., ssimulacra2=[60,70,80]), select the source image with the highest cross-format coefficient of variation (CV = std / mean) of the output metric via interpolation.
  3. Encode images at interpolated quality settings for every target value in the group, compute Butteraugli distortion maps, then compute a single aggregate anisotropic standard-deviation map across all target values. This yields one fragment region per group.
  4. Using the shared fragment, crop every variant at every target value and assemble labeled comparison grids with supplementary figures (distortion-map grids, annotated originals).
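The image-selection criterion in step 2 can be illustrated with a minimal sketch. The function names, the sample data, and the use of the population standard deviation are assumptions made for illustration; the module's own selection (which also involves interpolation) may differ in detail.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return std / mean, or 0.0 when the mean is zero."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


def pick_most_divergent_image(metric_by_image):
    """Return the image whose per-format metric values have the highest CV.

    metric_by_image maps an image name to a {format: metric value} dict.
    """
    return max(
        metric_by_image,
        key=lambda name: coefficient_of_variation(
            list(metric_by_image[name].values())
        ),
    )


# Hypothetical output-metric values (e.g. file size) per image and format.
sample = {
    "a.png": {"jpeg": 100.0, "webp": 98.0, "avif": 102.0},
    "b.png": {"jpeg": 100.0, "webp": 60.0, "avif": 140.0},
}
best = pick_most_divergent_image(sample)
```

Here "b.png" is chosen because its formats disagree far more strongly than those of "a.png", which is the property that makes a comparison figure informative.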

The comparison script reads its configuration (targets, tile parameter, excluded images) directly from the study configuration JSON, not from quality.json. This allows tuning comparison parameters without re-running the main pipeline.

This decouples the comparison figure generation from the pipeline: the pipeline is a pure encode-and-measure step, while this module independently selects images and quality settings via interpolation.

This module requires:

  • butteraugli_main (for spatial distortion maps)
  • ImageMagick 7 (montage command for grid assembly)
  • Pillow (for image cropping and nearest-neighbor scaling)
  • Encoding tools (cjpeg, cwebp, avifenc, cjxl) matching the study formats

generate_distortion_map(
original: Path,
compressed: Path,
output_pfm: Path
) → Path

Generate a raw Butteraugli distortion map as a PFM file.

Uses butteraugli_main --rawdistmap to produce a PFM file containing the actual per-pixel float distortion values. The false-colour PNG normally written by --distmap is redirected to a throw-away file inside a temporary directory and discarded.

Args:

  • original: Path to the original reference image.
  • compressed: Path to the compressed/encoded image.
  • output_pfm: Path where the raw PFM distortion map will be written.

Returns: Path to the generated PFM file (same as output_pfm).

Raises:

  • RuntimeError: If butteraugli_main fails.
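As a rough sketch of the invocation described above, the command line can be built and run as follows. The helper names and the throw-away file name are hypothetical, and the argument order is an assumption; only the --distmap and --rawdistmap flags come from the description.

```python
import subprocess
import tempfile
from pathlib import Path


def build_butteraugli_command(original, compressed, output_pfm, scratch_dir):
    """Build the argument list; the false-colour PNG goes to a scratch file."""
    throwaway_png = Path(scratch_dir) / "distmap.png"  # hypothetical name
    return [
        "butteraugli_main",
        str(original),
        str(compressed),
        "--distmap", str(throwaway_png),
        "--rawdistmap", str(output_pfm),
    ]


def run_distortion_map(original, compressed, output_pfm):
    """Run butteraugli_main and return the PFM path, raising on failure."""
    with tempfile.TemporaryDirectory() as tmp:
        cmd = build_butteraugli_command(original, compressed, output_pfm, tmp)
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            raise RuntimeError(
                f"butteraugli_main failed: {result.stderr.strip()}"
            )
    # The temporary directory (and the PNG inside it) is gone at this point.
    return Path(output_pfm)
```

Keeping the command construction separate from the subprocess call makes the argument list easy to inspect without the tool installed.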

find_worst_region(pfm_path: Path, crop_size: int = 128) → WorstRegion

Find the region with the highest distortion in a raw Butteraugli PFM file.

Reads the PFM via src.quality.read_pfm and delegates the sliding-window search to src.quality.find_worst_region_in_array.

Args:

  • pfm_path: Path to a .pfm raw distortion map.
  • crop_size: Side length of the square sliding window in pixels.

Returns:

  • WorstRegion with coordinates and average distortion score.
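A sliding-window search of this kind can be sketched with a summed-area table, which gives each window's sum in constant time. This is an illustrative pure-Python version, not the code in src.quality; the function name and the (x, y, average) return tuple are assumptions, and the window is shrunk when the map is smaller than crop_size.

```python
def find_worst_window(dist, crop_size=128):
    """Return (x, y, average) for the square window with the highest mean.

    dist is a list of equally long rows of float distortion values.
    """
    height, width = len(dist), len(dist[0])
    size = min(crop_size, height, width)

    # Summed-area table with a zero border: sat[y][x] is the sum of all
    # values above and to the left of (x, y), exclusive.
    sat = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0.0
        for x in range(width):
            row_sum += dist[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum

    best = (0, 0, float("-inf"))
    for y in range(height - size + 1):
        for x in range(width - size + 1):
            total = (
                sat[y + size][x + size]
                - sat[y][x + size]
                - sat[y + size][x]
                + sat[y][x]
            )
            average = total / (size * size)
            if average > best[2]:
                best = (x, y, average)
    return best


# A 4x4 map with a 2x2 hot spot whose top-left corner is at x=2, y=1.
demo = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 4.0],
    [0.0, 0.0, 4.0, 4.0],
    [0.0, 0.0, 0.0, 0.0],
]
worst = find_worst_window(demo, crop_size=2)
```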

crop_and_zoom(
image_path: Path,
region: WorstRegion,
zoom_factor: int = 3,
output_path: Path | None = None
) → Image

Crop a region from an image and zoom with nearest-neighbor interpolation.

Args:

  • image_path: Path to the image file
  • region: Region to crop
  • zoom_factor: Scale factor (e.g., 3 for 300%)
  • output_path: Optional path to save the result

Returns: PIL Image of the cropped and zoomed region
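To show what nearest-neighbor zooming does to a crop, here is a dependency-free sketch that works on a nested list of pixel values rather than a PIL Image. The function name and the x, y, width, height parameters are hypothetical stand-ins for the fields of WorstRegion, which are not listed here.

```python
def crop_and_zoom_pixels(pixels, x, y, width, height, zoom_factor=3):
    """Crop a rectangle and enlarge it by repeating each pixel.

    Every source pixel becomes a zoom_factor x zoom_factor block, so no
    new values are invented and compression artifacts stay sharp.
    """
    crop = [row[x:x + width] for row in pixels[y:y + height]]
    zoomed = []
    for row in crop:
        # Repeat each value horizontally, then repeat the row vertically.
        wide_row = [value for value in row for _ in range(zoom_factor)]
        zoomed.extend(list(wide_row) for _ in range(zoom_factor))
    return zoomed


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
patch = crop_and_zoom_pixels(image, x=1, y=1, width=2, height=1, zoom_factor=2)
```

Nearest-neighbor scaling is a natural choice for this purpose because smoother filters would blur exactly the blocking and ringing the figure is meant to expose.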


sort_tile_values(raw: set[str] | list[str]) → list[str]

Sort tile-parameter value strings numerically when possible.

Numeric values (e.g. effort levels ["1", "2", ..., "10"]) are sorted as floats so that "10" comes after "9" rather than after "1" (lexicographic order). Non-numeric strings fall back to plain lexicographic sort.

Args:

  • raw: Collection of tile-parameter value strings.

Returns: Sorted list of value strings.
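The behaviour described above can be sketched in a few lines. The function name is hypothetical, and treating a mixed numeric and non-numeric collection as fully lexicographic is an assumption.

```python
def sort_tile_values_sketch(raw):
    """Sort value strings as floats when all parse, else lexicographically."""
    values = list(raw)
    try:
        return sorted(values, key=float)
    except ValueError:
        # At least one value is not numeric: fall back to string order.
        return sorted(values)


efforts = sort_tile_values_sketch({"10", "9", "1", "2"})
formats = sort_tile_values_sketch(["webp", "avif", "jpeg"])
```

With these inputs, efforts is ["1", "2", "9", "10"] and formats is ["avif", "jpeg", "webp"].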


determine_varying_parameters(measurements: list[dict]) → list[str]

Determine which encoding parameters vary across measurements.

Args:

  • measurements: List of measurement dictionaries

Returns: List of parameter names that have more than one unique value
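One plausible way to detect varying parameters is to collect the distinct values per key. The sketch below assumes a flat measurement dictionary, a fixed list of candidate parameter names, and that missing or None values are ignored; none of these details are confirmed by the description above.

```python
def varying_parameters(
    measurements,
    candidates=(
        "format", "quality", "speed", "effort", "method", "chroma_subsampling",
    ),
):
    """Return the candidate names that take more than one distinct value."""
    varying = []
    for name in candidates:
        seen = {
            m.get(name) for m in measurements if m.get(name) is not None
        }
        if len(seen) > 1:
            varying.append(name)
    return varying


records = [
    {"format": "jpeg", "quality": 80, "effort": None},
    {"format": "avif", "quality": 80, "speed": 6},
]
changed = varying_parameters(records)
```

Only "format" is reported here: quality is constant, and speed appears with a single value.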


assemble_comparison_grid(
crops: list[tuple[Path, str, str]],
output_path: Path,
max_columns: int = 4,
label_font_size: int = 22,
figure_title: str | None = None,
placeholder_indices: frozenset[int] | None = None
) → Path

Assemble cropped images into a labeled grid using ImageMagick montage.

Each image is annotated with a label (encoding parameters and metrics) placed below the tile. When figure_title is given, a centred title row is prepended above the grid. Tiles listed in placeholder_indices are rendered with white-on-white invisible labels so they appear as blank spacers, preserving the grid layout when some variants are unavailable at a given quality target.

Args:

  • crops: List of (image_path, title_label, metric_label) tuples.
  • output_path: Path where the grid image will be saved.
  • max_columns: Maximum images per row.
  • label_font_size: Font size for per-tile labels.
  • figure_title: Optional title rendered above the whole grid.
  • placeholder_indices: 0-based indices into crops whose tiles should be rendered as invisible spacers (white image, white text labels).

Returns: Path to the generated comparison grid image.

Raises:

  • RuntimeError: If montage command fails.
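A montage invocation for such a grid might be assembled roughly as follows. This is a simplified sketch: the function name, the tile spacing, and the way the two labels are joined are assumptions, and the placeholder handling is left out.

```python
from pathlib import Path


def build_montage_command(
    crops, output_path, max_columns=4, label_font_size=22, figure_title=None
):
    """Build an ImageMagick 7 montage argument list for labeled tiles."""
    cmd = ["magick", "montage"]
    if figure_title:
        cmd += ["-title", figure_title]
    cmd += ["-pointsize", str(label_font_size)]
    for image_path, title_label, metric_label in crops:
        # -label applies to the image that follows it; the two-character
        # sequence \n is passed through for ImageMagick to interpret.
        cmd += ["-label", f"{title_label}\\n{metric_label}", str(image_path)]
    columns = max(1, min(max_columns, len(crops)))
    cmd += ["-tile", f"{columns}x", "-geometry", "+4+4", str(output_path)]
    return cmd


demo_cmd = build_montage_command(
    [(Path("a.png"), "jpeg q80", "ssimulacra2 70.1")],
    Path("grid.png"),
    figure_title="ssimulacra2 = 70",
)
```

The resulting list can be passed to subprocess.run, with a non-zero return code translated into the RuntimeError documented above.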

encode_image(
source_path: Path,
measurement: dict,
output_dir: Path
) → Path | None

Re-encode a source image using the parameters from a measurement record.

Uses the same encoder tools (cjpeg, cwebp, avifenc, cjxl) as the original study to reproduce the encoded image on the fly.

When the measurement includes a resolution parameter, the source image is resized to that resolution (longest edge) before encoding.

When the measurement includes a crop parameter and analysis_fragment / crop_region, the source image is cropped accordingly before encoding.

Args:

  • source_path: Path to the source image (PNG)
  • measurement: Measurement dictionary with format, quality, and optional encoder parameters (speed, effort, method, chroma_subsampling)
  • output_dir: Directory where the encoded file will be written

Returns: Path to the encoded file, or None if encoding failed
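The per-format dispatch could look something like the sketch below, which only builds the command line. The format names ("jpeg", "webp", "avif", "jxl"), the output file naming, and the exact flags chosen are assumptions; in particular, avifenc's -q option needs a recent libavif, and a cjpeg build that reads PNG input (such as MozJPEG's) is assumed. Resizing and cropping are omitted.

```python
from pathlib import Path


def build_encode_command(source_path, measurement, output_dir):
    """Return (command, output_path), or None for an unknown format."""
    fmt = measurement["format"]
    quality = str(measurement["quality"])
    source = Path(source_path)
    out_dir = Path(output_dir)

    if fmt == "jpeg":
        out = out_dir / f"{source.stem}.jpg"
        cmd = ["cjpeg", "-quality", quality, "-outfile", str(out), str(source)]
    elif fmt == "webp":
        out = out_dir / f"{source.stem}.webp"
        cmd = ["cwebp", "-q", quality]
        if measurement.get("method") is not None:
            cmd += ["-m", str(measurement["method"])]
        cmd += [str(source), "-o", str(out)]
    elif fmt == "avif":
        out = out_dir / f"{source.stem}.avif"
        cmd = ["avifenc", "-q", quality]
        if measurement.get("speed") is not None:
            cmd += ["-s", str(measurement["speed"])]
        cmd += [str(source), str(out)]
    elif fmt == "jxl":
        out = out_dir / f"{source.stem}.jxl"
        cmd = ["cjxl", str(source), str(out), "-q", quality]
        if measurement.get("effort") is not None:
            cmd += ["-e", str(measurement["effort"])]
    else:
        return None
    return cmd, out


webp_job = build_encode_command(
    Path("img.png"), {"format": "webp", "quality": 75, "method": 4}, Path("out")
)
```

Returning None for an unrecognised format mirrors the documented "None if encoding failed" contract, so callers can skip that variant instead of aborting the whole figure.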


generate_comparison(
quality_json_path: Path,
output_dir: Path,
project_root: Path,
config: ComparisonConfig | None = None
) → ComparisonResult

Generate visual comparison images using interpolation-based matching.

For each target group defined in the study configuration (e.g. ssimulacra2=[60,70,80] or bits_per_pixel=[0.8,2.4,4.0]), this function:

  1. Selects the source image with the highest cross-format coefficient of variation (CV = std / mean) of the output metric (via src.interpolation.select_best_image), respecting any image exclusions from the study config.
  2. For all target values in the group, interpolates encoder quality per format, encodes, and computes distortion maps.
  3. Computes a single aggregate anisotropic standard-deviation map across all target values, yielding one fragment region per group.
  4. Using the shared fragment, crops every variant at every target value and assembles labeled comparison grids and supplementary figures (one distortion-map + annotated original per group).

The comparison configuration (targets, tile parameter, excluded images) is read from the study configuration file when config.study_config_path is set.
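The quality-matching idea can be illustrated with plain linear interpolation over measured (quality, metric) points for one format. This is a sketch under assumptions: the function name is hypothetical, and the actual code in src.interpolation may use a different interpolation scheme or handle out-of-range targets differently.

```python
def interpolate_quality(points, target):
    """Estimate the encoder quality at which the metric equals target.

    points is a list of (quality, metric) pairs for one format. Returns
    None when the target lies outside the measured metric range.
    """
    ordered = sorted(points)
    for (q0, m0), (q1, m1) in zip(ordered, ordered[1:]):
        if m0 == m1:
            if target == m0:
                return float(q0)
            continue
        if min(m0, m1) <= target <= max(m0, m1):
            # Linear interpolation between the two bracketing measurements.
            return q0 + (target - m0) * (q1 - q0) / (m1 - m0)
    return None


# Hypothetical ssimulacra2 scores measured at three quality settings.
measured = [(50, 60.0), (70, 70.0), (90, 80.0)]
matched = interpolate_quality(measured, target=65.0)
unreachable = interpolate_quality(measured, target=95.0)
```

In this example matched is 60.0, halfway between the two bracketing quality settings, while unreachable is None because no measured setting reaches a score of 95.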

Args:

  • quality_json_path: Path to the quality.json results file.
  • output_dir: Directory where comparison images will be saved.
  • project_root: Project root directory for resolving relative paths.
  • config: Comparison configuration (uses defaults if None).

Returns:

  • ComparisonResult with per-target results.

Raises:

  • FileNotFoundError: If quality results or source images are not found.
  • RuntimeError: If image processing tools fail.

ComparisonConfig

Configuration for visual comparison generation.

Attributes:

  • crop_size: Size of the crop region in original pixels (before zoom).
  • zoom_factor: Factor to scale the crop (e.g., 3 for 300% zoom).
  • max_columns: Maximum number of images per row in the grid.
  • label_font_size: Font size for labels in the comparison grid.
  • distmap_vmax: Upper bound of the fixed Butteraugli distortion scale used in the distortion-map comparison grid. All per-pixel values are clamped to [0, distmap_vmax] before mapping to the viridis colormap, ensuring every tile uses an identical colour scale so structural differences between encoding variants are directly comparable. Defaults to 5.0.
  • source_image: Optional explicit source image path (relative to project root) to use instead of automatic selection.
  • tile_parameter: The encoding parameter that varies within each comparison figure, so that each tile shows a different value of this parameter. When None, the value is taken from the study configuration file. If that is also absent, the built-in heuristic is used: "format" when multiple formats vary, otherwise the first non-quality sweep parameter.
  • study_config_path: Path to the study configuration JSON file. When provided, comparison targets, tile parameter, and excluded images are read from this file instead of from quality.json metadata.
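The fixed colour scale controlled by distmap_vmax amounts to clamping and normalising every distortion value before the colormap lookup. A minimal sketch, with a hypothetical function name and the colormap step itself omitted:

```python
def normalise_distortion(values, vmax=5.0):
    """Clamp values to [0, vmax] and scale them to [0, 1]."""
    return [min(max(v, 0.0), vmax) / vmax for v in values]


scaled = normalise_distortion([-1.0, 2.5, 10.0])
```

Because every tile is divided by the same vmax rather than by its own maximum, a given colour always corresponds to the same Butteraugli distortion value across the grid.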

__init__(
crop_size: int = 128,
zoom_factor: int = 3,
max_columns: int = 4,
label_font_size: int = 22,
distmap_vmax: float = 5.0,
source_image: str | None = None,
tile_parameter: str | None = None,
study_config_path: Path | None = None
) → None

TargetComparisonResult

Result for one target-value comparison figure.

Attributes:

  • target_metric: The metric being matched (e.g. "ssimulacra2").
  • target_value: The target value (e.g. 70).
  • source_image: Path (relative to project root) of the selected source image.
  • region: The detected worst fragment coordinates.
  • interpolated_qualities: Mapping from format name to the interpolated encoder quality setting used.
  • output_images: List of generated comparison image paths.

__init__(
target_metric: str,
target_value: float,
source_image: str,
region: WorstRegion,
interpolated_qualities: dict[str, float],
output_images: list[Path] = <factory>
) → None

ComparisonResult

Result of the visual comparison generation.

Attributes:

  • study_id: Study identifier.
  • target_results: List of per-target-value results.
  • varying_parameters: Parameters that vary across measurements.

__init__(
study_id: str,
target_results: list[TargetComparisonResult] = <factory>,
varying_parameters: list[str] = <factory>
) → None