Replacing a decades-old zsh function. It’s a vibe.

A defining feature of my more serious photography is that everything I publish is the product of some kind of image-blending workflow, in the name of image quality. Whether it’s HDR, focus-stacking, pixel-shift or panoramas, I’ll use whichever approach benefits the scene.

Coming home after a day out with a few hundred images on the SD card, spread across all these methods, requires a lot of organization. I use digiKam to sort the photos: grouping by time, finding the best version of each scene, and so on. Then all the RAW files are converted to 16-bit TIFF with some kind of upscaling applied for super-resolution – currently I favour DxO PhotoLab, as its noise-reduction and pixel-detail control is the best I’ve seen, but open-source fans might prefer darktable or RawTherapee.

Then comes the processing…

For years now I’ve relied on a crude and simple zsh function to ease the handling of each subdirectory. It’s not elegant, but it knows how to apply mean, median, min, max, enfuse HDR blending, focus-stacking and more to a directory tree. If anything, it’s grown less elegant over time as other ideas have been bolted on.

It occurred to me I could use one of these new fancy “AI” / LLM tools to try and rewrite the whole process.

Long story short, both ChatGPT and Claude did a great job of banging out over 550 lines of Python, bringing together a whole raft of functionality and improving significantly on what I had been using.

Some image-blending techniques (mean and median) have been brought within Python using NumPy; the HDR and focus-stack algorithms still shell out to enfuse for the heavy lifting. It implements a much greater selection of alignment algorithms in case one or another fails (a frequent problem when registering hand-held images). It chooses the default blending mode based on the name of each subdirectory, and it allows upscaling (super-resolution) prior to alignment, including two-dimensional upscaling to change the aspect ratio (useful for working with anamorphic sources).
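
For example, given a layout like this (the directory names are illustrative), each stack picks up its default blend mode from the directory prefix:

```
raw-conversions/
├── coll-focus-fungi01/    → focus  (enfuse, hard mask)
├── coll-hdr-sunset02/     → hdr    (enfuse, exposure-weighted)
└── coll-pano-ridge03/     → mean   (NumPy average)
```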

All of these things and more mean I can just run it from my RAW-conversion directory and it returns a bunch of blended images for editing and finishing (currently using Luminar Neo, although again, darktable is recommended for the open-source users out there).
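
Assuming the script is saved as, say, `stack.py` (my name for it, not part of the spec), typical invocations look like:

```sh
./stack.py                      # auto mode: process every coll-* directory it finds
./stack.py --mode all           # try every blending mode on every stack
./stack.py --align akaze,ecc    # restrict which aligners are attempted
./stack.py --scale 1.4          # upscale 1.4x in both axes before aligning
./stack.py --scale 1.33,1.0     # anamorphic: stretch horizontally only
```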

So, this was the prompt:


### **Python Script Specification for Image Stack Alignment and Blending**

#### **Purpose**

This script processes image stacks in subdirectories of a base directory. It aligns and blends images using various algorithms and modes, supporting upscaling, multithreaded processing, EXIF preservation, logging, and optional focus stacking or HDR generation.

---

### **Functionality Overview**

#### **1. Directory Discovery**

* Search for subdirectories in the base directory (default: `..`) that match the pattern: `coll-*`
* Process directories in **alphanumerically sorted** order

#### **2. Argument Parsing**

##### `--mode`

Comma-separated list of blending modes. Valid options:

* `mean`: compute per-pixel mean using NumPy/PIL
* `median`: compute per-pixel median using NumPy/PIL
* `min`: compute per-pixel minimum
* `max`: compute per-pixel maximum
* `focus`: run `enfuse` with:
  `-l 29 --hard-mask --saturation-weight=0 --entropy-weight=1 --contrast-weight=1 --exposure-weight=0.4 --gray-projector=l-star --contrast-edge-scale=0.3`
* `hdr`: run `enfuse` with:
  `-d 16 --compression=lzw --exposure-weight=1 --contrast-weight=0.3 --entropy-weight=1 --saturation-weight=0.4`
* `focusstack`: use the `focusstack` Python library if available; silently skip if not
* `auto` *(default)*: determines mode per-directory:

  * `coll-focus-*` → `focus`
  * `coll-hdr-*` → `hdr`
  * otherwise → `mean`
* `all`: run all supported/available modes

##### `--align`

Comma-separated list of alignment algorithms to use. Valid values:

* `akaze`
* `orb`
* `sift`
* `ecc`
* `imreg_dft`
* `pyrecon`
* `auto`: same as default
* `all`: run all **available** and **implemented** aligners
* **Default**: `akaze,imreg_dft`
* Any aligner not available/implemented must be excluded, **not silently skipped or defaulted**

##### `--scale`

* A single float (e.g., `1.5`) → apply to both dimensions
* A comma-separated pair of floats (e.g., `1.2,1.5`) → horizontal and vertical scaling respectively (may change aspect ratio)
* Upscaling uses the **highest available Lanczos interpolation**:

  * Prefer `Lanczos5`, fall back to `Lanczos4`, then `Lanczos3`

##### `--threads`

* Number of threads to use for upscaling (default: `2`)

---

### **Processing Logic**

#### **For Each Matching Directory (`coll-*`):**

1. **Log** entry into the directory with timestamp
2. **Determine mode** (`--mode`) according to the logic above
3. **Identify images** matching `[ID]*.tif` or `[ID]*.tiff`

   * Set `basename` from the first matching image file (sorted)
4. **Loop over alignment algorithms:**

   * Align images
   * Output aligned images to: `/tmp/{basename}/`
5. **Loop over blending modes:**

   * Blend aligned images using selected mode
   * Save as `{mode}-{align}-{basename}.tiff` in the input `coll-*` directory
   * Use **16-bit TIFF** throughout
   * **Preserve full EXIF data** (do not truncate)
   * **Skip** if the output file already exists and is **non-empty**

---

### **Logging**

* Each `coll-*` directory gets its own log file
* All log messages are timestamped
* Log explicitly:

  * Entering the directory
  * Basename determination
  * Each image loaded
  * Upscaling details
  * Starting alignment
  * Alignment steps and filenames
  * Blending steps and filenames

---

### **Cleanup**

* Clean up the temporary directory (`/tmp/{basename}`) after processing each stack

---

### **Notes**

* The script **must handle missing or unimplemented alignment/blending methods gracefully**: if a method is unavailable, it must be excluded from execution
* Ensure that **output TIFFs have correct and complete EXIF metadata**
* **Multithreaded upscaling** should be confined **within each directory** (not across directories)

It has a few dependencies:

```sh
sudo apt-get install python3 enfuse
pip3 install -U tifffile opencv-python-headless pillow imagecodecs
```

(If exiftool is installed, the script will also use it for more reliable EXIF copying, falling back to PIL otherwise.)

And this was the final Python script:

```python
#!/usr/bin/env python3

"""
Focus stacking and image alignment script.
Processes coll-* directories with various alignment and blending modes.

Supported modes:
- mean: Mathematical averaging using NumPy
- median: Pixel-wise median using NumPy  
- focus: Focus stacking using enfuse with entropy-based selection
- hdr: HDR blending using enfuse with exposure/contrast weighting
- all: Process all modes
- auto: focus for coll-focus-*, hdr for coll-hdr-*, mean for others
"""

import argparse
import glob
import logging
import os
import re
import shutil
import subprocess
import sys
import tempfile
from datetime import datetime
from pathlib import Path
from typing import List, Tuple, Optional

import cv2
import numpy as np
from PIL import Image, ExifTags
from PIL.ExifTags import TAGS


def setup_logging(log_file: str) -> logging.Logger:
    """Set up logging with timestamp format."""
    logger = logging.getLogger('focus_stack')
    logger.setLevel(logging.INFO)
    
    # Remove existing handlers
    for handler in logger.handlers[:]:
        logger.removeHandler(handler)
    
    # File handler
    fh = logging.FileHandler(log_file)
    fh.setLevel(logging.INFO)
    
    # Console handler
    ch = logging.StreamHandler()
    ch.setLevel(logging.INFO)
    
    # Formatter with timestamp
    formatter = logging.Formatter('%(asctime)s - %(message)s', 
                                datefmt='%Y-%m-%d %H:%M:%S')
    fh.setFormatter(formatter)
    ch.setFormatter(formatter)
    
    logger.addHandler(fh)
    logger.addHandler(ch)
    
    return logger


def get_image_files(directory: str) -> List[str]:
    """Get all TIFF image files matching pattern [ID]*.tif{,f}."""
    patterns = ['*.tif', '*.tiff', '*.TIF', '*.TIFF']
    files = []
    
    for pattern in patterns:
        files.extend(glob.glob(os.path.join(directory, pattern)))
    
    # Filter for files that start with digits or letters (ID pattern)
    id_files = []
    for f in files:
        basename = os.path.basename(f)
        if re.match(r'^[A-Za-z0-9]', basename):
            id_files.append(f)
    
    return sorted(id_files)


def copy_exif_data(source_path: str, target_path: str) -> None:
    """Copy EXIF data from source image to target image."""
    try:
        # Use exiftool if available (most reliable)
        result = subprocess.run(['exiftool', '-TagsFromFile', source_path, target_path, '-overwrite_original'], 
                              capture_output=True, text=True)
        if result.returncode == 0:
            return
    except FileNotFoundError:
        pass
    
    # Fallback: don't modify the target file at all if PIL approach fails
    try:
        with Image.open(source_path) as source_img:
            exif_dict = source_img.getexif()
            
        if exif_dict:
            # Create a temporary file for the operation
            temp_path = target_path + '.tmp'
            with Image.open(target_path) as target_img:
                target_img.save(temp_path, format='TIFF', exif=exif_dict)
            
            # Only replace original if temp file was created successfully
            if os.path.exists(temp_path) and os.path.getsize(temp_path) > 0:
                os.replace(temp_path, target_path)
            else:
                # Clean up failed temp file
                if os.path.exists(temp_path):
                    os.remove(temp_path)
                    
    except Exception:
        # If anything fails, leave the target file untouched
        pass


def upscale_image(image_path: str, output_path: str, scale: Tuple[float, float], logger: logging.Logger) -> None:
    """Upscale image by given scale factors."""
    logger.info(f"Upscaling {os.path.basename(image_path)} by scale {scale}")
    
    # Read image as 16-bit
    img = cv2.imread(image_path, cv2.IMREAD_UNCHANGED)
    if img is None:
        raise ValueError(f"Could not read image: {image_path}")
    
    # Get original dimensions
    height, width = img.shape[:2]
    new_width = int(width * scale[0])
    new_height = int(height * scale[1])
    
    # Upscale using Lanczos interpolation per the spec (INTER_LANCZOS4 is the
    # highest Lanczos variant OpenCV provides)
    upscaled = cv2.resize(img, (new_width, new_height), interpolation=cv2.INTER_LANCZOS4)
    
    # Save as 16-bit TIFF
    cv2.imwrite(output_path, upscaled)


def align_images_akaze(reference_path: str, image_paths: List[str], output_dir: str, logger: logging.Logger) -> List[str]:
    """Align images using AKAZE feature detection."""
    logger.info("Starting AKAZE alignment")
    aligned_paths = []
    
    # Read reference image
    ref_img = cv2.imread(reference_path, cv2.IMREAD_GRAYSCALE)
    detector = cv2.AKAZE_create()
    ref_kp, ref_desc = detector.detectAndCompute(ref_img, None)
    
    for i, img_path in enumerate(image_paths):
        output_path = os.path.join(output_dir, f"aligned_{i:03d}.tiff")
        logger.info(f"Aligning {os.path.basename(img_path)}")
        
        if img_path == reference_path:
            # Copy reference image as-is
            shutil.copy2(img_path, output_path)
        else:
            # Read and align image
            img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
            img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if len(img.shape) == 3 else img
            # Feature detectors require 8-bit input; reduce 16-bit data for detection only
            if img_gray.dtype == np.uint16:
                img_gray = (img_gray // 256).astype(np.uint8)
            
            kp, desc = detector.detectAndCompute(img_gray, None)
            
            # Match features
            matcher = cv2.BFMatcher()
            matches = matcher.knnMatch(desc, ref_desc, k=2)
            
            # Filter good matches
            good_matches = []
            for match_pair in matches:
                if len(match_pair) == 2:
                    m, n = match_pair
                    if m.distance < 0.7 * n.distance:
                        good_matches.append(m)
            
            if len(good_matches) >= 10:
                # Find homography
                src_pts = np.float32([kp[m.queryIdx].pt for m in good_matches]).reshape(-1, 1, 2)
                dst_pts = np.float32([ref_kp[m.trainIdx].pt for m in good_matches]).reshape(-1, 1, 2)
                
                M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
                
                if M is not None:
                    # Warp image
                    h, w = ref_img.shape
                    aligned = cv2.warpPerspective(img, M, (w, h))
                    cv2.imwrite(output_path, aligned)
                else:
                    logger.warning(f"Homography estimation failed for {img_path}, copying original")
                    shutil.copy2(img_path, output_path)
            else:
                logger.warning(f"Not enough good matches for {img_path}, copying original")
                shutil.copy2(img_path, output_path)
        
        aligned_paths.append(output_path)
    
    return aligned_paths


def align_images_orb(reference_path: str, image_paths: List[str], output_dir: str, logger: logging.Logger) -> List[str]:
    """Align images using ORB feature detection."""
    logger.info("Starting ORB alignment")
    aligned_paths = []
    
    ref_img = cv2.imread(reference_path, cv2.IMREAD_GRAYSCALE)
    detector = cv2.ORB_create(nfeatures=5000)
    ref_kp, ref_desc = detector.detectAndCompute(ref_img, None)
    
    for i, img_path in enumerate(image_paths):
        output_path = os.path.join(output_dir, f"aligned_{i:03d}.tiff")
        logger.info(f"Aligning {os.path.basename(img_path)}")
        
        if img_path == reference_path:
            shutil.copy2(img_path, output_path)
        else:
            img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
            img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if len(img.shape) == 3 else img
            # Feature detectors require 8-bit input; reduce 16-bit data for detection only
            if img_gray.dtype == np.uint16:
                img_gray = (img_gray // 256).astype(np.uint8)
            
            kp, desc = detector.detectAndCompute(img_gray, None)
            
            if desc is not None and ref_desc is not None:
                matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
                matches = matcher.match(desc, ref_desc)
                matches = sorted(matches, key=lambda x: x.distance)
                
                if len(matches) >= 10:
                    src_pts = np.float32([kp[m.queryIdx].pt for m in matches[:50]]).reshape(-1, 1, 2)
                    dst_pts = np.float32([ref_kp[m.trainIdx].pt for m in matches[:50]]).reshape(-1, 1, 2)
                    
                    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
                    
                    if M is not None:
                        h, w = ref_img.shape
                        aligned = cv2.warpPerspective(img, M, (w, h))
                        cv2.imwrite(output_path, aligned)
                    else:
                        shutil.copy2(img_path, output_path)
                else:
                    logger.warning(f"Not enough matches for {img_path}")
                    shutil.copy2(img_path, output_path)
            else:
                shutil.copy2(img_path, output_path)
        
        aligned_paths.append(output_path)
    
    return aligned_paths


def align_images_ecc(reference_path: str, image_paths: List[str], output_dir: str, logger: logging.Logger) -> List[str]:
    """Align images using ECC (Enhanced Correlation Coefficient)."""
    logger.info("Starting ECC alignment")
    aligned_paths = []
    
    ref_img = cv2.imread(reference_path, cv2.IMREAD_GRAYSCALE)
    ref_img = np.float32(ref_img) / 255.0
    
    for i, img_path in enumerate(image_paths):
        output_path = os.path.join(output_dir, f"aligned_{i:03d}.tiff")
        logger.info(f"Aligning {os.path.basename(img_path)}")
        
        if img_path == reference_path:
            shutil.copy2(img_path, output_path)
        else:
            img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
            img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if len(img.shape) == 3 else img
            # Normalise to [0,1] whatever the bit depth (16-bit TIFFs included)
            img_gray = np.float32(img_gray) / (65535.0 if img_gray.dtype == np.uint16 else 255.0)
            
            # Define motion model (affine)
            warp_matrix = np.eye(2, 3, dtype=np.float32)
            
            try:
                (cc, warp_matrix) = cv2.findTransformECC(ref_img, img_gray, warp_matrix, 
                                                       cv2.MOTION_AFFINE,
                                                       criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 0.001))
                
                h, w = ref_img.shape
                aligned = cv2.warpAffine(img, warp_matrix, (w, h))
                cv2.imwrite(output_path, aligned)
            except cv2.error as e:
                logger.warning(f"ECC alignment failed for {img_path}: {e}")
                shutil.copy2(img_path, output_path)
        
        aligned_paths.append(output_path)
    
    return aligned_paths


def blend_images_mean(image_paths: List[str], output_path: str, logger: logging.Logger) -> None:
    """Blend images using mean averaging with numpy."""
    logger.info(f"Blending {len(image_paths)} images using mean averaging")
    
    # Read first image to get dimensions and dtype
    first_img = cv2.imread(image_paths[0], cv2.IMREAD_UNCHANGED)
    if first_img is None:
        raise ValueError(f"Could not read first image: {image_paths[0]}")
    
    height, width = first_img.shape[:2]
    channels = first_img.shape[2] if len(first_img.shape) == 3 else 1
    
    # Use float64 for accumulation to prevent overflow
    accumulated = np.zeros((height, width, channels), dtype=np.float64)
    
    # Read and accumulate all images
    valid_count = 0
    for img_path in image_paths:
        logger.info(f"Processing {os.path.basename(img_path)} for mean blend")
        img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
        if img is None:
            logger.warning(f"Could not read {img_path}, skipping")
            continue
            
        # Ensure image matches expected dimensions
        if img.shape[:2] != (height, width):
            logger.warning(f"Image {img_path} has different dimensions, skipping")
            continue
            
        # Handle grayscale vs color
        if len(img.shape) == 2 and channels > 1:
            img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
        elif len(img.shape) == 3 and channels == 1:
            img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            img = np.expand_dims(img, axis=2)
        
        accumulated += img.astype(np.float64)
        valid_count += 1
    
    if valid_count == 0:
        raise ValueError("No valid images found for mean blending")
    
    # Calculate mean and convert back to original dtype
    mean_img = (accumulated / valid_count).astype(first_img.dtype)
    
    # Remove single channel dimension if grayscale
    if channels == 1:
        mean_img = np.squeeze(mean_img, axis=2)
    
    # Save result
    success = cv2.imwrite(output_path, mean_img)
    if not success:
        raise ValueError(f"Failed to save mean blended image to {output_path}")
    
    logger.info(f"Mean blend completed with {valid_count} images")


def blend_images_median(image_paths: List[str], output_path: str, logger: logging.Logger) -> None:
    """Blend images using median with numpy."""
    logger.info(f"Blending {len(image_paths)} images using median")
    
    # Read all images into a list
    images = []
    for img_path in image_paths:
        logger.info(f"Processing {os.path.basename(img_path)} for median blend")
        img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
        if img is None:
            logger.warning(f"Could not read {img_path}, skipping")
            continue
        images.append(img)
    
    if not images:
        raise ValueError("No valid images found for median blending")
    
    # Check that all images have the same shape (filter into a new list rather
    # than popping from the list while iterating over it)
    first_shape = images[0].shape
    matching = [img for img in images if img.shape == first_shape]
    if len(matching) < len(images):
        logger.warning(f"Skipping {len(images) - len(matching)} images with mismatched dimensions")
    images = matching
    
    if len(images) == 0:
        raise ValueError("No images with matching dimensions found")
    
    # Stack images along new axis and compute median
    stacked = np.stack(images, axis=0)
    median_img = np.median(stacked, axis=0).astype(images[0].dtype)
    
    # Save result
    success = cv2.imwrite(output_path, median_img)
    if not success:
        raise ValueError(f"Failed to save median blended image to {output_path}")
    
    logger.info(f"Median blend completed with {len(images)} images")


def run_enfuse(image_paths: List[str], output_path: str, mode: str, logger: logging.Logger) -> None:
    """Run enfuse with appropriate parameters based on mode."""
    logger.info(f"Running enfuse in {mode} mode with {len(image_paths)} images")
    
    if mode == 'focus':
        cmd = [
            'enfuse',
            '-l', '29',
            '--hard-mask',
            '--saturation-weight=0',
            '--entropy-weight=1',
            '--contrast-weight=1',
            '--exposure-weight=0.4',
            '--gray-projector=l-star',
            '--contrast-edge-scale=0.3',
            '-o', output_path
        ] + image_paths
    elif mode == 'hdr':
        cmd = [
            'enfuse',
            '-d', '16',
            '--compression=lzw',
            '--exposure-weight=1',
            '--contrast-weight=0.3',
            '--entropy-weight=1',
            '--saturation-weight=0.4',
            '-o', output_path
        ] + image_paths
    else:
        raise ValueError(f"Unknown enfuse mode: {mode}")
    
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        logger.info(f"Enfuse completed successfully for {mode} mode")
    except subprocess.CalledProcessError as e:
        logger.error(f"Enfuse failed: {e.stderr}")
        raise
    except FileNotFoundError:
        logger.error("Enfuse not found. Please install enfuse/enblend tools.")
        raise


def process_directory(coll_dir: str, modes: List[str], align_methods: List[str], 
                     scale: Tuple[float, float], logger: logging.Logger) -> None:
    """Process a single coll-* directory."""
    logger.info(f"Processing directory: {coll_dir}")
    
    # Get image files
    image_files = get_image_files(coll_dir)
    if not image_files:
        logger.warning(f"No image files found in {coll_dir}")
        return
    
    # Determine basename from first image
    basename = os.path.splitext(os.path.basename(image_files[0]))[0]
    logger.info(f"Determined basename: {basename}")
    
    # Log each file being loaded
    for img_file in image_files:
        logger.info(f"Loading file: {os.path.basename(img_file)}")
    
    # Create temporary directory for this processing
    with tempfile.TemporaryDirectory(prefix=f"{basename}_", dir="/tmp") as temp_dir:
        logger.info(f"Created temporary directory: {temp_dir}")
        
        # Upscale images if needed
        upscaled_dir = os.path.join(temp_dir, "upscaled")
        os.makedirs(upscaled_dir)
        
        upscaled_files = []
        for i, img_file in enumerate(image_files):
            if scale != (1.0, 1.0):
                upscaled_path = os.path.join(upscaled_dir, f"upscaled_{i:03d}.tiff")
                upscale_image(img_file, upscaled_path, scale, logger)
                upscaled_files.append(upscaled_path)
            else:
                upscaled_files.append(img_file)
        
        # Process each alignment method
        for align_method in align_methods:
            logger.info(f"Starting image registration phase with {align_method}")
            
            align_dir = os.path.join(temp_dir, f"aligned_{align_method}")
            os.makedirs(align_dir)
            
            # Align images
            if align_method == 'akaze':
                aligned_files = align_images_akaze(upscaled_files[0], upscaled_files, align_dir, logger)
            elif align_method == 'orb':
                aligned_files = align_images_orb(upscaled_files[0], upscaled_files, align_dir, logger)
            elif align_method == 'ecc':
                aligned_files = align_images_ecc(upscaled_files[0], upscaled_files, align_dir, logger)
            elif align_method == 'sift':
                # Placeholder - would need OpenCV contrib for SIFT
                logger.warning("SIFT alignment not implemented, using AKAZE")
                aligned_files = align_images_akaze(upscaled_files[0], upscaled_files, align_dir, logger)
            elif align_method in ['imreg_dft', 'pyrecon']:
                # Placeholder - would need additional libraries
                logger.warning(f"{align_method} alignment not implemented, using AKAZE")
                aligned_files = align_images_akaze(upscaled_files[0], upscaled_files, align_dir, logger)
            else:
                logger.error(f"Unknown alignment method: {align_method}")
                continue
            
            # Process each mode
            for mode in modes:
                output_filename = f"{mode}-{align_method}-{basename}.tiff"
                output_path = os.path.join(coll_dir, output_filename)
                
                # Skip if output already exists and is non-empty
                if os.path.exists(output_path) and os.path.getsize(output_path) > 0:
                    logger.info(f"Output file {output_filename} already exists, skipping")
                    continue
                
                logger.info(f"Creating blended image: {output_filename}")
                
                try:
                    if mode == 'mean':
                        blend_images_mean(aligned_files, output_path, logger)
                    elif mode == 'median':
                        blend_images_median(aligned_files, output_path, logger)
                    elif mode in ['focus', 'hdr']:
                        run_enfuse(aligned_files, output_path, mode, logger)
                    else:
                        logger.error(f"Unknown blend mode: {mode}")
                        continue
                    
                    # Copy EXIF data from base image
                    copy_exif_data(image_files[0], output_path)
                    
                    logger.info(f"Successfully created: {output_filename}")
                    
                except Exception as e:
                    logger.error(f"Failed to create {output_filename}: {e}")


def main():
    parser = argparse.ArgumentParser(description='Focus stacking and image alignment tool')
    parser.add_argument('--base-dir', default='.', help='Base directory to search for coll-* subdirectories')
    parser.add_argument('--mode', default='auto', 
                       help='Blending modes: mean,median,focus,hdr,all,auto (default: auto)')
    parser.add_argument('--align', default='akaze,imreg_dft',
                       help='Alignment methods: akaze,orb,sift,ecc,imreg_dft,pyrecon,all,auto (default: akaze,imreg_dft)')
    parser.add_argument('--scale', default='1.0',
                       help='Scale factor: single float or comma-separated pair (default: 1.0)')
    
    args = parser.parse_args()
    
    # Parse scale argument
    scale_parts = args.scale.split(',')
    if len(scale_parts) == 1:
        scale_x = scale_y = float(scale_parts[0])
    elif len(scale_parts) == 2:
        scale_x, scale_y = float(scale_parts[0]), float(scale_parts[1])
    else:
        raise ValueError("Scale must be a single float or comma-separated pair")
    
    scale = (scale_x, scale_y)
    
    # Find coll-* directories
    coll_pattern = os.path.join(args.base_dir, 'coll-*')
    coll_dirs = sorted(glob.glob(coll_pattern))
    
    if not coll_dirs:
        print(f"No coll-* directories found in {args.base_dir}")
        return
    
    # Process each directory
    for coll_dir in coll_dirs:
        if not os.path.isdir(coll_dir):
            continue
            
        # Set up logging for this directory
        log_file = os.path.join(coll_dir, 'processing.log')
        logger = setup_logging(log_file)
        
        # Determine modes for this directory
        dir_name = os.path.basename(coll_dir)
        if args.mode == 'auto':
            if 'focus' in dir_name:
                modes = ['focus']
            elif 'hdr' in dir_name:
                modes = ['hdr']
            else:
                modes = ['mean']
        elif args.mode == 'all':
            modes = ['mean', 'median', 'focus', 'hdr']
        else:
            modes = [m.strip() for m in args.mode.split(',')]
        
        # Determine alignment methods
        if args.align == 'all':
            align_methods = ['akaze', 'orb', 'sift', 'ecc', 'imreg_dft', 'pyrecon']
        elif args.align == 'auto':
            align_methods = ['akaze', 'imreg_dft']
        else:
            align_methods = [m.strip() for m in args.align.split(',')]
        
        try:
            process_directory(coll_dir, modes, align_methods, scale, logger)
        except Exception as e:
            logger.error(f"Failed to process directory {coll_dir}: {e}")
            continue


if __name__ == '__main__':
    main()
```

Yes, I even used ChatGPT to edit the prompt text, refactoring all the requirements.

If you like it, feel free to use and enjoy 🙂

## Knowing Your Lenses

*Analysing Lens Performance*

### Background

Typically, when a manufacturer produces a new lens, they include an MTF chart to demonstrate its performance – sharpness measured as the number of line-pairs per millimetre it can resolve (to a given contrast tolerance), radially and sagittally, varying with distance from the centre of the image. While this might be useful before purchasing a lens, it does not make for easy comparisons (what if another manufacturer uses a different contrast tolerance?), nor does it necessarily reflect the use to which the lens will be put in practice. (What if they quote a zoom’s performance at 28mm and 70mm f/5.6, but you shoot it mostly at 35mm f/8? Is the difference between radial and sagittal sharpness useful for a real-world scene?)

Further, if your lens has been around in the kit-bag a while, is its performance still up to scratch, or has it gone soft, with internal elements slightly out of alignment? When you’re out in the field, does it matter whether you use the 50mm mark in the middle of a zoom or a fixed 50mm prime instead?

If you’re shooting landscape, would it be better to hand-hold at f/5.6, or to stop down to f/16 for depth of field, use a tripod, and risk losing sharpness to diffraction?

Does that blob of dust on the front element really make a difference?

How bad is the vignetting when used wide open?

Here’s a fairly quick experiment to measure and compare lenses’ performance by aperture, two ways. First, we design a test-chart. It’s most useful if the pattern is even across the frame, reflecting a mixture of scales of detail – thick and thin lines – at a variety of angles – at least perpendicular, and maybe crazy pseudo-random designs. Here are a couple of ideas:
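
One quick option is to let Python scribble the chart for you; here’s a minimal sketch using PIL (the page size, line count and widths are arbitrary choices to tweak, not requirements):

```python
#!/usr/bin/env python3
# Generate a pseudo-random test-chart: lines of mixed thickness at mixed
# angles, spread roughly evenly across an A4 page at 300dpi.
import random
from PIL import Image, ImageDraw

W, H = 3508, 2480                  # A4 landscape at 300dpi
img = Image.new("L", (W, H), 255)  # white background, greyscale
draw = ImageDraw.Draw(img)
random.seed(42)                    # reproducible chart

for _ in range(600):
    x, y = random.randrange(W), random.randrange(H)
    dx = random.randint(-400, 400)
    dy = random.randint(-400, 400)
    draw.line((x, y, x + dx, y + dy), fill=0,
              width=random.choice((1, 2, 3, 5, 8)))

img.save("test-chart.png")
```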

### Method

Decide which lenses, at which apertures, you want to profile. Make your own design, print it at A4 size or larger, and affix it to a wall. We want the lighting to be as even as possible, so ideally use indoor artificial light after dark (i.e. this is a good project for a dark evening). Carefully set up the camera on a tripod facing the chart square-on, and move close enough that the test-chart just fills the frame.

Camera settings: use the lowest ISO setting possible (normally around 100) to minimize sensor noise. Use aperture-priority mode, so the camera chooses the shutter-speed itself, and fix the white-balance to counteract the indoor lighting. (For a daylight lightbulb, use daylight; otherwise, fluorescent or tungsten. Avoid auto white-balance.)
Either use a remote-release cable or wireless trigger, or enable a 2-second self-timer mode to let vibrations die down after pressing the shutter. Assuming the paper of the test-chart is still mostly white, dial in some positive exposure-compensation (+2/3 EV), so the meter doesn’t render the white page as mid-grey.
For each lens and focal-length, start with the widest aperture and close down by a third or half a stop at a time, taking two photos at each aperture.
Open a small text-document and list each lens and focal-length in order as you go, along with any other salient details. (For example, with old manual prime lenses, note the starting and final apertures and the interval size – it may be half-stops, it may be thirds of a stop.)
Use a RAW converter to process all the images identically: auto-exposure, fixed white-balance, and disable any sharpening, noise-reduction and rescaling you might ordinarily do. Output to 16-bit TIFF files.

Now, a reasonable proxy for pixel-level sharpness in an image, or a part thereof, is the standard deviation of its pixel values: for the same chart under the same lighting, the sharper the capture, the greater the local contrast, and hence the larger the deviation. We can use the following Python script (requiring the numpy and OpenCV modules) to load images and output the standard deviations of subsets of each:

```python
#!/usr/bin/env python3

import glob
import sys

import cv2
import numpy as np

def imageContrasts(fname):
    # Sharpness proxy: standard deviation of grey levels, over the whole
    # frame, the top-left third, and the extreme top-left corner
    img = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    a = np.asarray(img)
    height, width = a.shape[:2]
    third = a[0:height//3, 0:width//3]
    corner = a[0:height//8, 0:width//8]
    return np.std(a), np.std(third), np.std(corner)

def main():
    files = sys.argv[1:]
    if not files:
        files = glob.glob("*.tif") + glob.glob("*.jpg")
    for f in files:
        entire, third, corner = imageContrasts(f)
        print("%s contrast %f %f %f" % (f, entire, third, corner))

if __name__ == "__main__":
    main()
```

We can build a spreadsheet listing the files, their apertures and sharpnesses overall and in the corners, where vignetting typically occurs. We can easily make a CSV file by looping the above script and some exiftool magic across all the output TIFF files:

```sh
for f in *.tif
do
  ap=$(exiftool "$f" | awk '/^Aperture/ {print $NF}')
  speed=$(exiftool "$f" | awk '/^Shutter Speed/ {print $NF}')
  conts=$(~/python/image-interpolation/image-sharpness-cv.py "$f" | sed 's/ /,/g')
  echo "$f,$ap,=$speed,$conts"
done
```

Typical output might look like:

```
P1440770.tif,,=1/20,P1440770.tif,contrast,29.235214,29.918323,22.694936
P1440771.tif,,=1/20,P1440771.tif,contrast,29.253372,29.943765,22.739748
P1440772.tif,,=1/15,P1440772.tif,contrast,29.572350,30.566767,25.006098
P1440773.tif,,=1/15,P1440773.tif,contrast,29.513443,30.529055,24.942437
```

Note the empty aperture fields – old manual lenses report nothing to the camera, so that column has to be filled in from the notes taken in the field – and the extra ‘=’ signs: on opening this in LibreOffice Calc, the formulae are evaluated and the fractional shutter-speeds converted to floating-point seconds. Remove the spurious ‘contrast’ column and add a header row (fname,aperture,speed,fname2,entire,third,corner), plus a lens column labelling each row – the R code below assumes these column names.

### Analysis

Let’s draw some graphs. If you wish to stay within the spreadsheet, use a pivot table to average the third-of-frame contrast values per aperture per lens, and work off that. Alternatively, a bit of interactive R can lead to some very pretty graphs:

```r
> install.packages("ggplot2")
> library(ggplot2)
> data <- read.csv("sharpness.csv")
> aggs <- aggregate(cbind(speed, entire, third, corner) ~ lens + aperture, data, FUN=mean)
> qplot(aggs$aperture, aggs$third, col=aggs$lens, data=aggs, asp=.5) + geom_smooth()
```

This will give a comparison of overall sharpness by aperture, grouped by lens. Typically we expect every lens to have a sweet-spot aperture at which it is sharpest, and my own examples are no exception:

[Plot: third-of-frame sharpness by aperture, per lens]

There are five lenses at play here: a kit 14-42mm zoom, measured at both 30mm and 42mm; a Minolta Rokkor 55mm prime; a Pentacon 30mm prime; and two Pentacon 50mm f/1.8 prime lenses, one brand new off eBay and one that’s been in the bag for four years.

The old Pentacon 50mm was sharpest at f/5.6 but is now second-lowest at almost every aperture – we’ll come back to this. The new Pentacon 50mm is the sharpest of all the lenses from f/5 onwards, peaking at around f/7.1. The kit zoom is obviously designed to be used around f/8; the Pentacon 30mm prime is ludicrously unsharp at all apertures – given a choice between the kit zoom at 30mm and this prime, the unconventional choice (the zoom) wins every time. And the odd one out, the Rokkor 55mm, peaks at a mere f/4.

How about the drop-off – the factor by which the extreme corners are less sharp than the rest of the frame? Again, a quick calculation and plot in R shows:

```r
> aggs$dropoff <- aggs$corner / aggs$third
> qplot(aggs$aperture, aggs$dropoff, col=aggs$lens, data=aggs, asp=.5) + geom_smooth()
```

[Plot: corner drop-off by aperture, per lens]

Some observations:

  • all the lenses show a similar pattern: the worst vignetting at the widest apertures, improving to a peak somewhere in the middle and then falling away slightly;
  • there is a drastic difference between the kit zoom at 30mm (among the best performers) and at 42mm (the worst, by far);
  • the old Pentacon 50mm had its best drop-off around f/8;
  • the new Pentacon 50mm has the least drop-off at around f/11;
  • the Minolta Rokkor 55mm peaks at f/4 again.

So, why do the old and new Pentacon 50mm lenses differ so badly? Let’s conclude by examining the shutter-speeds: by letting the camera automate the exposure in aperture-priority mode, whilst keeping the scene and its illumination constant, we can plot each lens’s transmission against aperture.

```r
> qplot(aggs$aperture, log2(aggs$speed / min(aggs$speed)), col=aggs$lens, data=aggs, asp=.5) + geom_smooth()
```

[Plot: relative shutter-speed (stops, log2) by aperture, per lens]

Here we see that the new Pentacon 50mm requires the least increase in shutter-speed per stop of aperture, while above around f/7.1 the old Pentacon 50mm requires the greatest – rocketing off at a different gradient to everything else, such that by f/11 it is fully two stops slower than its replacement.

There’s a reason for this: sadly, the anti-reflective coating is starting to come off in the centre of the front element of the old lens, in a rough circle approximately 5mm in diameter. At around f/10, the aperture iris of a 50mm lens is itself only about 5mm across (focal length divided by f-number: 50mm / 10 = 5mm), so this damaged patch becomes a significant contributor to the overall exposure.

### Conclusion

With the above data, informed choices can be made as to which lens and aperture suit particular scenes. When shooting a wide-angle landscape, I can use the kit zoom at f/8; for nature and woodland close-up work, where the close-focussing distance allows, the Minolta Rokkor 55mm at f/4 is ideal.