Adding a New Analysis Module

This guide provides step-by-step instructions for creating a new analysis module in Granny. Analysis modules are pluggable algorithms that process fruit images and generate quality ratings or measurements.

Overview

An analysis module in Granny:

  • Inherits from the Analysis abstract base class

  • Defines input parameters using the Value system

  • Implements three abstract methods: _preRun(), _processImage(), and _postRun()

  • Returns a list of processed Image objects

  • Automatically integrates with all Granny interfaces (CLI, GUI)

  • Leverages built-in multiprocessing for parallel image processing

  • Can be chained with other analyses using the Scheduler

The Analysis Architecture

The base Analysis class provides a performAnalysis() method that:

  1. Loads images from the input directory via ImageListValue

  2. Calls your _preRun() method for setup

  3. Processes images in parallel using multiprocessing.Pool

  4. Calls your _postRun() method for post-processing and saving results

You implement the three abstract methods to customize behavior:

  • _preRun(): Setup before processing (initialize variables, load models, etc.)

  • _processImage(image): Process a single image (runs in parallel across CPU cores)

  • _postRun(results): Post-processing after all images are done (save CSV, cleanup)
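The lifecycle above can be sketched without Granny installed. Everything in this sketch is a hypothetical stand-in: SketchAnalysis and UppercaseAnalysis are not Granny classes, plain strings stand in for Image objects, and multiprocessing.pool.ThreadPool replaces multiprocessing.Pool so the example runs anywhere without pickling concerns. The real base class may differ in detail.

```python
from abc import ABC, abstractmethod
from multiprocessing.pool import ThreadPool
from typing import List


class SketchAnalysis(ABC):
    """Hypothetical stand-in for Granny's Analysis base class."""

    def __init__(self, items: List[str]):
        self.items = items  # stands in for the loaded images

    def performAnalysis(self) -> List[str]:
        self._preRun()                           # one-time setup
        with ThreadPool(processes=2) as pool:    # stand-in for multiprocessing.Pool
            results = pool.map(self._processImage, self.items)
        return self._postRun(results)            # aggregate and save

    @abstractmethod
    def _preRun(self) -> None: ...

    @abstractmethod
    def _processImage(self, image: str) -> str: ...

    @abstractmethod
    def _postRun(self, results: List[str]) -> List[str]: ...


class UppercaseAnalysis(SketchAnalysis):
    """Toy subclass: 'processes' each item by upper-casing it."""

    def _preRun(self) -> None:
        self.suffix = "_done"

    def _processImage(self, image: str) -> str:
        # Only reads state prepared in _preRun and returns a value
        return image.upper() + self.suffix

    def _postRun(self, results: List[str]) -> List[str]:
        return sorted(results)


processed = UppercaseAnalysis(["b.png", "a.png"]).performAnalysis()
```

The subclass never touches performAnalysis(); it only fills in the three hooks, which is the same division of labor the guide asks of a real analysis module.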

The Value System

Granny uses a type-safe Value system for parameters. Each parameter is represented by a Value object that provides:

  • Type checking and validation

  • Automatic CLI argument generation

  • Default values and required/optional flags

  • Help text for users

  • Min/max constraints for numeric values

Available Value Types:

  • IntValue - Integer parameters (e.g., thresholds, kernel sizes)

  • FloatValue - Floating-point parameters (e.g., confidence scores, alpha values)

  • StringValue - String parameters (e.g., model names, labels)

  • BoolValue - Boolean flags

  • FileNameValue - File paths

  • FileDirValue - Directory paths

  • ImageListValue - Directory containing images (handles loading/saving)

  • MetaDataValue - Metadata storage for results
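To make the idea of a validated parameter concrete, here is a minimal sketch of a bounds-checked integer value. SketchIntValue is hypothetical and only borrows the method names used later in this guide (setMin, setMax, setValue, getValue). Whether Granny's own IntValue raises on out-of-range input, or validates at a different point, is an assumption.

```python
from typing import Optional


class SketchIntValue:
    """Hypothetical, simplified integer parameter with bounds checking."""

    def __init__(self, name: str, label: str, help_text: str):
        self.name = name            # machine-readable identifier
        self.label = label          # would become the CLI flag, e.g. --threshold
        self.help_text = help_text
        self.min_value: Optional[int] = None
        self.max_value: Optional[int] = None
        self.value: Optional[int] = None

    def setMin(self, minimum: int) -> None:
        self.min_value = minimum

    def setMax(self, maximum: int) -> None:
        self.max_value = maximum

    def setValue(self, value: int) -> None:
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{self.name} expects an int, got {type(value).__name__}")
        if self.min_value is not None and value < self.min_value:
            raise ValueError(f"{self.name}={value} is below the minimum {self.min_value}")
        if self.max_value is not None and value > self.max_value:
            raise ValueError(f"{self.name}={value} is above the maximum {self.max_value}")
        self.value = value

    def getValue(self) -> Optional[int]:
        return self.value


threshold = SketchIntValue("threshold", "threshold", "Detection threshold (0-255).")
threshold.setMin(0)
threshold.setMax(255)
threshold.setValue(128)
```

In this sketch the bounds must be set before the value, otherwise an out-of-range default would slip through unchecked.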

Step-by-Step Guide

Step 1: Create Your Analysis File

Create a new Python file in Granny/Analyses/ with a descriptive name:

touch Granny/Analyses/MyNewAnalysis.py

Step 2: Import Required Modules

Start your file with necessary imports:

"""
Brief description of what this analysis does.

Author: Your Name
Date: YYYY-MM-DD
"""

import os
from datetime import datetime
from typing import Dict, List, Tuple

import cv2
import numpy as np
from numpy.typing import NDArray

from Granny.Analyses.Analysis import Analysis
from Granny.Models.Images.Image import Image
from Granny.Models.Images.RGBImage import RGBImage
from Granny.Models.IO.RGBImageFile import RGBImageFile
from Granny.Models.Values.IntValue import IntValue
from Granny.Models.Values.FloatValue import FloatValue
from Granny.Models.Values.StringValue import StringValue
from Granny.Models.Values.ImageListValue import ImageListValue
from Granny.Models.Values.MetaDataValue import MetaDataValue

Step 3: Define Your Analysis Class

Create your class inheriting from Analysis:

class MyNewAnalysis(Analysis):
    """
    Detailed description of your analysis.

    This analysis processes fruit images to [describe what it does].

    Attributes:
        images (List[Image]): List of loaded images for processing
        input_images (ImageListValue): Input directory parameter
        output_images (ImageListValue): Output directory parameter
        my_threshold (IntValue): Example threshold parameter
    """

    __analysis_name__ = "myanalysis"  # Used in CLI: --analysis myanalysis

Step 4: Implement the Constructor

Initialize your analysis with parameters:

def __init__(self):
    """
    Initialize the analysis with default parameters.
    """
    super().__init__()

    self.images: List[Image] = []
    self.results_data: List[dict] = []  # For collecting CSV data

    # Required: Input images parameter
    self.input_images = ImageListValue(
        "input",                                    # Machine-readable name
        "input",                                    # CLI argument name (--input)
        "The directory where input images are located."  # Help text
    )
    self.input_images.setIsRequired(True)          # Make it required
    self.addInParam(self.input_images)             # Register as input parameter

    # Output directory for analyzed images
    self.output_images = ImageListValue(
        "output",
        "output",
        "The output directory where analyzed images are written."
    )
    result_dir = os.path.join(
        os.curdir,
        "results",
        self.__analysis_name__,
        datetime.now().strftime("%Y-%m-%d-%H-%M")
    )
    self.output_images.setValue(result_dir)        # Set default value
    self.addInParam(self.output_images)

    # Analysis parameter: threshold
    self.my_threshold = IntValue(
        "threshold",                                # Machine-readable name
        "threshold",                                # CLI argument (--threshold)
        "Threshold value for detection. Range 0-255, default 128."
    )
    self.my_threshold.setMin(0)                    # Set constraints
    self.my_threshold.setMax(255)
    self.my_threshold.setValue(128)                # Set default
    self.my_threshold.setIsRequired(False)         # Make it optional
    self.addInParam(self.my_threshold)             # Register parameter

    # Visualization parameter: mask alpha
    self.mask_alpha = FloatValue(
        "alpha",
        "mask_alpha",
        "Transparency of mask overlay (0.0-1.0, default 0.5)."
    )
    self.mask_alpha.setMin(0.0)
    self.mask_alpha.setMax(1.0)
    self.mask_alpha.setValue(0.5)
    self.mask_alpha.setIsRequired(False)
    self.addInParam(self.mask_alpha)

Important Notes:

  • The first argument (name) is the machine-readable identifier

  • The second argument (label) becomes the CLI argument: --label

  • Always call super().__init__() first to initialize the base class

  • Use addInParam() for parameters that users can set

  • Use addRetValue() for values that other analyses can use

  • Call setIsRequired(True) for mandatory parameters
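The label-to-flag mapping described in these notes can be illustrated with argparse alone. This is a sketch of the idea, not Granny's actual CLI code: PARAM_SPECS and build_parser are hypothetical names, and the tuple layout is invented for the example.

```python
import argparse

# Hypothetical parameter specs: (label, type, default, required, help text)
PARAM_SPECS = [
    ("input", str, None, True, "The directory where input images are located."),
    ("threshold", int, 128, False, "Threshold value for detection."),
    ("mask_alpha", float, 0.5, False, "Transparency of mask overlay."),
]


def build_parser(specs) -> argparse.ArgumentParser:
    """Create one --<label> flag per registered parameter."""
    parser = argparse.ArgumentParser(prog="granny-sketch")
    for label, value_type, default, required, help_text in specs:
        parser.add_argument(
            f"--{label}", dest=label, type=value_type,
            default=default, required=required, help=help_text,
        )
    return parser


# --mask_alpha is omitted here, so its default applies
args = build_parser(PARAM_SPECS).parse_args(
    ["--input", "./demo/images/", "--threshold", "150"]
)
```

Note that the flag comes from the label ("mask_alpha"), not from the machine-readable name ("alpha"), which is why the two arguments to a Value constructor can differ.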

Step 5: Implement the Three Abstract Methods

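Instead of overriding performAnalysis(), implement these three methods. The example also defines two private helpers, _analyze_image() and _save_csv(), which the abstract methods call: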

def _preRun(self):
    """
    Setup before image processing begins.

    This method is called once before any images are processed.
    Use it to:
    - Initialize result containers
    - Load models or resources
    - Print analysis parameters
    """
    self.results_data = []  # Reset results for this run

    # Get output directory
    self.output_dir = self.output_images.getValue()

    # Print analysis info
    print(f"\n{'='*60}")
    print("MY NEW ANALYSIS")
    print(f"{'='*60}")
    print(f"Input directory:  {self.input_images.getValue()}")
    print(f"Output directory: {self.output_dir}")
    print(f"Threshold: {self.my_threshold.getValue()}")
    print(f"Mask alpha: {self.mask_alpha.getValue()}")
    print(f"Processing {len(self.images)} images...")
    print(f"{'='*60}\n")

def _processImage(self, image: Image) -> Image:
    """
    Process a single image.

    This method runs in parallel across multiple CPU cores.
    Each call receives one Image instance and should return a processed Image.

    Args:
        image: The input Image instance to process

    Returns:
        Image: The processed Image with results
    """
    # Get parameter values
    threshold = self.my_threshold.getValue()
    alpha = self.mask_alpha.getValue()

    # Load the image data
    image_io = RGBImageFile()
    image_io.setFilePath(image.getFilePath())
    image.loadImage(image_io)

    # Get the numpy array (BGR format from OpenCV)
    img_array = image.getImage()

    # Perform your analysis
    result_array, metric_value = self._analyze_image(img_array, threshold, alpha)

    # Update the image with the result
    image.setImage(result_array)

    # Add metadata to the image
    metric_val = StringValue("metric", "metric", "Analysis metric value")
    metric_val.setValue(str(metric_value))
    image.addValue(metric_val)

    return image

def _postRun(self, results: List[Image]) -> List[Image]:
    """
    Post-processing after all images are processed.

    This method is called once after all images have been processed.
    Use it to:
    - Save images to disk
    - Generate CSV reports
    - Print summary statistics

    Args:
        results: List of processed Image objects from _processImage()

    Returns:
        List[Image]: The final list of result images
    """
    print(f"\nSaving {len(results)} images to: {self.output_dir}")

    # Save each image
    image_io = RGBImageFile()
    for image in results:
        image.saveImage(image_io, self.output_dir)

    # Collect data for CSV
    csv_data = []
    for image in results:
        metadata = image.getMetaData()
        csv_data.append({
            "filename": image.getImageName(),
            "metric": metadata.get("metric", StringValue("", "", "")).getValue()
        })

    # Save CSV
    self._save_csv(csv_data, self.output_dir)

    print(f"\n{'='*60}")
    print(f"Analysis complete! Processed {len(results)} images.")
    print(f"{'='*60}\n")

    return results

def _analyze_image(self, img: NDArray, threshold: int, alpha: float) -> Tuple[NDArray, float]:
    """
    Perform analysis on a single image array.

    Args:
        img: Input image as numpy array (BGR format)
        threshold: Detection threshold
        alpha: Mask transparency

    Returns:
        Tuple of (processed_image, metric_value)
    """
    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply threshold
    _, binary = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)

    # Calculate metric (e.g., percentage of pixels above threshold)
    metric = float(np.sum(binary == 255) / binary.size * 100)  # plain float, matches the annotation

    # Create visualization
    mask_color = np.zeros_like(img)
    mask_color[binary == 255] = [0, 255, 0]  # Green overlay

    # Blend with original
    result = cv2.addWeighted(img, 1.0, mask_color, alpha, 0)

    # Add text annotation
    text = f"Metric: {metric:.2f}%"
    cv2.putText(result, text, (20, 50), cv2.FONT_HERSHEY_SIMPLEX,
                1.0, (255, 255, 255), 2, cv2.LINE_AA)

    return result, metric

def _save_csv(self, data: List[dict], output_dir: str):
    """
    Save analysis results to CSV file.

    Args:
        data: List of dictionaries containing results
        output_dir: Directory to save CSV
    """
    import csv

    os.makedirs(output_dir, exist_ok=True)
    csv_path = os.path.join(output_dir, f"{self.__analysis_name__}_results.csv")

    if not data:
        return

    with open(csv_path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=data[0].keys())
        writer.writeheader()
        writer.writerows(data)

    print(f"Results saved to: {csv_path}")
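As a worked check of the metric computed in _analyze_image(), the same percentage can be reproduced in pure Python on a tiny grid. cv2.THRESH_BINARY marks a pixel only when it is strictly greater than the threshold, so a pixel equal to the threshold does not count. The function name percent_above_threshold and the sample data are made up for this illustration.

```python
from typing import List


def percent_above_threshold(gray: List[List[int]], threshold: int) -> float:
    """Percentage of pixels strictly greater than the threshold."""
    total = sum(len(row) for row in gray)
    above = sum(1 for row in gray for pixel in row if pixel > threshold)
    return above / total * 100


# 2 x 4 grayscale "image": 129, 200 and 255 exceed 128; the pixel equal to 128 does not
tiny_gray = [
    [0, 100, 128, 129],
    [200, 255, 50, 10],
]
metric = percent_above_threshold(tiny_gray, 128)  # 3 of 8 pixels -> 37.5
```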

Step 6: Update the CLI Interface

Edit Granny/Interfaces/UI/GrannyCLI.py to add your analysis to the choices list:

iface_grp.add_argument(
    "--analysis",
    dest="analysis",
    type=str,
    required=True,
    choices=["segmentation", "blush", "color", "scald", "starch", "myanalysis"],  # Add here
    help="Indicates the analysis to run.",
)

Also import your analysis class at the top of the file:

from Granny.Analyses.MyNewAnalysis import MyNewAnalysis
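The effect of the choices list can be seen in isolation with a small argparse sketch. The ANALYSES registry and pick_analysis helper below are hypothetical and only illustrate why a name missing from choices is rejected before any analysis code runs; GrannyCLI's actual dispatch logic may look different.

```python
import argparse

# Hypothetical registry; in Granny these would be the imported analysis classes
ANALYSES = {
    "segmentation": "Segmentation",
    "myanalysis": "MyNewAnalysis",
}

parser = argparse.ArgumentParser(prog="granny-sketch")
parser.add_argument(
    "--analysis", dest="analysis", type=str, required=True,
    choices=sorted(ANALYSES), help="Indicates the analysis to run.",
)


def pick_analysis(argv):
    """Return the registered entry, or None if argparse rejects the name."""
    try:
        return ANALYSES[parser.parse_args(argv).analysis]
    except SystemExit:  # argparse exits on a value outside `choices`
        return None
```

Here pick_analysis(["--analysis", "myanalysis"]) finds the entry, while a misspelled name makes argparse print a usage error and exit, which the helper turns into None.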

Step 7: Create Tests

Create a test file tests/test_Analyses/test_MyNewAnalysis.py:

from Granny.Analyses.MyNewAnalysis import MyNewAnalysis

def test_init():
    """Test that the analysis initializes correctly."""
    analysis = MyNewAnalysis()
    assert analysis.__analysis_name__ == "myanalysis"
    assert analysis.my_threshold.getValue() == 128

def test_parameters():
    """Test parameter constraints."""
    analysis = MyNewAnalysis()

    # Test threshold bounds
    assert analysis.my_threshold.getMin() == 0
    assert analysis.my_threshold.getMax() == 255

    # Test alpha bounds
    assert analysis.mask_alpha.getMin() == 0.0
    assert analysis.mask_alpha.getMax() == 1.0

Step 8: Test Your Analysis

Run your analysis from the command line:

granny -i cli --analysis myanalysis --input ./demo/images/ --threshold 150 --mask_alpha 0.7

Run the test suite:

python -m pytest tests/test_Analyses/test_MyNewAnalysis.py -v

Best Practices

Multiprocessing Considerations

Since _processImage() runs in parallel:

  • Don’t modify shared state (use return values instead)

  • Each image should be processed independently

  • Heavy initialization belongs in _preRun()

  • Aggregation and saving belong in _postRun()

Parameter Naming

  • Use clear, descriptive names for parameters

  • Distinguish between analysis parameters (affect results) and visualization parameters (affect output only)

  • Use consistent naming across similar parameters (e.g., blur_kernel, morph_kernel)

Documentation

  • Provide comprehensive docstrings for the class and all methods

  • Document parameter ranges, defaults, and units

  • Include examples in the module docstring

  • Explain the scientific/algorithmic basis of your analysis

Error Handling

  • Validate input parameters in _preRun()

  • Handle cases where no images are found

  • Provide clear error messages to users

  • Gracefully handle edge cases (empty images, invalid formats)
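One way to handle the "no images found" case is a small check that fails early with a specific message. The sketch below is independent of Granny: find_images is a hypothetical helper, and the extension list is an assumption rather than the set Granny actually accepts.

```python
import os
import tempfile

# Assumed set of supported extensions, for illustration only
SUPPORTED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".tiff")


def find_images(input_dir: str) -> list:
    """Return sorted image file names, raising a clear error if none exist."""
    if not os.path.isdir(input_dir):
        raise FileNotFoundError(f"Input directory does not exist: {input_dir}")
    images = sorted(
        name for name in os.listdir(input_dir)
        if name.lower().endswith(SUPPORTED_EXTENSIONS)
    )
    if not images:
        raise ValueError(f"No supported images found in: {input_dir}")
    return images


# Demonstration on a throwaway directory with one image and one unrelated file
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "apple.PNG"), "w").close()
    open(os.path.join(tmp, "notes.txt"), "w").close()
    found = find_images(tmp)
```

A check like this could be called from _preRun(), so a bad path stops the run before any worker processes start.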

Result Storage

  • Always save both visual results (images) and numerical results (CSV)

  • Include metadata in output images (timestamps, parameters, analysis ID)

  • Use consistent directory structures for results

  • Generate timestamped result directories to avoid overwriting
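The timestamped layout from Step 4 can be factored into a small helper, which also makes it testable with a fixed clock. The function name result_dir_for is hypothetical; the path components and the strftime pattern are the ones used in the constructor example.

```python
import os
from datetime import datetime


def result_dir_for(analysis_name: str, now: datetime, base: str = "results") -> str:
    """Build <curdir>/results/<analysis>/<YYYY-MM-DD-HH-MM>."""
    return os.path.join(os.curdir, base, analysis_name, now.strftime("%Y-%m-%d-%H-%M"))


path = result_dir_for("myanalysis", datetime(2024, 3, 5, 14, 7))
```

Because the pattern stops at minutes, two runs started within the same minute would share a directory; adding "%S" to the pattern is one option if that matters for your workflow.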

Advanced Topics

Chaining Analyses

Your analysis can depend on outputs from other analyses using the compatibility system:

def __init__(self):
    super().__init__()

    # Define compatibility with segmentation analysis
    self.compatibility = {
        "segmentation": {
            "input": "output"  # Map our input to segmentation's output
        }
    }

This allows the Scheduler to automatically chain analyses together.
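How a scheduler might consume such a mapping can be sketched as a dictionary lookup. This is an assumption about the mechanism, not a description of Granny's Scheduler: resolve_inputs and the sample output path are invented for illustration.

```python
from typing import Dict


def resolve_inputs(
    compatibility: Dict[str, Dict[str, str]],
    upstream_name: str,
    upstream_outputs: Dict[str, str],
) -> Dict[str, str]:
    """Map an upstream analysis's outputs onto this analysis's inputs."""
    mapping = compatibility.get(upstream_name, {})
    return {
        our_param: upstream_outputs[their_param]
        for our_param, their_param in mapping.items()
        if their_param in upstream_outputs
    }


compatibility = {"segmentation": {"input": "output"}}
segmentation_outputs = {"output": "./results/segmentation/2024-03-05-14-07"}
resolved = resolve_inputs(compatibility, "segmentation", segmentation_outputs)
```

In this sketch an upstream analysis that is not listed in the compatibility dictionary simply yields no inputs, so the downstream analysis would fall back to whatever the user supplied.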

Return Values for Other Analyses

If your analysis produces values that other analyses might use:

# In __init__
self.fruit_coordinates = MetaDataValue(
    "coordinates",
    "coordinates",
    "Bounding box coordinates of detected fruit"
)
self.addRetValue(self.fruit_coordinates)

# In _postRun(); boxes is a list of (x1, y1, x2, y2) tuples, one per detected fruit
self.fruit_coordinates.setValue(boxes)

CPU Core Configuration

The base Analysis class automatically handles CPU core allocation:

  • Default (0): Uses 80% of available cores

  • Users can override this with the --cpu N CLI argument

  • Call self.cpu.setValue(4) in __init__ to change the default
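The allocation rule can be written down as a short function. This is a sketch under stated assumptions: resolve_worker_count is a hypothetical name, and both the rounding direction and the clamping of explicit requests are guesses, since the guide only specifies "80% of available cores" for the default.

```python
def resolve_worker_count(requested: int, available: int) -> int:
    """Return the number of worker processes to start."""
    if requested > 0:
        return min(requested, available)   # assumption: never exceed what exists
    return max(1, int(available * 0.8))    # assumption: round down, keep at least one


default_workers = resolve_worker_count(0, 10)   # 80% of 10 cores
explicit_workers = resolve_worker_count(4, 10)  # user asked for 4
```

In practice the available count would come from os.cpu_count(), which can return None on some platforms and would need a fallback.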

Working with Image Metadata

Add metadata to images using Value objects:

from Granny.Models.Values.StringValue import StringValue
from Granny.Models.Values.FloatValue import FloatValue

# In _processImage()
score_val = FloatValue("score", "score", "Analysis score")
score_val.setValue(95.5)
image.addValue(score_val)

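# In _postRun(), retrieve metadata
for image in results:
    metadata = image.getMetaData()
    score_val = metadata.get("score")
    # Guard against images that never received a "score" value
    score = score_val.getValue() if score_val is not None else None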

Troubleshooting

Analysis not appearing in CLI:

  • Verify __analysis_name__ is set correctly

  • Check that you added the analysis to GrannyCLI’s choices list

Parameters not showing in help:

  • Verify you called addInParam() for each parameter

  • Check that the parameter’s label doesn’t conflict with existing parameters

  • Ensure you configure each Value (default, bounds, required flag) before passing it to addInParam()

Images not loading:

  • Verify the input directory path is correct

  • Check that images are in a supported format (JPG, PNG, JPEG, TIFF)

  • Ensure RGBImageFile is used correctly with setFilePath()

Multiprocessing errors:

  • Ensure _processImage() doesn’t access shared mutable state

  • Check that all objects passed between processes are picklable

  • Move file I/O to _preRun() or _postRun() if needed
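A quick way to test the picklability point is to try pickling the object yourself before handing it to a worker. The helper name is_picklable is hypothetical; pickle and threading are standard library modules.

```python
import pickle
import threading


def is_picklable(obj: object) -> bool:
    """Return True if obj survives pickling, as worker arguments and results must."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


plain_result = {"filename": "apple.png", "metric": 37.5}
lock = threading.Lock()  # locks, like open file handles, cannot be pickled
```

Plain dictionaries, numbers and strings pass this check, while locks and lambdas do not, so attributes of that kind are worth creating inside the method that uses them rather than storing them on the analysis object.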

Next Steps

  • Review the Complete Example Analysis page for a full working example

  • Study existing analyses in Granny/Analyses/ for patterns

  • Refer to API Reference for detailed API documentation

  • Consider contributing your analysis back to the project