Usage

segmentationmetrics(
  data,
  prediction_mask,
  ground_truth_mask,
  image_id,
  segmentation_type = "binary",
  positive_class,
  dice_coefficient = TRUE,
  jaccard_index = TRUE,
  volumetric_similarity = FALSE,
  sensitivity_specificity = TRUE,
  hausdorff_distance = TRUE,
  average_hausdorff = TRUE,
  surface_distance = TRUE,
  surface_overlap = FALSE,
  boundary_tolerance = 2,
  pixel_size_provided = FALSE,
  pixel_size_x = 0.5,
  pixel_size_y = 0.5,
  class_specific_metrics = TRUE,
  macro_average = TRUE,
  weighted_average = TRUE,
  object_detection_metrics = FALSE,
  iou_threshold = 0.5,
  count_metrics = FALSE,
  confidence_intervals = TRUE,
  bootstrap_ci = FALSE,
  bootstrap_samples = 1000,
  confidence_level = 0.95,
  quality_thresholds = TRUE,
  dice_threshold_excellent = 0.9,
  dice_threshold_good = 0.8,
  dice_threshold_acceptable = 0.7,
  stratified_analysis = FALSE,
  stratify_by,
  outlier_detection = TRUE,
  outlier_method = "iqr",
  plot_metric_distribution = TRUE,
  plot_scatter_comparison = TRUE,
  plot_boundary_error = TRUE,
  plot_confusion_matrix = FALSE,
  plot_performance_by_class = FALSE,
  application_context = "general",
  show_interpretation = TRUE,
  paired_analysis = FALSE,
  comparison_method,
  missing_handling = "complete",
  random_seed = 123
)

Arguments

data

The data as a data frame.

prediction_mask

AI-predicted segmentation mask. For binary segmentation, this is a binary variable (0/1 or background/foreground). For multi-class, this contains class labels. Can be pixel-level or region-level data.

ground_truth_mask

Expert-annotated ground truth segmentation mask. Must have the same encoding scheme as prediction_mask.

image_id

Variable identifying individual images or regions. Used to aggregate metrics per image and calculate summary statistics across images.

segmentation_type

Type of segmentation task. Binary for single structure (e.g., tumor vs background), multi-class for multiple tissue types (e.g., epithelium/stroma/necrosis), instance for individual object detection (e.g., separate cell nuclei).

positive_class

For binary segmentation, specify which level represents the foreground, i.e. the structure of interest (e.g., tumor, gland, nucleus).

dice_coefficient

Calculate Dice coefficient (also known as F1-score for segmentation). Measures spatial overlap: Dice = 2|A∩B| / (|A|+|B|). Range 0-1, where 1 = perfect overlap. Most commonly used segmentation metric.
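
As a minimal sketch of the formula above (not the package's internal code), Dice can be computed from two 0/1 vectors of equal length. The helper name `dice_from_masks` and the choice to return `NA` when both masks are empty are assumptions made for illustration.

```r
# Dice = 2|A∩B| / (|A| + |B|) for two binary masks coded as 0/1 vectors.
# dice_from_masks is a hypothetical helper, not part of the package.
dice_from_masks <- function(pred, truth) {
  intersection <- sum(pred == 1 & truth == 1)
  total <- sum(pred == 1) + sum(truth == 1)
  if (total == 0) return(NA_real_)  # assumption: undefined when both masks are empty
  2 * intersection / total
}

pred  <- c(1, 1, 0, 0, 1, 0)
truth <- c(1, 0, 0, 1, 1, 0)
dice_value <- dice_from_masks(pred, truth)  # 2 * 2 / (3 + 3)
```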

jaccard_index

Calculate Jaccard Index (Intersection over Union). Measures overlap: IoU = |A∩B| / |A∪B|. Range 0-1. Related to Dice: IoU = Dice/(2-Dice). Standard metric in computer vision and AI segmentation.
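
The relationship IoU = Dice/(2-Dice) stated above can be checked numerically. The sketch below uses made-up 0/1 vectors and a hypothetical helper `iou_from_masks`; it illustrates the formula only and is not the package's implementation.

```r
# IoU = |A∩B| / |A∪B| for two binary masks coded as 0/1 vectors.
# iou_from_masks is a hypothetical helper, not part of the package.
iou_from_masks <- function(pred, truth) {
  intersection <- sum(pred == 1 & truth == 1)
  union <- sum(pred == 1 | truth == 1)
  if (union == 0) return(NA_real_)  # assumption: undefined when both masks are empty
  intersection / union
}

pred  <- c(1, 1, 0, 0, 1, 0)
truth <- c(1, 0, 0, 1, 1, 0)

iou_value  <- iou_from_masks(pred, truth)
dice_value <- 2 * sum(pred == 1 & truth == 1) / (sum(pred == 1) + sum(truth == 1))

# The two overlap metrics are monotonically related.
iou_from_dice <- dice_value / (2 - dice_value)
```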

volumetric_similarity

Calculate Volumetric Similarity coefficient for 3D segmentation or area-based similarity in 2D. Useful for volume/area preservation analysis.

sensitivity_specificity

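Calculate pixel-wise sensitivity (true positive rate) and specificity (true negative rate). Low sensitivity indicates under-segmentation (foreground pixels missed); low specificity indicates over-segmentation (background pixels labelled as foreground).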
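
A short sketch of the pixel-wise calculation, assuming masks coded as 0/1 vectors with 1 as the positive class. The variable names are illustrative only.

```r
pred  <- c(1, 1, 0, 0, 1, 0)
truth <- c(1, 0, 0, 1, 1, 0)

# Pixel-wise confusion counts, treating 1 as foreground.
tp <- sum(pred == 1 & truth == 1)
fp <- sum(pred == 1 & truth == 0)
fn <- sum(pred == 0 & truth == 1)
tn <- sum(pred == 0 & truth == 0)

sensitivity <- tp / (tp + fn)  # share of true foreground that was found
specificity <- tn / (tn + fp)  # share of true background left as background
```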

hausdorff_distance

Calculate Hausdorff Distance: the maximum distance from the predicted boundary to the ground truth boundary. Measures worst-case boundary error and is therefore sensitive to outliers. Reported in pixels, or in physical units if pixel size is provided.
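
For illustration, the sketch below computes a Hausdorff distance between two small sets of boundary points. It assumes the common symmetric form (the larger of the two directed distances) and boundaries supplied as two-column x/y matrices; whether the package uses the symmetric or the directed form is not stated here. The helper name is hypothetical.

```r
# Symmetric Hausdorff distance between two boundary point sets.
# a, b: numeric matrices with one point per row (column 1 = x, column 2 = y).
hausdorff_sketch <- function(a, b) {
  # Pairwise Euclidean distances: rows index points of a, columns points of b.
  d <- sqrt(outer(a[, 1], b[, 1], "-")^2 + outer(a[, 2], b[, 2], "-")^2)
  a_to_b <- apply(d, 1, min)  # nearest point of b for every point of a
  b_to_a <- apply(d, 2, min)  # nearest point of a for every point of b
  max(max(a_to_b), max(b_to_a))
}

pred_boundary  <- rbind(c(0, 0), c(1, 0), c(2, 0))
truth_boundary <- rbind(c(0, 1), c(1, 1), c(5, 0))

hd <- hausdorff_sketch(pred_boundary, truth_boundary)
```

In this toy example the single distant ground truth point at (5, 0) determines the result, which is the outlier sensitivity described above.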

average_hausdorff

Calculate Average Hausdorff Distance, described here as the 95th percentile of boundary distances. More robust to outliers than the maximum Hausdorff Distance, and a better representation of typical boundary error.

surface_distance

Calculate average distance between predicted and ground truth boundaries. Mean of all point-to-surface distances. Provides average boundary error.
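
The two outlier-robust boundary summaries (a 95th-percentile distance and a mean surface distance) can be sketched from the same nearest-neighbour distances. The pooling of both directions for the mean, the "maximum of the two directed percentiles" rule, and the helper name are assumptions; published definitions vary and the package may differ.

```r
# Nearest-neighbour distances in both directions between two boundary point sets.
# a, b: numeric matrices with one point per row (column 1 = x, column 2 = y).
nearest_distances <- function(a, b) {
  d <- sqrt(outer(a[, 1], b[, 1], "-")^2 + outer(a[, 2], b[, 2], "-")^2)
  list(a_to_b = apply(d, 1, min), b_to_a = apply(d, 2, min))
}

pred_boundary  <- rbind(c(0, 0), c(1, 0), c(2, 0))
truth_boundary <- rbind(c(0, 1), c(1, 1), c(5, 0))
nd <- nearest_distances(pred_boundary, truth_boundary)

# 95th-percentile distance: larger of the two directed 95th percentiles (assumption).
hd95 <- max(quantile(nd$a_to_b, 0.95, names = FALSE),
            quantile(nd$b_to_a, 0.95, names = FALSE))

# Mean surface distance: mean of all point-to-boundary distances, both directions pooled.
mean_surface_distance <- mean(c(nd$a_to_b, nd$b_to_a))
```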

surface_overlap

Calculate Surface Dice (boundary-focused Dice). Only considers points near boundaries. Emphasizes boundary accuracy over volume accuracy.

boundary_tolerance

Tolerance distance for surface/boundary metrics in pixels. Points within this distance are considered boundary points. Typical: 1-5 pixels depending on magnification.

pixel_size_provided

Whether pixel/voxel physical dimensions are available for converting pixel-based metrics to physical distances (micrometers, millimeters).

pixel_size_x

Physical size of one pixel in X dimension (micrometers). Used to convert pixel-based distances to micrometers. Typical WSI: 0.25-0.5 μm/pixel.

pixel_size_y

Physical size of one pixel in Y dimension (micrometers).
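
One way to convert pixel distances to micrometers, sketched under the assumption that coordinates are scaled per axis before distances are computed (which also handles non-square pixels). The helper name is hypothetical.

```r
# Scale x/y pixel coordinates to micrometers, one scale factor per axis.
scale_to_micrometers <- function(points_px, pixel_size_x = 0.5, pixel_size_y = 0.5) {
  sweep(points_px, 2, c(pixel_size_x, pixel_size_y), `*`)
}

points_px <- rbind(c(0, 0), c(3, 4))
points_um <- scale_to_micrometers(points_px)

distance_px <- sqrt(sum((points_px[2, ] - points_px[1, ])^2))  # distance in pixels
distance_um <- sqrt(sum((points_um[2, ] - points_um[1, ])^2))  # distance in micrometers
```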

class_specific_metrics

For multi-class segmentation, calculate metrics separately for each class (one-vs-rest approach). Shows performance per tissue type.

macro_average

Calculate macro-average (unweighted mean) of metrics across all classes. Treats all classes equally regardless of size.

weighted_average

Calculate weighted average of metrics, weighted by class prevalence/size. Emphasizes performance on larger structures.
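
The difference between macro and weighted averaging can be seen on a toy multi-class example. The sketch computes one-vs-rest Dice per class and assumes the weights are ground truth pixel counts; the package's exact weighting is not specified here.

```r
pred  <- c("epithelium", "epithelium", "stroma", "stroma", "stroma", "necrosis")
truth <- c("epithelium", "stroma",     "stroma", "stroma", "stroma", "necrosis")

classes <- sort(unique(c(pred, truth)))

# One-vs-rest Dice for each class.
dice_per_class <- sapply(classes, function(k) {
  2 * sum(pred == k & truth == k) / (sum(pred == k) + sum(truth == k))
})

# Class size in the ground truth, used as weights (assumption).
support <- sapply(classes, function(k) sum(truth == k))

macro_dice    <- mean(dice_per_class)                        # every class counts equally
weighted_dice <- weighted.mean(dice_per_class, w = support)  # large classes dominate
```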

object_detection_metrics

For instance segmentation, calculate object-level detection metrics: precision, recall, F1-score based on IoU threshold for matching objects.

iou_threshold

Minimum IoU between predicted and ground truth objects to consider them matched. Standard: 0.5 for object detection, 0.7 for strict matching.
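
To illustrate how an IoU threshold turns object overlaps into detection metrics, the sketch below matches objects greedily from the highest IoU downward. The greedy one-to-one strategy, the helper name, and the input (a precomputed IoU matrix with predicted objects in rows and ground truth objects in columns) are assumptions; the package may match objects differently.

```r
# Greedy one-to-one matching of predicted and ground truth objects.
match_objects_sketch <- function(iou, threshold = 0.5) {
  matched_pred <- logical(nrow(iou))
  matched_gt   <- logical(ncol(iou))
  # Visit candidate pairs from highest to lowest IoU.
  for (idx in order(iou, decreasing = TRUE)) {
    if (iou[idx] < threshold) break
    i <- (idx - 1) %% nrow(iou) + 1   # row (predicted object), column-major index
    j <- (idx - 1) %/% nrow(iou) + 1  # column (ground truth object)
    if (!matched_pred[i] && !matched_gt[j]) {
      matched_pred[i] <- TRUE
      matched_gt[j]   <- TRUE
    }
  }
  tp <- sum(matched_pred)
  precision <- tp / nrow(iou)
  recall    <- tp / ncol(iou)
  f1 <- if (precision + recall == 0) 0 else 2 * precision * recall / (precision + recall)
  c(precision = precision, recall = recall, f1 = f1)
}

# Three predicted objects (rows) against two ground truth objects (columns).
iou <- matrix(c(0.8, 0.1, 0.0,
                0.2, 0.6, 0.3), nrow = 3)

standard <- match_objects_sketch(iou, threshold = 0.5)
strict   <- match_objects_sketch(iou, threshold = 0.7)
```

Raising the threshold from 0.5 to 0.7 drops the 0.6 pair, so the same predictions score lower under strict matching.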

count_metrics

Calculate object counting accuracy for instance segmentation (e.g., cell count accuracy, absolute/relative counting error).

confidence_intervals

Calculate confidence intervals for metrics across images using bootstrap or normal approximation.

bootstrap_ci

Use bootstrap resampling for confidence intervals instead of normal approximation. More accurate for small sample sizes or skewed distributions.

bootstrap_samples

Number of bootstrap samples for confidence interval estimation.

confidence_level

Confidence level for interval estimation (default 0.95, i.e. 95%).
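
The two interval options can be compared on a handful of per-image Dice values. This is a sketch with made-up data: it assumes a z-based normal approximation and a percentile bootstrap of the mean, which may not match the package's exact procedure.

```r
set.seed(123)  # reproducible resampling

# Hypothetical per-image Dice values.
dice <- c(0.91, 0.88, 0.79, 0.93, 0.85, 0.90, 0.72, 0.89)

confidence_level  <- 0.95
bootstrap_samples <- 1000
alpha <- 1 - confidence_level

# Normal approximation: mean +/- z * standard error.
se <- sd(dice) / sqrt(length(dice))
normal_ci <- mean(dice) + c(-1, 1) * qnorm(1 - alpha / 2) * se

# Percentile bootstrap: resample images with replacement, recompute the mean.
boot_means <- replicate(bootstrap_samples, mean(sample(dice, replace = TRUE)))
boot_ci <- quantile(boot_means, c(alpha / 2, 1 - alpha / 2), names = FALSE)
```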

quality_thresholds

Apply clinical quality thresholds to categorize segmentation performance (excellent/good/acceptable/poor) based on Dice/IoU values.

dice_threshold_excellent

Dice coefficient threshold for "excellent" segmentation quality. Typical: ≥0.90 for clinical deployment.

dice_threshold_good

Dice coefficient threshold for "good" segmentation quality.

dice_threshold_acceptable

Dice coefficient threshold for minimally "acceptable" segmentation. Below this may require human review.
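
The three thresholds partition Dice values into four categories. A minimal sketch using `cut()`, assuming each threshold is inclusive at its lower bound (so a Dice of exactly 0.90 counts as "excellent"); the helper name is hypothetical.

```r
# Map Dice values to quality categories using the default thresholds.
categorise_dice <- function(dice, excellent = 0.9, good = 0.8, acceptable = 0.7) {
  cut(dice,
      breaks = c(-Inf, acceptable, good, excellent, Inf),
      labels = c("poor", "acceptable", "good", "excellent"),
      right = FALSE)  # intervals are [lower, upper)
}

quality <- categorise_dice(c(0.95, 0.90, 0.85, 0.70, 0.50))
```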

stratified_analysis

Perform stratified analysis by image characteristics (scanner, magnification, staining protocol, tissue type) to assess consistency.

stratify_by

Variable for stratified analysis (e.g., scanner_type, magnification, stain).

outlier_detection

Detect outlier images with unusually poor segmentation performance. Flags images requiring expert review.

outlier_method

Method for outlier detection across images.
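
For the default `"iqr"` method, a plausible sketch is the usual box-plot rule applied to per-image Dice values. The 1.5 multiplier and the decision to flag only the low side are assumptions, not confirmed package behaviour, and the data are made up.

```r
# Hypothetical per-image Dice values; image 7 is clearly worse than the rest.
dice <- c(0.91, 0.88, 0.90, 0.93, 0.85, 0.90, 0.42, 0.89)

q1 <- quantile(dice, 0.25, names = FALSE)
lower_fence <- q1 - 1.5 * IQR(dice)  # standard box-plot rule (assumption)

# Flag images whose Dice falls below the lower fence.
outlier_images <- which(dice < lower_fence)
```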

plot_metric_distribution

Plot distribution of metrics across images (histogram/violin plot). Shows variability in segmentation quality.

plot_scatter_comparison

Scatter plot showing relationship between Dice and IoU across images.

plot_boundary_error

Plot Hausdorff and surface distances to visualize boundary accuracy.

plot_confusion_matrix

For multi-class segmentation, show confusion matrix of pixel classifications.

plot_performance_by_class

Bar plot showing metrics for each class in multi-class segmentation.

application_context

Clinical/research application context for interpretation guidance.

show_interpretation

Provide interpretation of results including recommendations for clinical deployment based on performance metrics.

paired_analysis

Perform paired statistical tests comparing segmentation performance between different AI models or methods on the same images.

comparison_method

Variable identifying different segmentation methods for comparison.

missing_handling

How to handle images with missing or incomplete segmentations.

random_seed

Random seed for bootstrap sampling and other stochastic procedures.

Value

A results object containing:

results$instructions              an html
results$overallSummary            a table
results$overlapMetricsTable       a table
results$distanceMetricsTable      a table
results$multiclassMetricsTable    a table
results$instanceMetricsTable      a table
results$qualityAssessmentTable    a table
results$outlierImagesTable        a table
results$stratifiedAnalysisTable   a table
results$comparisonTable           a table
results$metricDistributionPlot    an image
results$scatterComparisonPlot     an image
results$boundaryErrorPlot         an image
results$confusionMatrixPlot       an image
results$performanceByClassPlot    an image
results$clinicalInterpretation    an html

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$overallSummary$asDF

as.data.frame(results$overallSummary)

Details

Comprehensive validation metrics for image segmentation tasks in digital pathology and AI-based tissue analysis. Evaluates overlap and boundary accuracy between AI-predicted segmentations and expert-annotated ground truth. Essential metrics include Dice Coefficient (F1-score for spatial overlap), Jaccard Index (IoU, Intersection over Union), Hausdorff Distance (maximum boundary deviation), and Surface Distance metrics.

Designed for tumor boundary delineation, gland segmentation, cell nuclei detection, tissue region classification, and any pixel-level or region-based segmentation task. Supports binary segmentation (single structure), multi-class segmentation (multiple tissue types), and instance segmentation (individual object detection). Provides statistical analysis across multiple images, stratification by image characteristics (magnification, staining, scanner), and clinical interpretation of segmentation quality.

Critical for validating AI algorithms before deployment in diagnostic workflows, comparing segmentation methods, and establishing performance benchmarks for digital pathology systems.

Examples

result <- segmentationmetrics(
  data = segmentation_results,
  prediction_mask = "ai_segmentation",
  ground_truth_mask = "expert_annotation",
  image_id = "slide_id"
)