Segmentation Metrics (Dice, IoU, Hausdorff)
Source: R/segmentationmetrics.h.R
segmentationmetrics.Rd

Comprehensive validation metrics for image segmentation tasks in digital pathology and AI-based tissue analysis. Evaluates overlap and boundary accuracy between AI-predicted segmentations and expert-annotated ground truth. Essential metrics include the Dice coefficient (the F1-score for spatial overlap), the Jaccard index (IoU, intersection over union), the Hausdorff distance (maximum boundary deviation), and surface-distance metrics.

Designed for tumor boundary delineation, gland segmentation, cell nuclei detection, tissue region classification, and any pixel-level or region-based segmentation task. Supports binary segmentation (single structure), multi-class segmentation (multiple tissue types), and instance segmentation (individual object detection).

Provides statistical analysis across multiple images, stratification by image characteristics (magnification, staining, scanner), and clinical interpretation of segmentation quality. Critical for validating AI algorithms before deployment in diagnostic workflows, comparing segmentation methods, and establishing performance benchmarks for digital pathology systems.
Usage
segmentationmetrics(
data,
prediction_mask,
ground_truth_mask,
image_id,
segmentation_type = "binary",
positive_class,
dice_coefficient = TRUE,
jaccard_index = TRUE,
volumetric_similarity = FALSE,
sensitivity_specificity = TRUE,
hausdorff_distance = TRUE,
average_hausdorff = TRUE,
surface_distance = TRUE,
surface_overlap = FALSE,
boundary_tolerance = 2,
pixel_size_provided = FALSE,
pixel_size_x = 0.5,
pixel_size_y = 0.5,
class_specific_metrics = TRUE,
macro_average = TRUE,
weighted_average = TRUE,
object_detection_metrics = FALSE,
iou_threshold = 0.5,
count_metrics = FALSE,
confidence_intervals = TRUE,
bootstrap_ci = FALSE,
bootstrap_samples = 1000,
confidence_level = 0.95,
quality_thresholds = TRUE,
dice_threshold_excellent = 0.9,
dice_threshold_good = 0.8,
dice_threshold_acceptable = 0.7,
stratified_analysis = FALSE,
stratify_by,
outlier_detection = TRUE,
outlier_method = "iqr",
plot_metric_distribution = TRUE,
plot_scatter_comparison = TRUE,
plot_boundary_error = TRUE,
plot_confusion_matrix = FALSE,
plot_performance_by_class = FALSE,
application_context = "general",
show_interpretation = TRUE,
paired_analysis = FALSE,
comparison_method,
missing_handling = "complete",
random_seed = 123
)

Arguments
- data
The data as a data frame.
- prediction_mask
AI-predicted segmentation mask. For binary segmentation, this is a binary variable (0/1 or background/foreground). For multi-class, this contains class labels. Can be pixel-level or region-level data.
- ground_truth_mask
Expert-annotated ground truth segmentation mask. Must have the same encoding scheme as prediction_mask.
- image_id
Variable identifying individual images or regions. Used to aggregate metrics per image and calculate summary statistics across images.
- segmentation_type
Type of segmentation task. Binary for single structure (e.g., tumor vs background), multi-class for multiple tissue types (e.g., epithelium/stroma/necrosis), instance for individual object detection (e.g., separate cell nuclei).
- positive_class
For binary segmentation, specify which level represents the foreground structure of interest (e.g., tumor, gland, nucleus).
- dice_coefficient
Calculate Dice coefficient (also known as F1-score for segmentation). Measures spatial overlap: Dice = 2|A∩B| / (|A|+|B|). Range 0-1, where 1 = perfect overlap. Most commonly used segmentation metric.
- jaccard_index
Calculate Jaccard Index (Intersection over Union). Measures overlap: IoU = |A∩B| / |A∪B|. Range 0-1. Related to Dice: IoU = Dice/(2-Dice). Standard metric in computer vision and AI segmentation.
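Because Dice and IoU are monotonically related, they always rank segmentations in the same order. A minimal Python sketch of the two formulas on toy binary masks (illustrative only, not the package's implementation):

```python
import numpy as np

def dice(a, b):
    # Dice = 2|A ∩ B| / (|A| + |B|)
    inter = np.logical_and(a, b).sum()
    return 2 * inter / (a.sum() + b.sum())

def iou(a, b):
    # IoU = |A ∩ B| / |A ∪ B|
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union

pred  = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=bool)
truth = np.array([[1, 1, 0], [0, 0, 0], [0, 0, 0]], dtype=bool)

d, j = dice(pred, truth), iou(pred, truth)   # 0.8 and 2/3
# The stated relationship holds: IoU = Dice / (2 - Dice)
assert abs(j - d / (2 - d)) < 1e-12
```

Note that Dice weighs the intersection twice, so for any imperfect overlap Dice is always the larger of the two values.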
- volumetric_similarity
Calculate Volumetric Similarity coefficient for 3D segmentation or area-based similarity in 2D. Useful for volume/area preservation analysis.
- sensitivity_specificity
Calculate pixel-wise sensitivity (true positive rate) and specificity (true negative rate). Shows over-segmentation vs under-segmentation.
- hausdorff_distance
Calculate Hausdorff Distance - the maximum distance from the predicted boundary to the ground truth boundary. Sensitive to outliers; measures worst-case boundary error. Reported in pixels, or in micrometers if pixel size is provided.
- average_hausdorff
Calculate a robust Hausdorff variant (the 95th-percentile Hausdorff distance, HD95). Less sensitive to outliers than the maximum Hausdorff distance; better represents typical boundary error.
- surface_distance
Calculate average distance between predicted and ground truth boundaries. Mean of all point-to-surface distances. Provides average boundary error.
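The three boundary metrics above differ only in how they aggregate nearest-neighbor distances between the two boundaries. A brute-force Python sketch on toy boundary point sets (illustrative only; the point coordinates and the 0.5 µm/pixel conversion are made-up examples):

```python
import numpy as np

def directed_nn_distances(A, B):
    # For each boundary point in A, Euclidean distance to its nearest point in B.
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)

def hausdorff(A, B):
    # Maximum (worst-case) symmetric Hausdorff distance.
    return max(directed_nn_distances(A, B).max(), directed_nn_distances(B, A).max())

def hd95(A, B):
    # 95th-percentile variant: robust to a few outlier boundary points.
    d = np.concatenate([directed_nn_distances(A, B), directed_nn_distances(B, A)])
    return np.percentile(d, 95)

def mean_surface_distance(A, B):
    # Average symmetric surface distance: mean of all point-to-boundary distances.
    d = np.concatenate([directed_nn_distances(A, B), directed_nn_distances(B, A)])
    return d.mean()

pred  = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
truth = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [10.0, 1.0]])

hd  = hausdorff(pred, truth)   # dominated by the single stray point (10, 1)
msd = mean_surface_distance(pred, truth)
hd_um = hd * 0.5               # pixels -> µm at an assumed 0.5 µm/pixel
```

The one stray ground-truth point drives the maximum Hausdorff distance, while HD95 and the mean surface distance stay close to the typical 1-pixel gap, which is exactly the robustness trade-off the arguments describe.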
- surface_overlap
Calculate Surface Dice (boundary-focused Dice). Only considers points near boundaries. Emphasizes boundary accuracy over volume accuracy.
- boundary_tolerance
Tolerance distance for surface/boundary metrics in pixels. Points within this distance are considered boundary points. Typical: 1-5 pixels depending on magnification.
- pixel_size_provided
Whether pixel/voxel physical dimensions are available for converting pixel-based metrics to physical distances (micrometers, millimeters).
- pixel_size_x
Physical size of one pixel in X dimension (micrometers). Used to convert pixel-based distances to micrometers. Typical WSI: 0.25-0.5 μm/pixel.
- pixel_size_y
Physical size of one pixel in Y dimension (micrometers).
- class_specific_metrics
For multi-class segmentation, calculate metrics separately for each class (one-vs-rest approach). Shows performance per tissue type.
- macro_average
Calculate macro-average (unweighted mean) of metrics across all classes. Treats all classes equally regardless of size.
- weighted_average
Calculate weighted average of metrics, weighted by class prevalence/size. Emphasizes performance on larger structures.
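The difference between the two averaging schemes is easiest to see with hypothetical per-class Dice scores (the class names and pixel counts below are invented for illustration):

```python
import numpy as np

# Hypothetical per-class Dice scores and class sizes (pixel counts).
dice_per_class = np.array([0.95, 0.90, 0.60])      # epithelium, stroma, necrosis
class_pixels   = np.array([80_000, 15_000, 5_000])

macro    = dice_per_class.mean()                             # every class counts equally
weighted = np.average(dice_per_class, weights=class_pixels)  # large classes dominate
# The rare class (necrosis, Dice 0.60) pulls the macro average down,
# while the weighted average largely hides it behind the big epithelium class.
```

Reporting both averages is informative: a large gap between them signals that performance on a small class differs sharply from performance on the dominant classes.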
- object_detection_metrics
For instance segmentation, calculate object-level detection metrics: precision, recall, F1-score based on IoU threshold for matching objects.
- iou_threshold
Minimum IoU between predicted and ground truth objects to consider them matched. Standard: 0.5 for object detection, 0.7 for strict matching.
- count_metrics
Calculate object counting accuracy for instance segmentation (e.g., cell count accuracy, absolute/relative counting error).
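Object-level precision/recall for instance segmentation reduces to matching predicted objects to ground-truth objects at an IoU threshold. A simple greedy matcher, sketched in Python (the matching strategy and all names are illustrative, not the package's algorithm):

```python
def match_objects(iou_pairs, n_pred, n_gt, iou_threshold=0.5):
    # Greedy one-to-one matching, highest IoU first; a pair is a true
    # positive only if its IoU reaches the threshold and neither object
    # has already been matched.
    used_pred, used_gt, tp = set(), set(), 0
    for p, g, iou in sorted(iou_pairs, key=lambda t: -t[2]):
        if iou >= iou_threshold and p not in used_pred and g not in used_gt:
            used_pred.add(p); used_gt.add(g); tp += 1
    precision = tp / n_pred if n_pred else 0.0
    recall    = tp / n_gt if n_gt else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# 3 predicted nuclei, 4 annotated; overlapping pairs as (pred id, gt id, IoU):
pairs = [(0, 0, 0.80), (1, 1, 0.55), (2, 2, 0.30)]
precision, recall, f1 = match_objects(pairs, n_pred=3, n_gt=4)
# the 0.30 pair fails the 0.5 threshold: 2 TP, 1 FP, 2 FN
```

Raising iou_threshold to 0.7 would also reject the 0.55 pair, which is why strict matching reports lower detection scores on the same segmentations.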
- confidence_intervals
Calculate confidence intervals for metrics across images using bootstrap or normal approximation.
- bootstrap_ci
Use bootstrap resampling for confidence intervals instead of normal approximation. More accurate for small sample sizes or skewed distributions.
- bootstrap_samples
Number of bootstrap samples for confidence interval estimation.
- confidence_level
Confidence level for interval estimation (default 95 percent).
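A percentile bootstrap over per-image metric values can be sketched in a few lines of Python (illustrative only; the per-image Dice values below are made up, and the package's procedure may differ):

```python
import numpy as np

def bootstrap_ci_mean(values, n_boot=1000, level=0.95, seed=123):
    # Percentile-bootstrap confidence interval for the mean of a per-image metric.
    rng = np.random.default_rng(seed)
    values = np.asarray(values)
    means = np.array([rng.choice(values, size=values.size, replace=True).mean()
                      for _ in range(n_boot)])
    alpha = (1 - level) / 2
    return np.quantile(means, [alpha, 1 - alpha])

dice_per_image = [0.91, 0.88, 0.93, 0.72, 0.85, 0.90, 0.87, 0.94]
lo, hi = bootstrap_ci_mean(dice_per_image)
```

With one poorly segmented image (Dice 0.72) among eight, the bootstrap distribution is noticeably skewed, which is exactly the situation where the percentile interval is preferable to a normal approximation.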
- quality_thresholds
Apply clinical quality thresholds to categorize segmentation performance (excellent/good/acceptable/poor) based on Dice/IoU values.
- dice_threshold_excellent
Dice coefficient threshold for "excellent" segmentation quality. Typical: ≥0.90 for clinical deployment.
- dice_threshold_good
Dice coefficient threshold for "good" segmentation quality.
- dice_threshold_acceptable
Dice coefficient threshold for minimally "acceptable" segmentation. Below this may require human review.
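The three thresholds define a simple banding of per-image Dice scores; a sketch of the categorization they imply (function name hypothetical):

```python
def quality_category(dice, excellent=0.90, good=0.80, acceptable=0.70):
    # Map a per-image Dice score onto the quality bands defined by the
    # dice_threshold_* arguments above.
    if dice >= excellent:
        return "excellent"
    if dice >= good:
        return "good"
    if dice >= acceptable:
        return "acceptable"
    return "poor"
```

Images falling in the "poor" band (below dice_threshold_acceptable) are the ones the documentation flags for human review before any clinical use.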
- stratified_analysis
Perform stratified analysis by image characteristics (scanner, magnification, staining protocol, tissue type) to assess consistency.
- stratify_by
Variable for stratified analysis (e.g., scanner_type, magnification, stain).
- outlier_detection
Detect outlier images with unusually poor segmentation performance. Flags images requiring expert review.
- outlier_method
Method for flagging outlier images across the dataset (default "iqr": Tukey's interquartile-range rule).
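For segmentation quality, only the lower tail matters: an image is suspect when its metric falls well below the bulk of the distribution. A one-sided IQR rule, sketched in Python (illustrative only; the example scores are invented):

```python
import numpy as np

def flag_low_outliers(values, k=1.5):
    # Tukey IQR rule applied to the lower tail: flag images whose metric
    # falls below Q1 - k*IQR, i.e. unusually poor segmentations.
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    return values < q1 - k * (q3 - q1)

dice_scores = [0.90, 0.91, 0.89, 0.92, 0.45]
flags = flag_low_outliers(dice_scores)   # only the 0.45 image is flagged
```

Because the fence is derived from quartiles rather than the mean, a single catastrophic failure (here 0.45) does not mask itself by inflating the spread estimate the way it would with a mean/SD rule.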
- plot_metric_distribution
Plot distribution of metrics across images (histogram/violin plot). Shows variability in segmentation quality.
- plot_scatter_comparison
Scatter plot showing relationship between Dice and IoU across images.
- plot_boundary_error
Plot Hausdorff and surface distances to visualize boundary accuracy.
- plot_confusion_matrix
For multi-class segmentation, show confusion matrix of pixel classifications.
- plot_performance_by_class
Bar plot showing metrics for each class in multi-class segmentation.
- application_context
Clinical/research application context for interpretation guidance.
- show_interpretation
Provide interpretation of results including recommendations for clinical deployment based on performance metrics.
- paired_analysis
Perform paired statistical tests comparing segmentation performance between different AI models or methods on the same images.
- comparison_method
Variable identifying different segmentation methods for comparison.
- missing_handling
How to handle images with missing or incomplete segmentations.
- random_seed
Random seed for bootstrap sampling and other stochastic procedures.
Value
A results object containing:
results$instructions            a html
results$overallSummary          a table
results$overlapMetricsTable     a table
results$distanceMetricsTable    a table
results$multiclassMetricsTable  a table
results$instanceMetricsTable    a table
results$qualityAssessmentTable  a table
results$outlierImagesTable      a table
results$stratifiedAnalysisTable a table
results$comparisonTable         a table
results$metricDistributionPlot  an image
results$scatterComparisonPlot   an image
results$boundaryErrorPlot       an image
results$confusionMatrixPlot     an image
results$performanceByClassPlot  an image
results$clinicalInterpretation  a html
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$overallSummary$asDF
as.data.frame(results$overallSummary)
Examples
# \donttest{
# Hypothetical data; positive_class must match the coding of your masks.
result <- segmentationmetrics(
  data = segmentation_results,
  prediction_mask = "ai_segmentation",
  ground_truth_mask = "expert_annotation",
  image_id = "slide_id",
  segmentation_type = "binary",
  positive_class = "tumor"
)
# }