Segmentation Metrics (Dice, IoU, Hausdorff)
Source: R/segmentationmetrics.h.R
segmentationmetrics.Rd
Usage
segmentationmetrics(
data,
prediction_mask,
ground_truth_mask,
image_id,
segmentation_type = "binary",
positive_class,
dice_coefficient = TRUE,
jaccard_index = TRUE,
volumetric_similarity = FALSE,
sensitivity_specificity = TRUE,
hausdorff_distance = TRUE,
average_hausdorff = TRUE,
surface_distance = TRUE,
surface_overlap = FALSE,
boundary_tolerance = 2,
pixel_size_provided = FALSE,
pixel_size_x = 0.5,
pixel_size_y = 0.5,
class_specific_metrics = TRUE,
macro_average = TRUE,
weighted_average = TRUE,
object_detection_metrics = FALSE,
iou_threshold = 0.5,
count_metrics = FALSE,
confidence_intervals = TRUE,
bootstrap_ci = FALSE,
bootstrap_samples = 1000,
confidence_level = 0.95,
quality_thresholds = TRUE,
dice_threshold_excellent = 0.9,
dice_threshold_good = 0.8,
dice_threshold_acceptable = 0.7,
stratified_analysis = FALSE,
stratify_by,
outlier_detection = TRUE,
outlier_method = "iqr",
plot_metric_distribution = TRUE,
plot_scatter_comparison = TRUE,
plot_boundary_error = TRUE,
plot_confusion_matrix = FALSE,
plot_performance_by_class = FALSE,
application_context = "general",
show_interpretation = TRUE,
paired_analysis = FALSE,
comparison_method,
missing_handling = "complete",
random_seed = 123
)
Arguments
- data
The data as a data frame.
- prediction_mask
AI-predicted segmentation mask. For binary segmentation, this is a binary variable (0/1 or background/foreground). For multi-class, this contains class labels. Can be pixel-level or region-level data.
- ground_truth_mask
Expert-annotated ground truth segmentation mask. Must have the same encoding scheme as prediction_mask.
- image_id
Variable identifying individual images or regions. Used to aggregate metrics per image and calculate summary statistics across images.
- segmentation_type
Type of segmentation task. Binary for single structure (e.g., tumor vs background), multi-class for multiple tissue types (e.g., epithelium/stroma/necrosis), instance for individual object detection (e.g., separate cell nuclei).
- positive_class
For binary segmentation, specify which level represents the foreground/structure of interest (e.g., tumor, gland, nucleus).
- dice_coefficient
Calculate Dice coefficient (also known as F1-score for segmentation). Measures spatial overlap: Dice = 2|A∩B| / (|A|+|B|). Range 0-1, where 1 = perfect overlap. Most commonly used segmentation metric.
- jaccard_index
Calculate Jaccard Index (Intersection over Union). Measures overlap: IoU = |A∩B| / |A∪B|. Range 0-1. Related to Dice: IoU = Dice/(2-Dice). Standard metric in computer vision and AI segmentation.
- volumetric_similarity
Calculate Volumetric Similarity coefficient for 3D segmentation or area-based similarity in 2D. Useful for volume/area preservation analysis.
- sensitivity_specificity
Calculate pixel-wise sensitivity (true positive rate) and specificity (true negative rate). Shows over-segmentation vs under-segmentation.
- hausdorff_distance
Calculate Hausdorff Distance - maximum distance from predicted boundary to ground truth boundary. Sensitive to outliers. Measures worst-case boundary error. Reported in pixels, or in physical units when the pixel size is provided.
- average_hausdorff
Calculate Average Hausdorff Distance (95th percentile). More robust to outliers than maximum Hausdorff. Better represents typical boundary error.
- surface_distance
Calculate average distance between predicted and ground truth boundaries. Mean of all point-to-surface distances. Provides average boundary error.
- surface_overlap
Calculate Surface Dice (boundary-focused Dice). Only considers points near boundaries. Emphasizes boundary accuracy over volume accuracy.
- boundary_tolerance
Tolerance distance for surface/boundary metrics in pixels. Points within this distance are considered boundary points. Typical: 1-5 pixels depending on magnification.
- pixel_size_provided
Whether pixel/voxel physical dimensions are available for converting pixel-based metrics to physical distances (micrometers, millimeters).
- pixel_size_x
Physical size of one pixel in X dimension (micrometers). Used to convert pixel-based distances to micrometers. Typical WSI: 0.25-0.5 μm/pixel.
- pixel_size_y
Physical size of one pixel in Y dimension (micrometers).
- class_specific_metrics
For multi-class segmentation, calculate metrics separately for each class (one-vs-rest approach). Shows performance per tissue type.
- macro_average
Calculate macro-average (unweighted mean) of metrics across all classes. Treats all classes equally regardless of size.
- weighted_average
Calculate weighted average of metrics, weighted by class prevalence/size. Emphasizes performance on larger structures.
- object_detection_metrics
For instance segmentation, calculate object-level detection metrics: precision, recall, F1-score based on IoU threshold for matching objects.
- iou_threshold
Minimum IoU between predicted and ground truth objects to consider them matched. Standard: 0.5 for object detection, 0.7 for strict matching.
- count_metrics
Calculate object counting accuracy for instance segmentation (e.g., cell count accuracy, absolute/relative counting error).
- confidence_intervals
Calculate confidence intervals for metrics across images using bootstrap or normal approximation.
- bootstrap_ci
Use bootstrap resampling for confidence intervals instead of normal approximation. More accurate for small sample sizes or skewed distributions.
- bootstrap_samples
Number of bootstrap samples for confidence interval estimation.
- confidence_level
Confidence level for interval estimation (default 95%).
- quality_thresholds
Apply clinical quality thresholds to categorize segmentation performance (excellent/good/acceptable/poor) based on Dice/IoU values.
- dice_threshold_excellent
Dice coefficient threshold for "excellent" segmentation quality. Typical: ≥0.90 for clinical deployment.
- dice_threshold_good
Dice coefficient threshold for "good" segmentation quality.
- dice_threshold_acceptable
Dice coefficient threshold for minimally "acceptable" segmentation. Below this may require human review.
- stratified_analysis
Perform stratified analysis by image characteristics (scanner, magnification, staining protocol, tissue type) to assess consistency.
- stratify_by
Variable for stratified analysis (e.g., scanner_type, magnification, stain).
- outlier_detection
Detect outlier images with unusually poor segmentation performance. Flags images requiring expert review.
- outlier_method
Method for outlier detection across images.
- plot_metric_distribution
Plot distribution of metrics across images (histogram/violin plot). Shows variability in segmentation quality.
- plot_scatter_comparison
Scatter plot showing relationship between Dice and IoU across images.
- plot_boundary_error
Plot Hausdorff and surface distances to visualize boundary accuracy.
- plot_confusion_matrix
For multi-class segmentation, show confusion matrix of pixel classifications.
- plot_performance_by_class
Bar plot showing metrics for each class in multi-class segmentation.
- application_context
Clinical/research application context for interpretation guidance.
- show_interpretation
Provide interpretation of results including recommendations for clinical deployment based on performance metrics.
- paired_analysis
Perform paired statistical tests comparing segmentation performance between different AI models or methods on the same images.
- comparison_method
Variable identifying different segmentation methods for comparison.
- missing_handling
How to handle images with missing or incomplete segmentations.
- random_seed
Random seed for bootstrap sampling and other stochastic procedures.
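The overlap formulas quoted for dice_coefficient, jaccard_index and sensitivity_specificity can be illustrated with a minimal base R sketch. The helper name overlap_metrics and the toy masks are hypothetical and are not part of this package; the package's internal implementation may differ.

```r
# Hypothetical helper (not part of the package): pixel-wise overlap metrics
# for two binary masks given as 0/1 vectors of equal length.
overlap_metrics <- function(pred, truth) {
  stopifnot(length(pred) == length(truth))
  pred  <- as.logical(pred)
  truth <- as.logical(truth)

  tp <- sum(pred & truth)    # foreground in both masks
  fp <- sum(pred & !truth)   # predicted foreground, annotated background
  fn <- sum(!pred & truth)   # predicted background, annotated foreground
  tn <- sum(!pred & !truth)  # background in both masks

  # Note: empty masks give 0/0 = NaN; a real implementation needs a rule
  # for that case.
  c(
    dice        = 2 * tp / (2 * tp + fp + fn),  # 2|A∩B| / (|A| + |B|)
    iou         = tp / (tp + fp + fn),          # |A∩B| / |A∪B|
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp)
  )
}

# Toy masks, made up for illustration
pred  <- c(1, 1, 1, 0, 0, 0, 1, 0)
truth <- c(1, 1, 0, 0, 0, 1, 1, 0)
m <- overlap_metrics(pred, truth)
print(m)  # dice = 0.75, iou = 0.6, sensitivity = 0.75, specificity = 0.75

# The identity IoU = Dice / (2 - Dice) holds for these values
all.equal(unname(m["iou"]), unname(m["dice"] / (2 - m["dice"])))
```

With 3 true positives, 1 false positive and 1 false negative, Dice is 6/8 and IoU is 3/5, which matches the conversion formula given for jaccard_index.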
Value
A results object containing:
results$instructions | an html
results$overallSummary | a table
results$overlapMetricsTable | a table
results$distanceMetricsTable | a table
results$multiclassMetricsTable | a table
results$instanceMetricsTable | a table
results$qualityAssessmentTable | a table
results$outlierImagesTable | a table
results$stratifiedAnalysisTable | a table
results$comparisonTable | a table
results$metricDistributionPlot | an image
results$scatterComparisonPlot | an image
results$boundaryErrorPlot | an image
results$confusionMatrixPlot | an image
results$performanceByClassPlot | an image
results$clinicalInterpretation | an html
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$overallSummary$asDF
as.data.frame(results$overallSummary)
Comprehensive validation metrics for image segmentation tasks in digital
pathology and AI-based tissue analysis. Evaluates overlap and boundary
accuracy between AI-predicted segmentations and expert-annotated ground
truth. Essential metrics include Dice Coefficient (F1-score for spatial
overlap), Jaccard Index (IoU - Intersection over Union), Hausdorff Distance
(maximum boundary deviation), and Surface Distance metrics. Designed for
tumor boundary delineation, gland segmentation, cell nuclei detection,
tissue region classification, and any pixel-level or region-based
segmentation task. Supports binary segmentation (single structure),
multi-class segmentation (multiple tissue types), and instance segmentation
(individual object detection). Provides statistical analysis across
multiple images, stratification by image characteristics (magnification,
staining, scanner), and clinical interpretation of segmentation quality.
Critical for validating AI algorithms before deployment in diagnostic
workflows, comparing segmentation methods, and establishing performance
benchmarks for digital pathology systems.
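To make the boundary metrics named above concrete, here is a minimal base R sketch that computes the maximum Hausdorff distance, a 95th-percentile variant and the mean surface distance between two sets of boundary points. The helper boundary_distances, its argument layout and the toy coordinates are assumptions made for illustration; definitions of the percentile and average variants differ between implementations, and this is not necessarily how the package computes them.

```r
# Hypothetical helper (not part of the package): distance metrics between
# two boundary point sets, each an n x 2 matrix of (x, y) pixel coordinates.
boundary_distances <- function(a, b, pixel_size_x = 1, pixel_size_y = 1) {
  # Convert pixel coordinates to physical units (no-op when sizes are 1)
  scale <- c(pixel_size_x, pixel_size_y)
  a <- sweep(a, 2, scale, `*`)
  b <- sweep(b, 2, scale, `*`)

  # Pairwise Euclidean distances: rows index points of a, columns points of b
  d <- sqrt(outer(a[, 1], b[, 1], `-`)^2 + outer(a[, 2], b[, 2], `-`)^2)

  a_to_b <- apply(d, 1, min)  # nearest distance from each point of a to b
  b_to_a <- apply(d, 2, min)  # nearest distance from each point of b to a

  c(
    hausdorff    = max(a_to_b, b_to_a),               # worst-case error
    hd95         = max(quantile(a_to_b, 0.95),
                       quantile(b_to_a, 0.95)),       # percentile variant
    mean_surface = mean(c(a_to_b, b_to_a))            # average error
  )
}

# Toy boundaries: the ground truth has one extra point 3 pixels away
pred_pts  <- rbind(c(0, 0), c(1, 0), c(2, 0))
truth_pts <- rbind(pred_pts, c(2, 3))

res_px <- boundary_distances(pred_pts, truth_pts)            # in pixels
res_um <- boundary_distances(pred_pts, truth_pts, 0.5, 0.5)  # 0.5 units/pixel
print(res_px)
print(res_um)
```

The single stray point sets the maximum Hausdorff distance to 3 pixels (1.5 physical units at 0.5 per pixel), while the mean surface distance stays much smaller, which is the sensitivity to outliers that the hausdorff_distance argument describes.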
Examples
result <- segmentationmetrics(
data = segmentation_results,
prediction_mask = "ai_segmentation",
ground_truth_mask = "expert_annotation",
image_id = "slide_id"
)
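The per-image summaries controlled by bootstrap_ci, confidence_level and the dice_threshold_* arguments can be sketched independently of the package. The Dice values below are made up, and the percentile bootstrap and the use of cut() are illustrative assumptions rather than the package's documented method.

```r
set.seed(123)  # mirrors the random_seed default

# Made-up per-image Dice coefficients
dice <- c(0.93, 0.88, 0.91, 0.72, 0.85, 0.65, 0.95, 0.81)

# Percentile bootstrap for the mean Dice (1000 resamples, 95% level)
boot_means <- replicate(1000, mean(sample(dice, replace = TRUE)))
ci <- quantile(boot_means, c(0.025, 0.975))
print(mean(dice))
print(ci)

# Quality categories using the default thresholds 0.7 / 0.8 / 0.9;
# right = FALSE makes each threshold inclusive on the lower side (>=)
quality <- cut(
  dice,
  breaks = c(-Inf, 0.7, 0.8, 0.9, Inf),
  labels = c("poor", "acceptable", "good", "excellent"),
  right  = FALSE
)
print(table(quality))  # poor 1, acceptable 1, good 3, excellent 3
```

Setting the seed before resampling keeps the interval reproducible, which is the role the random_seed argument plays in the function call.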