
Precision-Recall (PR) analysis is better suited than ROC analysis to highly imbalanced datasets, which are common in digital pathology and AI applications. ROC curves can be misleading when the negative class vastly outnumbers the positive class, whereas PR curves focus on performance in the minority (positive) class. The area under the PR curve (PR-AUC, also called Average Precision) provides a single-number summary of model performance.

PR analysis is essential for evaluating rare event detection (e.g., mitotic figure identification, where positives are rare), quality control in digital pathology where defects are uncommon, and AI triage systems where abnormal cases are infrequent. This module provides comprehensive PR analysis with optimal threshold selection, comparison to a baseline (random classifier), and F-score optimization for different precision-recall trade-offs.
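A quick arithmetic sketch of that pitfall (hypothetical numbers, base R): a classifier with 90% sensitivity and 90% specificity looks strong by ROC standards, yet at 1% prevalence most of its positive calls are wrong.

```r
n <- 100000                     # hypothetical screening cohort
prev <- 0.01                    # 1% prevalence of the positive class
tp <- n * prev * 0.90           # true positives at 90% sensitivity
fp <- n * (1 - prev) * 0.10     # false positives at 90% specificity
precision <- tp / (tp + fp)
precision                       # ~0.083: fewer than 1 in 10 flags is real
```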

Usage

prauc(
  data,
  outcome,
  predictor,
  positive_level = "",
  prevalence = -1,
  calculate_auc = TRUE,
  calculate_fscore = TRUE,
  beta_weights = "1, 2, 0.5",
  confidence_intervals = TRUE,
  ci_method = "bootstrap",
  bootstrap_samples = 1000,
  confidence_level = 0.95,
  compare_to_roc = TRUE,
  baseline_comparison = TRUE,
  interpolation_method = "step",
  plot_pr_curve = TRUE,
  plot_comparison = FALSE,
  plot_fscore = FALSE,
  min_threshold = 0,
  max_threshold = 1,
  random_seed = 12345
)

Arguments

data

The data as a data frame.

outcome

Binary outcome variable (e.g., cancer/non-cancer, positive/negative). Must have exactly two levels.

predictor

Continuous predictor variable (e.g., AI score, biomarker level, probability). Higher values should indicate higher likelihood of positive outcome.

positive_level

Level of outcome variable to treat as "positive" class. If not specified, the second level of the factor is used. Critical for correct PR analysis.

prevalence

Observed prevalence of positive class in dataset. Automatically calculated if not specified (-1). Used for baseline comparison and interpretation.

calculate_auc

Calculate area under precision-recall curve (also called Average Precision). Uses trapezoidal integration. Essential summary metric for imbalanced data.
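As a sketch of what trapezoidal integration over the PR curve means (illustrative base R, not the module's internal code), PR-AUC can be computed from matching (recall, precision) pairs:

```r
# Trapezoidal area under a PR curve given matching recall/precision vectors
pr_auc_trapezoid <- function(recall, precision) {
  o <- order(recall)                       # integrate in order of recall
  r <- recall[o]; p <- precision[o]
  sum(diff(r) * (head(p, -1) + tail(p, -1)) / 2)
}
pr_auc_trapezoid(c(0, 0.5, 1), c(1, 0.8, 0.5))  # 0.775
```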

calculate_fscore

Find optimal threshold that maximizes F1-score (harmonic mean of precision and recall). Also calculates F2-score (emphasizes recall) and F0.5-score (emphasizes precision).

beta_weights

Comma-separated list of beta values for F-score calculation. F1 (beta=1) balances precision and recall. F2 (beta=2) weights recall higher. F0.5 (beta=0.5) weights precision higher. Example: "1, 2, 0.5, 3"

confidence_intervals

Calculate confidence intervals for PR-AUC using bootstrap resampling. Provides uncertainty quantification for imbalanced datasets.

ci_method

Method for confidence interval calculation. Bootstrap percentile is standard. BCa (bias-corrected and accelerated) adjusts for skewness.

bootstrap_samples

Number of bootstrap resamples for confidence interval calculation.

confidence_level

Confidence level for interval estimation.

compare_to_roc

Calculate ROC-AUC for comparison. Demonstrates advantage of PR analysis for imbalanced data where ROC-AUC may be misleadingly high.

baseline_comparison

Compare PR-AUC to baseline (random classifier). Baseline PR-AUC equals the prevalence. Shows improvement over chance performance.
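That baseline can be checked empirically with simulated data (a sketch, not module code): an uninformative scorer's average precision lands near the prevalence, while its ROC-AUC stays near 0.5.

```r
set.seed(42)
y <- rbinom(5000, 1, 0.05)                 # ~5% prevalence
s <- runif(5000)                           # uninformative scores
yo <- y[order(s, decreasing = TRUE)]       # outcomes ranked by score
ap <- mean(cumsum(yo)[yo == 1] / which(yo == 1))  # average precision
n1 <- sum(y); n0 <- sum(1 - y)
auc <- (sum(rank(s)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)  # ROC-AUC
round(c(ap = ap, roc_auc = auc), 3)        # ap near 0.05, auc near 0.5
```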

interpolation_method

Method for interpolating PR curve between points. Step function is standard for PR curves. Linear interpolation can smooth the curve.

plot_pr_curve

Plot precision-recall curve with optimal F-score threshold marked. Shows trade-off between precision and recall across all thresholds.

plot_comparison

Side-by-side comparison of ROC and PR curves to demonstrate differences in imbalanced dataset evaluation.

plot_fscore

Plot F-scores (F1, F2, F0.5) across all thresholds to visualize optimal operating points for different precision-recall trade-offs.

min_threshold

Minimum predictor threshold to evaluate. Use to focus on clinically relevant range.

max_threshold

Maximum predictor threshold to evaluate.

random_seed

Random seed for bootstrap procedures to ensure reproducibility.

Value

A results object containing:

results$instructions: a html object
results$prSummary: Precision-Recall area under the curve with baseline comparison
results$optimalThresholds: Optimal operating points for different F-score metrics
results$prCurveData: Precision and recall values across all thresholds
results$performanceAtKey: Precision and recall at clinically relevant thresholds
results$prCurvePlot: Precision-recall curve with optimal F1 threshold marked
results$comparisonPlot: Side-by-side comparison of ROC and PR curves
results$fscorePlot: F-scores across thresholds for different beta values
results$interpretation: a html object

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$prSummary$asDF

as.data.frame(results$prSummary)

Examples

# `pathology_data` is not bundled with the package; simulate a stand-in
# with the same variable names so the example runs:
set.seed(123)
cancer <- factor(sample(c("negative", "positive"), 200,
                        replace = TRUE, prob = c(0.9, 0.1)))
ai_score <- runif(200) + 0.5 * (cancer == "positive")  # higher scores for positives
pathology_data <- data.frame(cancer, ai_score)

result <- prauc(
    data = pathology_data,
    outcome = "cancer",
    predictor = "ai_score",
    positive_level = "positive"
)