Comprehensive validation of AI models and diagnostic tests using cross-validation, model selection, and advanced performance metrics. Designed for AI diagnostic research, including comparison of AI versus human performance with statistical significance testing.

Usage

aivalidation(
  data,
  predictorVars,
  outcomeVar,
  positiveLevel,
  referencePredictor,
  crossValidation = "10-fold",
  nRepeats = 10,
  stratified = TRUE,
  randomSeed = 42,
  modelSelection = "AIC",
  selectionDirection = "both",
  compareModels = TRUE,
  delongTest = TRUE,
  mcnemarTest = FALSE,
  calibrationTest = TRUE,
  calculateNRI = TRUE,
  calculateIDI = TRUE,
  youdensJ = TRUE,
  bootstrapCI = TRUE,
  nBootstrap = 1000,
  showModelSelection = TRUE,
  showCalibration = TRUE,
  showCrossValidation = TRUE,
  showComparison = TRUE,
  rocPlot = TRUE,
  calibrationPlot = TRUE,
  comparisonPlot = TRUE,
  cvPerformancePlot = TRUE,
  variableImportancePlot = FALSE,
  showExplanations = TRUE,
  showSummaries = TRUE,
  confidenceLevel = 0.95
)

Arguments

data

the data as a data frame

predictorVars

a vector of strings naming the predictor variables (AI scores, human scores, biomarkers, etc.) from data

outcomeVar

a string naming the binary outcome variable (gold standard) from data

positiveLevel

the level of the outcome variable that represents the positive case

referencePredictor

reference predictor for model comparisons (typically the AI model or the main biomarker)

crossValidation

cross-validation method for model validation

nRepeats

number of repetitions for repeated cross-validation

stratified

maintain outcome variable proportions across folds

randomSeed

random seed for reproducible results

modelSelection

method for automatic model selection and variable importance

selectionDirection

direction for stepwise model selection

compareModels

perform statistical comparison between models

delongTest

perform DeLong's test for comparing the AUC values of correlated ROC curves (a standalone sketch of this test follows the argument list)

mcnemarTest

perform McNemar's test for paired binary predictions

calibrationTest

perform Hosmer-Lemeshow calibration test

calculateNRI

calculate Net Reclassification Index with confidence intervals

calculateIDI

calculate Integrated Discrimination Index with confidence intervals

youdensJ

calculate Youden's J statistic for optimal cutoff determination

bootstrapCI

use bootstrap methods for confidence interval estimation

nBootstrap

number of bootstrap iterations for confidence intervals

showModelSelection

display model selection process and variable importance

showCalibration

display calibration plots and statistics

showCrossValidation

display detailed cross-validation results

showComparison

display statistical comparison between models

rocPlot

generate ROC curves with cross-validation confidence bands

calibrationPlot

generate calibration plots showing observed vs predicted probabilities

comparisonPlot

generate forest plot comparing model performance

cvPerformancePlot

generate plots showing cross-validation performance across folds

variableImportancePlot

generate variable importance plot from model selection

showExplanations

show explanations for methods and interpretations

showSummaries

show summary interpretations of results

confidenceLevel

confidence level for confidence intervals
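
The DeLong test referenced above compares the AUCs of two correlated ROC curves built from the same subjects. A minimal standalone sketch of the underlying test using the pROC package (an illustration only, not part of aivalidation; the simulated data and variable names are assumptions):

library(pROC)

# Simulated paired scores for the same 200 subjects (illustrative only)
set.seed(42)
outcome     <- rbinom(200, 1, 0.4)
ai_score    <- outcome + rnorm(200)         # stronger predictor
human_score <- 0.5 * outcome + rnorm(200)   # weaker predictor

roc_ai    <- roc(outcome, ai_score)
roc_human <- roc(outcome, human_score)

# Paired DeLong test: are the two AUCs significantly different?
roc.test(roc_ai, roc_human, method = "delong", paired = TRUE)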

Value

A results object containing:

results$todo: a html
results$cvPerformanceTable: performance metrics calculated using cross-validation
results$modelSelectionTable: results from the automatic model selection process
results$modelComparisonTable: statistical comparison between different models
results$nriIdiTable: Net Reclassification Index and Integrated Discrimination Index
results$calibrationTable: model calibration assessment, including the Hosmer-Lemeshow test
results$variableImportanceTable: importance scores for variables in the selected models
results$cvFoldResults: performance metrics for each cross-validation fold
results$rocPlot: ROC curves showing cross-validated performance with confidence bands
results$calibrationPlot: calibration plot of observed versus predicted probabilities
results$comparisonPlot: forest plot comparing model performance with confidence intervals
results$cvPerformancePlot: box plots showing the performance distribution across CV folds
results$variableImportancePlot: bar plot of variable importance from model selection
results$methodologyExplanation: a html
results$resultsInterpretation: a html
results$statisticalNotes: a html
results$recommendationsText: a html

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$cvPerformanceTable$asDF

as.data.frame(results$cvPerformanceTable)

Examples

data('medical_ai_data', package='ClinicoPath')

aivalidation(
  data = medical_ai_data,
  predictorVars = c('AI_score', 'human_score', 'biomarker1'),
  outcomeVar = 'diagnosis',
  positiveLevel = 'positive',
  crossValidation = '10-fold',
  modelSelection = 'AIC',
  compareModels = TRUE,
  delongTest = TRUE
)
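
A fuller sketch that also requests model comparison against a reference predictor and converts a results table to a data frame. The call uses only arguments documented above; treating 'AI_score' as the reference predictor is an assumption for illustration:

results <- aivalidation(
  data = medical_ai_data,
  predictorVars = c('AI_score', 'human_score', 'biomarker1'),
  outcomeVar = 'diagnosis',
  positiveLevel = 'positive',
  referencePredictor = 'AI_score',
  crossValidation = '10-fold',
  compareModels = TRUE,
  delongTest = TRUE,
  calculateNRI = TRUE,
  calculateIDI = TRUE,
  bootstrapCI = TRUE,
  nBootstrap = 1000
)

# Cross-validated performance metrics as a plain data frame
cv_df <- results$cvPerformanceTable$asDF
head(cv_df)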