Simplified AI model validation tool for comparing diagnostic performance. Calculates AUC, sensitivity, and specificity for each predictor variable and performs statistical comparison using the DeLong test.

Usage

aivalidation(
  data,
  predictorVars,
  outcomeVar = NULL,
  positiveLevel,
  compareModels = FALSE,
  youdensJ = FALSE,
  matthewsCC = FALSE,
  bootstrapCI = FALSE,
  nBootstrap = 1000,
  rocPlot = FALSE,
  crossValidation = "none",
  stratified = TRUE,
  randomSeed = 42,
  showExplanations = FALSE,
  showSummaries = FALSE
)

Arguments

data

the data as a data frame

predictorVars

a vector of strings naming the predictor variables (AI scores, human scores, biomarkers, etc.) from data. Only the first 5 are used for pairwise comparisons.

outcomeVar

a string naming the binary outcome variable (gold standard) from data

positiveLevel

the level of the outcome variable which represents the positive case

compareModels

perform statistical comparison between models using DeLong test for AUC comparison

youdensJ

calculate and display Youden's J statistic (Sensitivity + Specificity - 1)

matthewsCC

calculate and display Matthews Correlation Coefficient (MCC)

bootstrapCI

use bootstrap resampling for confidence intervals (more robust for small samples)

nBootstrap

number of bootstrap iterations (higher values are more accurate but slower)

rocPlot

generate ROC curves for all predictor variables

crossValidation

cross-validation method for model validation (simplified to avoid resource limits)

stratified

maintain outcome variable proportions across folds

randomSeed

random seed for reproducible cross-validation results

showExplanations

show detailed methodology explanations

showSummaries

show interpretation summaries of results
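
For intuition, the two optional metrics above can be computed by hand from a 2x2 confusion matrix. The sketch below is illustrative base R with made-up counts, not aivalidation's internal code:

```r
# Illustrative counts for a 2x2 confusion matrix (made up for this sketch)
tp <- 80; fn <- 15   # positive cases: correctly / incorrectly classified
tn <- 62; fp <- 38   # negative cases: correctly / incorrectly classified

sens <- tp / (tp + fn)        # sensitivity (true positive rate)
spec <- tn / (tn + fp)        # specificity (true negative rate)

youdens_j <- sens + spec - 1  # Youden's J: 0 = chance, 1 = perfect

# Matthews Correlation Coefficient; cast to double so the product in the
# denominator cannot overflow with large integer counts
mcc <- (tp * tn - fp * fn) /
  sqrt(as.numeric(tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

round(c(youdens_j = youdens_j, mcc = mcc), 3)
```

Unlike accuracy, both metrics stay informative when the outcome classes are imbalanced, which is why they are offered alongside sensitivity and specificity.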

Value

A results object containing:

results$instructions                an html
results$performanceTable            Performance metrics for each predictor variable
results$comparisonTable             Statistical comparison between predictor models using DeLong test
results$cvPerformanceTable          Cross-validated performance metrics for each predictor
results$rocPlot                     ROC curves for all predictor models
results$methodologyExplanation      an html
results$resultsInterpretation       an html

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$performanceTable$asDF

as.data.frame(results$performanceTable)
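
The bootstrapCI and nBootstrap arguments resample the data to estimate confidence intervals. A minimal percentile-bootstrap sketch for an AUC 95% CI, using simulated stand-in data and a base-R Mann-Whitney AUC (not the function's internal implementation):

```r
set.seed(42)
outcome <- rep(c(1, 0), each = 60)            # 1 = positive case
score   <- c(rnorm(60, mean = 1), rnorm(60))  # higher scores in positives

# AUC via the Mann-Whitney identity: U statistic scaled by n1 * n0
auc_mw <- function(y, s) {
  r  <- rank(s)
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

n_boot <- 1000                                 # cf. the nBootstrap argument
aucs <- replicate(n_boot, {
  idx <- sample(seq_along(score), replace = TRUE)
  auc_mw(outcome[idx], score[idx])
})
quantile(aucs, c(0.025, 0.975))                # percentile 95% CI
```

Because each bootstrap AUC comes from a resample of the same size as the data, the spread of `aucs` reflects sampling variability directly, which is why this approach is more robust for small samples than a normal-approximation interval.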

Examples

# \donttest{
data('medical_ai_data', package='ClinicoPath')

aivalidation(data = medical_ai_data,
            predictorVars = c('AI_score', 'human_score', 'biomarker1'),
            outcomeVar = 'diagnosis',
            positiveLevel = 'positive',
            compareModels = TRUE)
#> 
#>  AI MODEL VALIDATION
#> 
#>  Model Performance Metrics                                                                                               
#>  ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
#>    Predictor      AUC          AUC 95% CI Lower    AUC 95% CI Upper    Sensitivity    Specificity    Optimal Threshold   
#>  ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
#>    AI_score       0.6937368           0.6194502           0.7680235      0.8315789      0.5100000           0.38150000   
#>    human_score    0.6360526           0.5578771           0.7142281      0.4736842      0.7700000           0.58800000   
#>    biomarker1     0.6779474           0.6026435           0.7532512      0.7052632      0.6200000          -0.07500000   
#>  ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
#> 
#> 
#>  Model Comparison (DeLong Test)                                                              
#>  ─────────────────────────────────────────────────────────────────────────────────────────── 
#>    Comparison                   AUC (Model 1)    AUC (Model 2)    Difference     p-value     
#>  ─────────────────────────────────────────────────────────────────────────────────────────── 
#>    AI_score vs human_score          0.6937368        0.6360526     0.05768421    0.0063090   
#>    AI_score vs biomarker1           0.6937368        0.6779474     0.01578947    0.7837605   
#>    human_score vs biomarker1        0.6360526        0.6779474    -0.04189474    0.4655825   
#>  ─────────────────────────────────────────────────────────────────────────────────────────── 
#> 
#> 
#>  Cross-Validation Performance                                                                                     
#>  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
#>    Predictor    Mean AUC     SD AUC    Mean Sensitivity    SD Sensitivity    Mean Specificity    SD Specificity   
#>  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
#>  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
#> 
# }
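
compareModels = TRUE reports a DeLong test per pair of predictors, as in the comparison table above. A comparable standalone check can be run with the pROC package (assuming pROC is installed; the simulated data below stand in for real paired scores):

```r
library(pROC)

set.seed(42)
outcome     <- factor(rep(c("positive", "negative"), each = 100))
ai_score    <- c(rnorm(100, mean = 1.0), rnorm(100))  # stronger predictor
human_score <- c(rnorm(100, mean = 0.5), rnorm(100))  # weaker predictor

roc_ai    <- roc(outcome, ai_score,
                 levels = c("negative", "positive"), direction = "<")
roc_human <- roc(outcome, human_score,
                 levels = c("negative", "positive"), direction = "<")

# DeLong test for two correlated ROC curves; the test is paired because
# both scores were measured on the same cases
roc.test(roc_ai, roc_human, method = "delong")
```

The pairing matters: both scores are observed on the same cases, so their AUC estimates are correlated, and the DeLong variance accounts for that correlation rather than treating the two curves as independent.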