Comprehensive validation of AI models and diagnostic tests using cross-validation, model selection, and advanced performance metrics. Designed for AI diagnostic research, including comparison of AI versus human performance with statistical significance testing.
Usage
aivalidation(
  data,
  predictorVars,
  outcomeVar,
  positiveLevel,
  referencePredictor,
  crossValidation = "10-fold",
  nRepeats = 10,
  stratified = TRUE,
  randomSeed = 42,
  modelSelection = "AIC",
  selectionDirection = "both",
  compareModels = TRUE,
  delongTest = TRUE,
  mcnemarTest = FALSE,
  calibrationTest = TRUE,
  calculateNRI = TRUE,
  calculateIDI = TRUE,
  youdensJ = TRUE,
  bootstrapCI = TRUE,
  nBootstrap = 1000,
  showModelSelection = TRUE,
  showCalibration = TRUE,
  showCrossValidation = TRUE,
  showComparison = TRUE,
  rocPlot = TRUE,
  calibrationPlot = TRUE,
  comparisonPlot = TRUE,
  cvPerformancePlot = TRUE,
  variableImportancePlot = FALSE,
  showExplanations = TRUE,
  showSummaries = TRUE,
  confidenceLevel = 0.95
)
Arguments
- data: the data as a data frame
- predictorVars: a vector of strings naming the predictor variables (AI scores, human scores, biomarkers, etc.) from data
- outcomeVar: a string naming the binary outcome variable (gold standard) from data
- positiveLevel: the level of the outcome variable that represents the positive case
- referencePredictor: the reference predictor for model comparisons (typically the AI model or main biomarker)
- crossValidation: cross-validation method for model validation
- nRepeats: number of repetitions for repeated cross-validation
- stratified: maintain outcome variable proportions across folds
- randomSeed: random seed for reproducible results
- modelSelection: method for automatic model selection and variable importance
- selectionDirection: direction for stepwise model selection
- compareModels: perform statistical comparison between models
- delongTest: perform the DeLong test for comparing AUC values (see the sketch after this list)
- mcnemarTest: perform McNemar's test for paired binary predictions
- calibrationTest: perform the Hosmer-Lemeshow calibration test
- calculateNRI: calculate the Net Reclassification Index with confidence intervals
- calculateIDI: calculate the Integrated Discrimination Index with confidence intervals
- youdensJ: calculate Youden's J statistic (sensitivity + specificity - 1) for optimal cutoff determination (see the sketch after this list)
- bootstrapCI: use bootstrap methods for confidence interval estimation
- nBootstrap: number of bootstrap iterations for confidence intervals
- showModelSelection: display the model selection process and variable importance
- showCalibration: display calibration plots and statistics
- showCrossValidation: display detailed cross-validation results
- showComparison: display statistical comparison between models
- rocPlot: generate ROC curves with cross-validation confidence bands
- calibrationPlot: generate calibration plots showing observed vs. predicted probabilities
- comparisonPlot: generate a forest plot comparing model performance
- cvPerformancePlot: generate plots showing cross-validation performance across folds
- variableImportancePlot: generate a variable importance plot from model selection
- showExplanations: show explanations of methods and interpretations
- showSummaries: show summary interpretations of results
- confidenceLevel: confidence level for confidence intervals
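The quantities behind delongTest and youdensJ can be reproduced independently of this function. Below is a minimal sketch, assuming the pROC package; the data (ai_score, human_score, outcome) are simulated, hypothetical stand-ins for real predictors, not part of this package:

library(pROC)

# Simulated, hypothetical data: an AI score and a noisier human score
set.seed(42)
n <- 200
outcome     <- factor(rbinom(n, 1, 0.4), labels = c("Benign", "Malignant"))
ai_score    <- rnorm(n, mean = as.numeric(outcome))
human_score <- rnorm(n, mean = as.numeric(outcome), sd = 1.3)

roc_ai    <- roc(outcome, ai_score,    levels = c("Benign", "Malignant"), direction = "<")
roc_human <- roc(outcome, human_score, levels = c("Benign", "Malignant"), direction = "<")

# DeLong test: paired comparison of two correlated AUCs
roc.test(roc_ai, roc_human, method = "delong", paired = TRUE)

# Youden's J: cutoff maximising sensitivity + specificity - 1
coords(roc_ai, "best", best.method = "youden",
       ret = c("threshold", "sensitivity", "specificity"))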
Value
A results object containing:
| Component | Description |
|---|---|
| results$todo | HTML output |
| results$cvPerformanceTable | Performance metrics calculated using cross-validation |
| results$modelSelectionTable | Results from the automatic model selection process |
| results$modelComparisonTable | Statistical comparison between different models |
| results$nriIdiTable | Net Reclassification Index and Integrated Discrimination Index |
| results$calibrationTable | Model calibration assessment, including the Hosmer-Lemeshow test |
| results$variableImportanceTable | Importance scores for variables in selected models |
| results$cvFoldResults | Performance metrics for each cross-validation fold |
| results$rocPlot | ROC curves showing cross-validated performance with confidence bands |
| results$calibrationPlot | Calibration plot showing observed vs. predicted probabilities |
| results$comparisonPlot | Forest plot comparing model performance with confidence intervals |
| results$cvPerformancePlot | Box plots showing performance distribution across CV folds |
| results$variableImportancePlot | Bar plot showing variable importance from model selection |
| results$methodologyExplanation | HTML explanation of the methodology |
| results$resultsInterpretation | HTML interpretation of the results |
| results$statisticalNotes | HTML statistical notes |
| results$recommendationsText | HTML recommendations |
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$cvPerformanceTable$asDF
as.data.frame(results$cvPerformanceTable)
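As an end-to-end illustration, the call below is a sketch rather than a verified example: my_data and its columns (ai_probability, pathologist_score, diagnosis) are hypothetical, and any argument not shown keeps the default listed under Usage.

# Hypothetical example: validate an AI probability score against a
# human rater score for predicting a binary diagnosis
results <- aivalidation(
  data               = my_data,
  predictorVars      = c("ai_probability", "pathologist_score"),
  outcomeVar         = "diagnosis",
  positiveLevel      = "Malignant",
  referencePredictor = "ai_probability",
  crossValidation    = "10-fold",
  randomSeed         = 42
)

# Extract cross-validated performance metrics as a data frame
cv_perf <- results$cvPerformanceTable$asDF
head(cv_perf)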