A simplified AI model validation tool for comparing diagnostic performance. It calculates AUC, sensitivity, and specificity for each predictor variable and statistically compares models using the DeLong test.
Usage
aivalidation(
data,
predictorVars,
outcomeVar = NULL,
positiveLevel,
compareModels = FALSE,
youdensJ = FALSE,
matthewsCC = FALSE,
bootstrapCI = FALSE,
nBootstrap = 1000,
rocPlot = FALSE,
crossValidation = "none",
stratified = TRUE,
randomSeed = 42,
showExplanations = FALSE,
showSummaries = FALSE
)
Arguments
- data
the data as a data frame
- predictorVars
a vector of strings naming the predictor variables (AI scores, human scores, biomarkers, etc.) from data. Limited to the first 5 for pairwise comparisons.
- outcomeVar
a string naming the binary outcome variable (gold standard) from data
- positiveLevel
the level of the outcome variable which represents the positive case
- compareModels
perform statistical comparison between models using the DeLong test for AUC comparison
- youdensJ
calculate and display Youden's J statistic (Sensitivity + Specificity - 1)
- matthewsCC
calculate and display Matthews Correlation Coefficient (MCC)
- bootstrapCI
use bootstrap resampling for confidence intervals (more robust for small samples)
- nBootstrap
number of bootstrap iterations (higher values are more accurate but slower)
- rocPlot
generate ROC curves for all predictor variables
- crossValidation
cross-validation method for model validation (simplified to avoid resource limits)
- stratified
maintain outcome variable proportions across folds
- randomSeed
random seed for reproducible cross-validation results
- showExplanations
show detailed methodology explanations
- showSummaries
show interpretation summaries of results
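The youdensJ and matthewsCC options both derive from the 2x2 confusion matrix at a chosen threshold. A minimal base-R sketch of those formulas, using made-up cell counts for illustration (this is not the package's internal implementation):

```r
# Illustrative confusion-matrix cell counts (hypothetical values)
tp <- 80; fn <- 20; tn <- 70; fp <- 30

sensitivity <- tp / (tp + fn)                 # 0.8
specificity <- tn / (tn + fp)                 # 0.7

# Youden's J: Sensitivity + Specificity - 1
youdens_j <- sensitivity + specificity - 1    # 0.5

# Matthews Correlation Coefficient (MCC)
mcc <- (tp * tn - fp * fn) /
  sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
```

Youden's J ranges from 0 (no better than chance) to 1 (perfect), while MCC ranges from -1 to 1 and remains informative under class imbalance.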
Value
A results object containing:
results$instructions | an HTML instructions panel
results$performanceTable | Performance metrics for each predictor variable
results$comparisonTable | Statistical comparison between predictor models using the DeLong test
results$cvPerformanceTable | Cross-validated performance metrics for each predictor
results$rocPlot | ROC curves for all predictor models
results$methodologyExplanation | an HTML methodology explanation
results$resultsInterpretation | an HTML interpretation summary
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$performanceTable$asDF
as.data.frame(results$performanceTable)
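For intuition about what the bootstrapCI and nBootstrap options do, here is a base-R sketch of a percentile bootstrap CI for AUC on synthetic data, with AUC computed via the Mann-Whitney rank formula (an illustration only, not the package's internal code):

```r
# AUC via the Mann-Whitney rank formula: P(score_pos > score_neg),
# with ties handled by midranks.
auc <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label)
  n_neg <- sum(!label)
  (sum(r[label]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Synthetic data: positives score higher on average
set.seed(42)
label <- rep(c(TRUE, FALSE), each = 100)
score <- rnorm(200, mean = ifelse(label, 1, 0))

# Percentile bootstrap: resample cases with replacement, recompute AUC
boot_auc <- replicate(1000, {
  i <- sample(seq_along(score), replace = TRUE)
  auc(score[i], label[i])
})
quantile(boot_auc, c(0.025, 0.975))  # percentile 95% CI
```

Resampling whole cases (score together with its label) preserves the score-outcome pairing; with stratified resampling the class proportions would additionally be held fixed in each replicate.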
Examples
# \donttest{
data('medical_ai_data', package='ClinicoPath')
aivalidation(data = medical_ai_data,
predictorVars = c('AI_score', 'human_score', 'biomarker1'),
outcomeVar = 'diagnosis',
positiveLevel = 'positive',
compareModels = TRUE)
#>
#> AI MODEL VALIDATION
#>
#> Model Performance Metrics
#> ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
#> Predictor AUC AUC 95% CI Lower AUC 95% CI Upper Sensitivity Specificity Optimal Threshold
#> ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
#> AI_score 0.6937368 0.6194502 0.7680235 0.8315789 0.5100000 0.38150000
#> human_score 0.6360526 0.5578771 0.7142281 0.4736842 0.7700000 0.58800000
#> biomarker1 0.6779474 0.6026435 0.7532512 0.7052632 0.6200000 -0.07500000
#> ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
#>
#>
#> Model Comparison (DeLong Test)
#> ───────────────────────────────────────────────────────────────────────────────────────────
#> Comparison AUC (Model 1) AUC (Model 2) Difference p-value
#> ───────────────────────────────────────────────────────────────────────────────────────────
#> AI_score vs human_score 0.6937368 0.6360526 0.05768421 0.0063090
#> AI_score vs biomarker1 0.6937368 0.6779474 0.01578947 0.7837605
#> human_score vs biomarker1 0.6360526 0.6779474 -0.04189474 0.4655825
#> ───────────────────────────────────────────────────────────────────────────────────────────
#>
#>
#> Cross-Validation Performance
#> ────────────────────────────────────────────────────────────────────────────────────────────────────────────────
#> Predictor Mean AUC SD AUC Mean Sensitivity SD Sensitivity Mean Specificity SD Specificity
#> ────────────────────────────────────────────────────────────────────────────────────────────────────────────────
#> ────────────────────────────────────────────────────────────────────────────────────────────────────────────────
#>
# }