Usage

clinicalprediction(
  data,
  outcome_var,
  predictor_vars,
  patient_id,
  time_var,
  event_var,
  stratify_var,
  model_type = "random_forest",
  problem_type = "classification",
  outcome_type = "binary",
  feature_selection = TRUE,
  selection_method = "recursive_fe",
  feature_engineering = TRUE,
  handle_missing = "mice_imputation",
  train_proportion = 0.7,
  validation_method = "cv_10fold",
  hyperparameter_tuning = TRUE,
  tuning_method = "random_search",
  random_seed = 42,
  interpretability = TRUE,
  shap_analysis = TRUE,
  lime_analysis = FALSE,
  permutation_importance = TRUE,
  partial_dependence = TRUE,
  individual_explanations = FALSE,
  n_explanations = 10,
  performance_metrics = TRUE,
  calibration_analysis = TRUE,
  clinical_metrics = TRUE,
  roc_analysis = TRUE,
  threshold_optimization = TRUE,
  compare_models = FALSE,
  baseline_models = TRUE,
  ensemble_models = FALSE,
  risk_stratification = TRUE,
  n_risk_groups = 3,
  nomogram = TRUE,
  decision_curve = TRUE,
  external_validation = TRUE,
  bootstrap_ci = 1000,
  stability_analysis = TRUE,
  bias_analysis = TRUE,
  detailed_output = TRUE,
  clinical_report = TRUE,
  save_model = FALSE,
  export_predictions = FALSE,
  regulatory_documentation = TRUE
)

Arguments

data

The data as a data frame

outcome_var

Target variable for prediction (binary, continuous, or survival)

predictor_vars

Variables to use as predictors in the model

patient_id

Patient identifier for tracking predictions

time_var

Time-to-event variable for survival prediction models (see the example sketch after this argument list)

event_var

Event indicator for survival prediction models

stratify_var

Variable for stratified sampling and validation

model_type

Type of machine learning model to fit

problem_type

Type of prediction problem

outcome_type

Specific type of outcome variable

feature_selection

Perform automated feature selection

selection_method

Method for automatic feature selection

feature_engineering

Perform automated feature engineering

handle_missing

Method for handling missing data

train_proportion

Proportion of data for training (default 0.7, i.e. a 70/30 train/test split)

validation_method

Method for model validation

hyperparameter_tuning

Perform automated hyperparameter optimization

tuning_method

Method for hyperparameter tuning

random_seed

Random seed for reproducibility

interpretability

Generate model interpretability analysis

shap_analysis

Generate SHAP (SHapley Additive exPlanations) values

lime_analysis

Generate LIME (Local Interpretable Model-agnostic Explanations)

permutation_importance

Calculate permutation feature importance

partial_dependence

Generate partial dependence plots for key features

individual_explanations

Explain individual predictions using SHAP/LIME

n_explanations

Number of individual predictions to explain in detail

performance_metrics

Calculate comprehensive performance metrics

calibration_analysis

Assess model calibration

clinical_metrics

Calculate clinical decision analysis metrics

roc_analysis

Perform ROC curve analysis with confidence intervals

threshold_optimization

Optimize prediction threshold for clinical use

compare_models

Compare multiple model types

baseline_models

Include simple baseline models for comparison

ensemble_models

Create ensemble of best performing models

risk_stratification

Create risk stratification groups

n_risk_groups

Number of risk stratification groups (e.g., low/medium/high)

nomogram

Create clinical nomogram for risk calculation

decision_curve

Perform decision curve analysis for clinical utility

external_validation

Prepare model for external validation

bootstrap_ci

Number of bootstrap samples for confidence intervals

stability_analysis

Assess model stability across different samples

bias_analysis

Analyze model bias across demographic groups

detailed_output

Include detailed model diagnostics and explanations

clinical_report

Generate clinical interpretation report

save_model

Save trained model for future predictions

export_predictions

Export individual predictions with probabilities

regulatory_documentation

Include documentation for regulatory submission

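The survival-related arguments (time_var, event_var) and stratify_var are only used when the prediction target is a time-to-event outcome. A minimal sketch of such a call follows; the option strings for problem_type and outcome_type and the column names are assumptions for illustration, since only the default values shown under Usage are documented here.

# Sketch: survival prediction with stratified validation. The column names and
# the "survival" / "time_to_event" option strings are assumptions, not documented values.
data('clinical_data')

clinicalprediction(
  data = clinical_data,
  outcome_var = "mortality",
  predictor_vars = c("age", "biomarker", "stage"),
  time_var = "followup_months",    # hypothetical time-to-event column
  event_var = "death_event",       # hypothetical event indicator column
  stratify_var = "stage",
  problem_type = "survival",       # assumed option value
  outcome_type = "time_to_event",  # assumed option value
  model_type = "random_forest",
  validation_method = "cv_10fold",
  random_seed = 42
)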
Value

A results object containing:

results$overview: a table
results$dataset_info: a table
results$feature_selection_results: a table
results$feature_engineering_summary: a table
results$performance_summary: a table
results$classification_metrics: a table
results$survival_metrics: a table
results$feature_importance: a table
results$shap_summary: a table
results$individual_explanations: a table
results$model_comparison: a table
results$risk_stratification: a table
results$decision_curve_analysis: a table
results$cross_validation_results: a table
results$stability_analysis: a table
results$bias_fairness_analysis: a table
results$hyperparameter_results: a table
results$clinical_interpretation: a table
results$regulatory_summary: a table
results$roc_plot: an image
results$calibration_plot: an image
results$feature_importance_plot: an image
results$shap_summary_plot: an image
results$shap_dependence_plot: an image
results$decision_curve_plot: an image
results$risk_distribution_plot: an image
results$model_comparison_plot: an image
results$stability_plot: an image
results$nomogram_plot: an image
Tables can be converted to data frames with asDF or as.data.frame. For example:

results$overview$asDF

as.data.frame(results$overview)

Develop and validate clinical prediction models using machine learning algorithms with interpretability analysis. Includes comprehensive model comparison, feature selection, cross-validation, and explainable AI through SHAP and LIME methods. Designed for clinical research applications with regulatory compliance features for model validation and documentation.

Examples

data('clinical_data')

clinicalprediction(
  data = clinical_data,
  outcome_var = "mortality",
  predictor_vars = c("age", "biomarker", "stage"),
  model_type = "random_forest",
  interpretability = TRUE,
  validation_method = "cv_10fold"
)
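Because every results$ component listed above supports the asDF / as.data.frame conversion, the fitted object can be post-processed in plain R. A minimal sketch, assuming the call is assigned to an object; the object name fit and the tables inspected are illustrative choices, not part of the module's interface.

fit <- clinicalprediction(
  data = clinical_data,
  outcome_var = "mortality",
  predictor_vars = c("age", "biomarker", "stage")
)

# Extract result tables as data frames for downstream reporting
perf <- fit$performance_summary$asDF           # documented asDF accessor
imp  <- as.data.frame(fit$feature_importance)  # equivalent as.data.frame form

head(perf)
head(imp)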