Medical Decision Trees — treemedical • ClinicoPath

Simple decision tree analysis for medical research and clinical decision making. Uses an enhanced CART algorithm optimized for healthcare applications, with clinical validation, performance metrics, and medical interpretation guidelines.

Usage

treemedical(
  data,
  vars = NULL,
  facs = NULL,
  target,
  targetLevel,
  validation = "cv",
  cv_folds = 5,
  bootstrap_samples = 100,
  stratified_sampling = TRUE,
  holdout_split = 0.75,
  handle_missing = "remove",
  max_depth = 5,
  min_samples_split = 20,
  cost_complexity = 0.01,
  use_1se_rule = TRUE,
  clinical_context = "diagnosis",
  cost_sensitive = FALSE,
  fn_fp_ratio = 2,
  show_tree_plot = TRUE,
  show_performance_metrics = TRUE,
  show_confusion_matrix = TRUE,
  show_importance_plot = TRUE,
  show_clinical_interpretation = TRUE,
  set_seed = TRUE,
  seed_value = 42
)

Arguments

data: The data as a data frame containing clinical variables and outcomes.
vars: Continuous variables such as biomarker levels, age, laboratory values, or quantitative measurements.
facs: Categorical variables such as tumor grade, stage, histological type, or patient demographics.
target: Primary outcome variable: disease status, treatment response, or diagnostic category.
targetLevel: Level representing disease presence or positive outcome.
validation: Validation approach for assessing model performance.
cv_folds: Number of folds for cross-validation.
bootstrap_samples: Number of bootstrap samples for bootstrap validation.
stratified_sampling: Maintain class proportions in train/test splits.
holdout_split: Proportion of the data used for training in holdout validation; the remainder is used for testing.
handle_missing: How to handle missing values in predictor variables.
max_depth: Maximum depth of the decision tree. Clinical trees typically have 3 to 6 levels.
min_samples_split: Minimum number of samples required to split a node.
cost_complexity: Complexity parameter that controls tree pruning; lower values produce larger, more complex trees.
use_1se_rule: Select the simplest tree within one standard error (SE) of the optimal performance.
clinical_context: Clinical application context, which determines the interpretation guidelines shown.
cost_sensitive: Account for different costs of false positives versus false negatives.
fn_fp_ratio: Cost of a missed positive case (false negative) relative to a false alarm (false positive).
show_tree_plot: Display visual representation of the decision tree.
show_performance_metrics: Display accuracy, sensitivity, specificity, AUC, and clinical metrics.
show_confusion_matrix: Display confusion matrix with clinical interpretations.
show_importance_plot: Display ranking of most important clinical variables.
show_clinical_interpretation: Display clinical interpretation and usage guidelines.
set_seed: Set random seed for reproducible results.
seed_value: Seed value for random number generation.
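
The tuning arguments above are easier to read with a concrete tree in hand. The following is a minimal sketch, not the implementation behind treemedical: it assumes the rpart package as the CART engine and uses rpart's bundled kyphosis data, neither of which is confirmed by this page. It shows one plausible way a stratified 75/25 holdout split, a false-negative/false-positive cost ratio, the depth and split limits, and the 1-SE pruning rule could fit together.

```r
# Sketch only: rpart and the kyphosis data are assumptions for illustration,
# not the confirmed engine or data behind treemedical().
library(rpart)

set.seed(42)                       # mirrors set_seed = TRUE, seed_value = 42
dat <- kyphosis                    # example data bundled with rpart
target <- "Kyphosis"
positive <- "present"              # plays the role of targetLevel

# Stratified holdout: draw 75% of the rows within each outcome class.
holdout_split <- 0.75
train_idx <- unlist(lapply(
  split(seq_len(nrow(dat)), dat[[target]]),
  function(rows) rows[sample.int(length(rows), floor(holdout_split * length(rows)))]
), use.names = FALSE)
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Loss matrix: rows are the true class, columns the predicted class.
# A missed positive (false negative) costs fn_fp_ratio times a false alarm.
fn_fp_ratio <- 2
lev <- levels(train[[target]])
pos <- match(positive, lev)
neg <- 3 - pos                     # the other level of a two-level outcome
loss <- matrix(0, nrow = 2, ncol = 2)
loss[pos, neg] <- fn_fp_ratio
loss[neg, pos] <- 1

# Grow the tree with limits analogous to max_depth, min_samples_split,
# cost_complexity, and cv_folds.
fit <- rpart(
  Kyphosis ~ Age + Number + Start,
  data = train, method = "class",
  parms = list(loss = loss),
  control = rpart.control(maxdepth = 5, minsplit = 20, cp = 0.01, xval = 5)
)

# 1-SE rule: among trees whose cross-validated error is within one standard
# error of the minimum, keep the one with the fewest splits.
cp_tab <- fit$cptable
pruned <- fit
if ("xerror" %in% colnames(cp_tab)) {
  best   <- which.min(cp_tab[, "xerror"])
  cutoff <- cp_tab[best, "xerror"] + cp_tab[best, "xstd"]
  chosen <- which(cp_tab[, "xerror"] <= cutoff)[1]
  pruned <- prune(fit, cp = cp_tab[chosen, "CP"])
}

# Sensitivity and specificity on the held-out rows.
pred  <- predict(pruned, newdata = test, type = "class")
truth <- test[[target]]
tp <- sum(pred == positive & truth == positive)
fn <- sum(pred != positive & truth == positive)
tn <- sum(pred != positive & truth != positive)
fp <- sum(pred == positive & truth != positive)
sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
```

Raising fn_fp_ratio in this sketch pushes the tree toward predicting the positive class, which tends to trade specificity for sensitivity; on a test set this small the estimates are unstable and serve only to illustrate the mechanics.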

Value

A results object containing:

`results$instructions`              an html
`results$model_summary`             an html
`results$performance_table`         a table
`results$confusion_matrix`          a table
`results$variable_importance`       a table
`results$tree_plot`                 an image
`results$importance_plot`           an image
`results$clinical_interpretation`   an html

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$performance_table$asDF

as.data.frame(results$performance_table)

Examples

# Example for cancer diagnosis
library(ClinicoPath)
data(cancer_data)
treemedical(
    data = cancer_data,
    vars = c("PSA", "age"),
    facs = c("grade", "stage"),
    target = "diagnosis",
    targetLevel = "cancer",
    validation = "cv",
    show_tree_plot = TRUE
)
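
A cost-sensitive variant of the same call might look like the sketch below. It uses only arguments listed in the Usage section; the value fn_fp_ratio = 3 is an arbitrary illustration, and the code assumes that ClinicoPath is installed and that cancer_data loads as in the example above. Storing the return value makes the result tables available for conversion to data frames.

```r
# Sketch under assumptions: ClinicoPath is installed and cancer_data is
# available as in the example above. fn_fp_ratio = 3 is illustrative only.
if (requireNamespace("ClinicoPath", quietly = TRUE)) {
  library(ClinicoPath)
  data(cancer_data)

  results <- treemedical(
    data = cancer_data,
    vars = c("PSA", "age"),
    facs = c("grade", "stage"),
    target = "diagnosis",
    targetLevel = "cancer",
    validation = "cv",
    cv_folds = 5,
    cost_sensitive = TRUE,   # weigh missed cancers more heavily than false alarms
    fn_fp_ratio = 3,
    show_tree_plot = FALSE
  )

  # Convert result tables to data frames for further analysis.
  perf <- as.data.frame(results$performance_table)
  conf <- as.data.frame(results$confusion_matrix)
  print(perf)
  print(conf)
}
```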