Usage
tree(
  data,
  vars,
  facs,
  target,
  targetLevel,
  train,
  trainLevel,
  imputeMissing = FALSE,
  balanceClasses = FALSE,
  scaleFeatures = FALSE,
  clinicalMetrics = FALSE,
  featureImportance = FALSE,
  showInterpretation = FALSE,
  showPlot = FALSE,
  minCases = 10,
  maxDepth = 4,
  confidenceInterval = FALSE,
  riskStratification = FALSE,
  exportPredictions = FALSE,
  clinicalContext = "diagnosis",
  costRatio = 1,
  prevalenceAdjustment = FALSE,
  expectedPrevalence = 10
)
Arguments
- data
The data as a data frame containing clinical variables, biomarkers, and patient outcomes.
- vars
Continuous variables such as biomarker levels, age, laboratory values, or quantitative pathological measurements.
- facs
Categorical variables such as tumor grade, stage, histological type, or patient demographics.
- target
Primary outcome variable: disease status, treatment response, survival status, or diagnostic category.
- targetLevel
Level representing disease presence, positive outcome, or event of interest.
- train
Variable indicating training vs validation cohorts. If not provided, data will be split automatically.
- trainLevel
Level indicating the training/discovery cohort.
- imputeMissing
Impute missing values using medically appropriate methods (median within disease groups for continuous, mode for categorical).
- balanceClasses
Balance classes to handle rare diseases or imbalanced outcomes. Recommended for disease prevalence <20%.
- scaleFeatures
Standardize continuous variables (useful when combining biomarkers with different scales/units).
- clinicalMetrics
Display sensitivity, specificity, predictive values, likelihood ratios, and other clinical metrics.
- featureImportance
Identify the most important clinical variables and biomarkers for the decision tree.
- showInterpretation
Provide clinical interpretation of results, including diagnostic utility and clinical recommendations.
- showPlot
Display a visual representation of the decision tree.
- minCases
Minimum number of cases required in each terminal node (higher values help prevent overfitting).
- maxDepth
Maximum depth of the decision tree (deeper trees may overfit).
- confidenceInterval
Display confidence intervals for performance metrics.
- riskStratification
Analyze risk stratification performance and create risk categories based on tree predictions.
- exportPredictions
Add predicted classifications and probabilities to the dataset.
- clinicalContext
Clinical context, which affects interpretation thresholds and recommendations (e.g., screening requires high sensitivity).
- costRatio
Relative cost of missing a case vs. a false alarm. Higher values favor sensitivity over specificity.
- prevalenceAdjustment
Adjust predictive values for the expected disease prevalence in the target population (which may differ from the study sample).
- expectedPrevalence
Expected disease prevalence in the target population, used for adjusted predictive value calculations.
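The imputeMissing description (median within disease groups for continuous variables, mode for categorical variables) can be illustrated with a minimal base R sketch. The helper `impute_by_group` and the `toy` data frame are hypothetical and are not part of the package. The sketch also assumes the mode is taken over the whole column, which this page does not specify.

```r
# Hypothetical helper (not the package's code): group-wise median imputation
# for numeric columns and overall-mode imputation for categorical columns.
impute_by_group <- function(df, group, num_vars, cat_vars) {
  for (v in num_vars) {
    # Median of the observed values within each outcome group
    grp_median <- ave(df[[v]], df[[group]],
                      FUN = function(x) median(x, na.rm = TRUE))
    missing <- is.na(df[[v]])
    df[[v]][missing] <- grp_median[missing]
  }
  for (v in cat_vars) {
    # Most frequent observed level; table() ignores NA by default.
    # Assumption: the mode is computed over the whole column, not per group.
    mode_level <- names(which.max(table(df[[v]])))
    df[[v]][is.na(df[[v]])] <- mode_level
  }
  df
}

# Toy data for illustration only
toy <- data.frame(
  diagnosis = c("cancer", "cancer", "cancer", "benign", "benign", "benign"),
  PSA       = c(8.0, NA, 12.0, 2.0, 3.0, NA),
  grade     = c("high", "high", NA, "low", NA, "high")
)

imputed <- impute_by_group(toy, group = "diagnosis",
                           num_vars = "PSA", cat_vars = "grade")
print(imputed)
```

Imputing within outcome groups uses the target variable, so in a validation setting the medians would normally be estimated on the training cohort only.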
A results object containing:
results$todo | an html
results$text1 | a preformatted
results$text2 | a preformatted
results$text2a | a preformatted
results$text2b | a preformatted
results$text3 | a preformatted
results$text4 | an html
results$dataQuality | a preformatted
results$missingDataReport | a table
results$modelSummary | an html
results$clinicalMetrics | a table
results$clinicalInterpretation | an html
results$featureImportance | a table
results$riskStratification | a table
results$confusionMatrix | a table
results$adjustedMetrics | a table
results$plot | an image
results$deploymentGuidelines | an html
results$predictions | an output
results$probabilities | an output
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$missingDataReport$asDF
as.data.frame(results$missingDataReport)
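As a rough illustration of the quantities behind the clinicalMetrics and adjustedMetrics tables, the sketch below derives sensitivity, specificity, predictive values, and likelihood ratios from 2x2 counts, then recomputes the predictive values for a different prevalence using Bayes' theorem. The function names and counts are hypothetical, and the sketch assumes that expectedPrevalence = 10 means 10%. The package's own calculations may differ.

```r
# Hypothetical helper: standard diagnostic metrics from confusion-matrix counts
diagnostic_metrics <- function(tp, fp, fn, tn) {
  sens <- tp / (tp + fn)
  spec <- tn / (tn + fp)
  list(
    sensitivity = sens,
    specificity = spec,
    ppv         = tp / (tp + fp),        # valid at the sample prevalence only
    npv         = tn / (tn + fn),
    lr_positive = sens / (1 - spec),
    lr_negative = (1 - sens) / spec
  )
}

# Hypothetical helper: predictive values at a chosen prevalence (a proportion),
# obtained from sensitivity and specificity via Bayes' theorem
adjust_predictive_values <- function(sens, spec, prevalence) {
  ppv <- sens * prevalence /
    (sens * prevalence + (1 - spec) * (1 - prevalence))
  npv <- spec * (1 - prevalence) /
    (spec * (1 - prevalence) + (1 - sens) * prevalence)
  c(ppv = ppv, npv = npv)
}

# Illustrative counts only: 100 diseased and 200 non-diseased cases
m <- diagnostic_metrics(tp = 80, fp = 30, fn = 20, tn = 170)

# Assumption: an expected prevalence of 10 is a percentage, so use 0.10
adj <- adjust_predictive_values(m$sensitivity, m$specificity, prevalence = 0.10)

print(round(unlist(m), 3))
print(round(adj, 3))
```

Because the sample prevalence here (one third) is higher than the assumed target prevalence, the adjusted positive predictive value is lower than the sample value, while the negative predictive value rises.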
Enhanced decision tree analysis for medical research, pathology and
oncology. Provides clinical performance metrics, handles missing data
appropriately, and offers interpretations relevant to medical
decision-making.
# Example for cancer diagnosis
data(cancer_biomarkers)
tree(
  data = cancer_biomarkers,
  vars = c("PSA", "age", "tumor_size"),
  facs = c("grade", "stage"),
  target = "diagnosis",
  targetLevel = "cancer",
  train = "cohort",
  trainLevel = "discovery",
  imputeMissing = TRUE,
  balanceClasses = TRUE
)
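To show what limits such as minCases and maxDepth do to a fitted tree, here is a minimal sketch using the rpart package. This page does not state which engine tree() uses internally, so rpart and the mapping of minCases to minbucket and maxDepth to maxdepth are assumptions. The kyphosis data set that ships with rpart stands in for a clinical data set.

```r
library(rpart)  # assumption: a CART implementation chosen for illustration

# Fit a classification tree with at least 10 cases per terminal node
# and at most 4 levels of splits below the root
fit <- rpart(
  Kyphosis ~ Age + Number + Start,
  data    = kyphosis,
  method  = "class",
  control = rpart.control(minbucket = 10, maxdepth = 4)
)

# rpart numbers nodes so that node n sits at depth floor(log2(n))
node_ids   <- as.integer(rownames(fit$frame))
tree_depth <- max(floor(log2(node_ids)))

# Sizes of the terminal nodes (rows labelled "<leaf>" in the frame)
leaf_sizes <- fit$frame$n[fit$frame$var == "<leaf>"]

print(fit)
cat("Depth:", tree_depth, " Smallest leaf:", min(leaf_sizes), "\n")
```

Raising the minimum leaf size or lowering the maximum depth yields a smaller tree that is easier to interpret and less prone to overfitting, at the cost of possibly missing finer structure.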