Decision Tree and Markov Chain Analysis Examples

Overview

This vignette provides worked examples of the two decision-analysis types implemented in the ClinicoPath jamovi module: decision trees and Markov chains.

Decision Tree Analysis Example

Scenario: Acute Appendicitis Treatment Decision

Clinical Question: Should a patient with suspected appendicitis receive immediate surgery or conservative treatment?

Creating Example Data

# Create decision tree example data
set.seed(123)
n_patients <- 100

appendicitis_decision_data <- data.frame(
  patient_id = 1:n_patients,

  # DECISION NODES (Square boxes)
  treatment_choice = sample(c("Immediate Surgery", "Conservative Treatment"),
                           n_patients, replace = TRUE),

  # PROBABILITY VARIABLES (for chance nodes - circles)
  prob_appendicitis_confirmed = runif(n_patients, 0.7, 0.9),     # Probability patient actually has appendicitis
  prob_surgery_success = runif(n_patients, 0.95, 0.99),          # Surgery success rate
  prob_conservative_success = runif(n_patients, 0.6, 0.8),       # Conservative treatment success
  prob_complications_surgery = runif(n_patients, 0.02, 0.08),    # Surgery complications
  prob_complications_conservative = runif(n_patients, 0.15, 0.25), # Conservative complications

  # COST VARIABLES (outcomes)
  cost_surgery = rnorm(n_patients, 12000, 2000),                 # Surgery costs
  cost_conservative = rnorm(n_patients, 3000, 500),              # Conservative treatment costs
  cost_complications = rnorm(n_patients, 8000, 1500),            # Complication management costs
  cost_failed_conservative = rnorm(n_patients, 15000, 2500),     # Emergency surgery after failed conservative

  # UTILITY VARIABLES (quality of life outcomes)
  utility_success = runif(n_patients, 0.95, 1.0),               # Full recovery
  utility_minor_complications = runif(n_patients, 0.8, 0.9),    # Recovery with minor issues
  utility_major_complications = runif(n_patients, 0.6, 0.8),    # Recovery with major issues

  # OUTCOME VARIABLES (terminal nodes - triangles)
  clinical_outcome = sample(c("Complete Recovery", "Minor Complications", "Major Complications"),
                           n_patients, replace = TRUE, prob = c(0.8, 0.15, 0.05))
)

# Display first few rows
head(appendicitis_decision_data, 5)
#>   patient_id       treatment_choice prob_appendicitis_confirmed
#> 1          1      Immediate Surgery                   0.8199978
#> 2          2      Immediate Surgery                   0.7665647
#> 3          3      Immediate Surgery                   0.7977226
#> 4          4 Conservative Treatment                   0.8908948
#> 5          5      Immediate Surgery                   0.7965805
#>   prob_surgery_success prob_conservative_success prob_complications_surgery
#> 1            0.9595490                 0.7569151                 0.07916326
#> 2            0.9884944                 0.6018860                 0.02822405
#> 3            0.9740546                 0.7558132                 0.07431857
#> 4            0.9706012                 0.7458781                 0.05457811
#> 5            0.9661029                 0.7260264                 0.04372693
#>   prob_complications_conservative cost_surgery cost_conservative
#> 1                       0.1853606    10569.516          2963.222
#> 2                       0.1866441    10494.622          2415.674
#> 3                       0.1787100    10122.923          2682.626
#> 4                       0.1579973     9894.973          2985.579
#> 5                       0.1865454    11125.681          3335.348
#>   cost_complications cost_failed_conservative utility_success
#> 1           7097.161                 17685.03       0.9616620
#> 2           6509.452                 14931.63       0.9615327
#> 3           9540.178                 14916.67       0.9530863
#> 4           9126.592                 11209.83       0.9748559
#> 5           5736.250                 16975.96       0.9622063
#>   utility_minor_complications utility_major_complications    clinical_outcome
#> 1                   0.8938028                   0.7278372   Complete Recovery
#> 2                   0.8988003                   0.6249645   Complete Recovery
#> 3                   0.8456320                   0.6510532   Complete Recovery
#> 4                   0.8230615                   0.7641149 Minor Complications
#> 5                   0.8695489                   0.7607561 Minor Complications

Decision Tree Structure

The decision tree has the following structure:

DECISION NODE (Square): Treatment Choice
- Option A: Immediate Surgery
- Option B: Conservative Treatment

CHANCE NODES (Circles): Probabilistic Outcomes
- For Surgery:
  - Success (95-99%): Low cost, high utility
  - Complications (2-8%): Higher cost, lower utility
- For Conservative:
  - Success (60-80%): Low cost, high utility
  - Failure (20-40%): Requires emergency surgery

TERMINAL NODES (Triangles): Final Outcomes
- Each path ends with:
  - Cost: $3,000 - $20,000
  - Utility: 0.6 - 1.0 QALYs
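
The structure above can be rolled back by hand: the expected value of a chance node is the probability-weighted sum of its branches. The following is a minimal sketch, not the module's implementation. The helper `chance_node()` is hypothetical, and the point values are assumptions chosen near the midpoints of the ranges listed above rather than values taken from the simulated data.

```r
# Hypothetical helper: expected value of a chance node.
# Branch probabilities must sum to 1.
chance_node <- function(probs, values) {
  stopifnot(length(probs) == length(values),
            abs(sum(probs) - 1) < 1e-8)
  sum(probs * values)
}

# Illustrative point estimates (assumptions, roughly the range midpoints)
p_complication <- 0.05          # surgery complication probability
p_conservative_success <- 0.70  # conservative treatment success probability
cost_surgery <- 12000
cost_complications <- 8000
cost_conservative <- 3000
cost_failed_conservative <- 15000

# Surgery arm: complications add their management cost to the surgery cost
ev_cost_surgery <- chance_node(
  probs  = c(1 - p_complication, p_complication),
  values = c(cost_surgery, cost_surgery + cost_complications)
)

# Conservative arm: failures add the cost of emergency surgery
ev_cost_conservative <- chance_node(
  probs  = c(p_conservative_success, 1 - p_conservative_success),
  values = c(cost_conservative, cost_conservative + cost_failed_conservative)
)

c(surgery = ev_cost_surgery, conservative = ev_cost_conservative)
```

With these assumed inputs the rollback gives an expected cost of $12,400 for surgery and $7,500 for conservative treatment. Because `chance_node()` refuses branch probabilities that do not sum to 1, it also catches a common specification error early.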

Expected Value Calculations

# Calculate expected values for decision tree
surgery_expected_cost <- mean(appendicitis_decision_data$cost_surgery +
                             appendicitis_decision_data$prob_complications_surgery *
                             appendicitis_decision_data$cost_complications)

conservative_expected_cost <- mean(appendicitis_decision_data$cost_conservative +
                                  (1 - appendicitis_decision_data$prob_conservative_success) *
                                  appendicitis_decision_data$cost_failed_conservative)

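# Note: prob_surgery_success and prob_complications_surgery are sampled
# independently, so they do not sum to exactly 1 and the surgery utility
# below is an approximation.
surgery_expected_utility <- mean(appendicitis_decision_data$utility_success *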
                                appendicitis_decision_data$prob_surgery_success +
                                appendicitis_decision_data$utility_minor_complications *
                                appendicitis_decision_data$prob_complications_surgery)

conservative_expected_utility <- mean(appendicitis_decision_data$utility_success *
                                     appendicitis_decision_data$prob_conservative_success +
                                     appendicitis_decision_data$utility_major_complications *
                                     (1 - appendicitis_decision_data$prob_conservative_success))

# Calculate ICER
incremental_cost <- surgery_expected_cost - conservative_expected_cost
incremental_utility <- surgery_expected_utility - conservative_expected_utility
icer <- incremental_cost / incremental_utility

# Display results
cat("DECISION TREE ANALYSIS RESULTS:\n")
#> DECISION TREE ANALYSIS RESULTS:
cat("===============================\n")
#> ===============================
cat("Surgery Strategy:\n")
#> Surgery Strategy:
cat("  Expected Cost: $", round(surgery_expected_cost, 0), "\n")
#>   Expected Cost: $ 12315
cat("  Expected Utility:", round(surgery_expected_utility, 3), "QALYs\n")
#>   Expected Utility: 0.989 QALYs
cat("\n")
cat("Conservative Strategy:\n")
#> Conservative Strategy:
cat("  Expected Cost: $", round(conservative_expected_cost, 0), "\n")
#>   Expected Cost: $ 7454
cat("  Expected Utility:", round(conservative_expected_utility, 3), "QALYs\n")
#>   Expected Utility: 0.895 QALYs
cat("\n")
cat("Cost-Effectiveness Analysis:\n")
#> Cost-Effectiveness Analysis:
cat("  Incremental Cost: $", round(incremental_cost, 0), "\n")
#>   Incremental Cost: $ 4861
cat("  Incremental Utility:", round(incremental_utility, 3), "QALYs\n")
#>   Incremental Utility: 0.094 QALYs
cat("  ICER: $", round(icer, 0), "per QALY\n")
#>   ICER: $ 51744 per QALY

if (icer < 50000) {
  cat("  ✓ Surgery is cost-effective (ICER < $50,000/QALY)\n")
} else {
  cat("  ⚠ Surgery may not be cost-effective (ICER >= $50,000/QALY)\n")
}
#>   ⚠ Surgery may not be cost-effective (ICER >= $50,000/QALY)
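
The ICER is the ratio of the cost difference to the effect difference between two strategies. A small sketch of that calculation as a reusable function follows. The name `compute_icer()` is hypothetical, and returning `NA` when the effects are equal is an assumption about how to handle an undefined ratio.

```r
# Hypothetical helper: ICER of strategy A versus comparator B.
compute_icer <- function(cost_a, effect_a, cost_b, effect_b) {
  d_cost <- cost_a - cost_b
  d_effect <- effect_a - effect_b
  if (d_effect == 0) {
    return(NA_real_)  # ratio is undefined when effects are equal
  }
  d_cost / d_effect
}

# Rounded values printed above: surgery versus conservative treatment
icer_rounded <- compute_icer(cost_a = 12315, effect_a = 0.989,
                             cost_b = 7454,  effect_b = 0.895)
round(icer_rounded, 0)
```

Feeding in the rounded figures gives roughly $51,713 per QALY rather than the $51,744 reported above, because the printed costs and utilities have already been rounded. Near a decision threshold such as $50,000 per QALY, this kind of rounding can matter, so the unrounded values should be used for the actual comparison.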

Decision Tree Interpretation

Decision trees are ideal for:

- One-time decisions with immediate outcomes
- Comparing 2-3 distinct treatment strategies
- Situations where timing is not critical
- When outcomes occur relatively quickly (days to months)
- Point-in-time cost-effectiveness analysis

Markov Chain Analysis Example

Scenario: Chronic Heart Disease Management

Clinical Question: What is the long-term cost-effectiveness of different heart disease management strategies?

Creating Markov Data

# Create Markov chain example data
library(dplyr)  # provides case_when(), used below
set.seed(456)
n_strategies <- 150

# First create basic structure
heart_disease_markov_data <- data.frame(
  strategy_id = 1:n_strategies,

  # DECISION VARIABLES
  management_strategy = sample(c("Standard Care", "Intensive Monitoring", "Preventive Surgery"),
                              n_strategies, replace = TRUE),
  patient_risk_category = sample(c("Low Risk", "Moderate Risk", "High Risk"),
                                n_strategies, replace = TRUE, prob = c(0.4, 0.4, 0.2))
)

# Add transition probabilities
heart_disease_markov_data$prob_asymp_to_symp <- case_when(
  heart_disease_markov_data$management_strategy == "Standard Care" ~ runif(n_strategies, 0.08, 0.12),
  heart_disease_markov_data$management_strategy == "Intensive Monitoring" ~ runif(n_strategies, 0.05, 0.08),
  heart_disease_markov_data$management_strategy == "Preventive Surgery" ~ runif(n_strategies, 0.02, 0.05)
)

heart_disease_markov_data$prob_asymp_to_death <- runif(n_strategies, 0.01, 0.02)

# From Symptomatic state
heart_disease_markov_data$prob_symp_to_hf <- case_when(
  heart_disease_markov_data$management_strategy == "Standard Care" ~ runif(n_strategies, 0.15, 0.25),
  heart_disease_markov_data$management_strategy == "Intensive Monitoring" ~ runif(n_strategies, 0.10, 0.18),
  heart_disease_markov_data$management_strategy == "Preventive Surgery" ~ runif(n_strategies, 0.05, 0.12)
)

heart_disease_markov_data$prob_symp_to_death <- runif(n_strategies, 0.02, 0.04)

# From Heart Failure state
heart_disease_markov_data$prob_hf_to_death <- runif(n_strategies, 0.12, 0.20)

# STATE-SPECIFIC ANNUAL COSTS
heart_disease_markov_data$cost_asymptomatic <- case_when(
  heart_disease_markov_data$management_strategy == "Standard Care" ~ rnorm(n_strategies, 2000, 300),
  heart_disease_markov_data$management_strategy == "Intensive Monitoring" ~ rnorm(n_strategies, 4000, 500),
  heart_disease_markov_data$management_strategy == "Preventive Surgery" ~ rnorm(n_strategies, 8000, 1000)
)

heart_disease_markov_data$cost_symptomatic <- rnorm(n_strategies, 12000, 2000)
heart_disease_markov_data$cost_heart_failure <- rnorm(n_strategies, 35000, 5000)
heart_disease_markov_data$cost_death <- rep(0, n_strategies)  # No ongoing costs after death

# STATE-SPECIFIC ANNUAL UTILITIES (Quality of Life)
heart_disease_markov_data$utility_asymptomatic <- runif(n_strategies, 0.90, 0.95)
heart_disease_markov_data$utility_symptomatic <- runif(n_strategies, 0.70, 0.80)
heart_disease_markov_data$utility_heart_failure <- runif(n_strategies, 0.45, 0.60)
heart_disease_markov_data$utility_death <- rep(0, n_strategies)  # No utility after death

# Display first few rows
head(heart_disease_markov_data, 5)
#>   strategy_id  management_strategy patient_risk_category prob_asymp_to_symp
#> 1           1        Standard Care         Moderate Risk         0.10689583
#> 2           2        Standard Care              Low Risk         0.08634853
#> 3           3   Preventive Surgery         Moderate Risk         0.04734300
#> 4           4 Intensive Monitoring              Low Risk         0.05328650
#> 5           5        Standard Care              Low Risk         0.10740397
#>   prob_asymp_to_death prob_symp_to_hf prob_symp_to_death prob_hf_to_death
#> 1          0.01404140      0.18689151         0.03364169        0.1552658
#> 2          0.01487426      0.16608326         0.02370983        0.1613420
#> 3          0.01075082      0.06126667         0.03330762        0.1503590
#> 4          0.01182783      0.14630665         0.03988227        0.1420213
#> 5          0.01183278      0.18365249         0.02973036        0.1467501
#>   cost_asymptomatic cost_symptomatic cost_heart_failure cost_death
#> 1          2072.980        12282.412           39063.99          0
#> 2          1844.419         9932.018           39444.66          0
#> 3          7538.227        15406.619           40683.96          0
#> 4          3622.419        11813.663           31744.42          0
#> 5          1708.996        10052.269           31720.16          0
#>   utility_asymptomatic utility_symptomatic utility_heart_failure utility_death
#> 1            0.9041736           0.7071496             0.4531630             0
#> 2            0.9414969           0.7188245             0.4938009             0
#> 3            0.9272197           0.7512690             0.4734708             0
#> 4            0.9455013           0.7202432             0.5545275             0
#> 5            0.9349304           0.7587741             0.4832054             0

Markov Chain Structure

HEALTH STATES (Markov States):

- Asymptomatic Heart Disease
- Symptomatic Heart Disease
- Heart Failure
- Death (Absorbing State)

TRANSITION PATHWAYS:

Asymptomatic → Symptomatic → Heart Failure → Death
             ↘ Death        ↘ Death

Transition Matrix Example

# Create example transition matrix for Standard Care
states <- c("Asymptomatic", "Symptomatic", "Heart Failure", "Death")
trans_matrix_standard <- matrix(0, nrow = 4, ncol = 4)
rownames(trans_matrix_standard) <- states
colnames(trans_matrix_standard) <- states

# Fill transition matrix with average probabilities for Standard Care
standard_data <- heart_disease_markov_data[heart_disease_markov_data$management_strategy == "Standard Care", ]

trans_matrix_standard[1, 1] <- 1 - mean(standard_data$prob_asymp_to_symp) - mean(standard_data$prob_asymp_to_death)  # Stay asymptomatic
trans_matrix_standard[1, 2] <- mean(standard_data$prob_asymp_to_symp)  # Asymp to symptomatic
trans_matrix_standard[1, 4] <- mean(standard_data$prob_asymp_to_death)  # Asymp to death

trans_matrix_standard[2, 2] <- 1 - mean(standard_data$prob_symp_to_hf) - mean(standard_data$prob_symp_to_death)  # Stay symptomatic
trans_matrix_standard[2, 3] <- mean(standard_data$prob_symp_to_hf)  # Symp to heart failure
trans_matrix_standard[2, 4] <- mean(standard_data$prob_symp_to_death)  # Symp to death

trans_matrix_standard[3, 3] <- 1 - mean(standard_data$prob_hf_to_death)  # Stay in heart failure
trans_matrix_standard[3, 4] <- mean(standard_data$prob_hf_to_death)  # HF to death

trans_matrix_standard[4, 4] <- 1.0  # Death is absorbing

cat("EXAMPLE TRANSITION MATRIX (Standard Care):\n")
#> EXAMPLE TRANSITION MATRIX (Standard Care):
cat("=========================================\n")
#> =========================================
print(round(trans_matrix_standard, 3))
#>               Asymptomatic Symptomatic Heart Failure Death
#> Asymptomatic         0.884       0.100         0.000 0.015
#> Symptomatic          0.000       0.772         0.198 0.030
#> Heart Failure        0.000       0.000         0.845 0.155
#> Death                0.000       0.000         0.000 1.000
cat("\nRow sums (should equal 1.0):", round(rowSums(trans_matrix_standard), 3), "\n")
#> 
#> Row sums (should equal 1.0): 1 1 1 1
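
The row-sum check above can be extended into a fuller validity test. The sketch below uses a hypothetical helper, `is_valid_transition_matrix()`, and a matrix of illustrative values that are close to, but not identical with, the rounded Standard Care matrix.

```r
# Hypothetical helper: TRUE if P is a square matrix of probabilities
# whose rows each sum to 1 (within a numerical tolerance).
is_valid_transition_matrix <- function(P, tol = 1e-8) {
  is.matrix(P) &&
    nrow(P) == ncol(P) &&
    all(P >= 0 & P <= 1) &&
    all(abs(rowSums(P) - 1) < tol)
}

states <- c("Asymptomatic", "Symptomatic", "Heart Failure", "Death")

# Illustrative values (assumptions), close to the rounded Standard Care matrix
P <- matrix(c(0.885, 0.100, 0.000, 0.015,
              0.000, 0.772, 0.198, 0.030,
              0.000, 0.000, 0.845, 0.155,
              0.000, 0.000, 0.000, 1.000),
            nrow = 4, byrow = TRUE, dimnames = list(states, states))

valid_ok <- is_valid_transition_matrix(P)

# Break one row on purpose: Symptomatic now sums to 1.07
P_bad <- P
P_bad["Symptomatic", "Death"] <- 0.10
valid_bad <- is_valid_transition_matrix(P_bad)

c(valid_ok = valid_ok, valid_bad = valid_bad)
```

Running a check like this before the simulation is useful because a row that sums to more or less than 1 silently creates or destroys cohort members in every cycle.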

Markov Cohort Simulation

# Run Markov cohort simulation
num_cycles <- 20
cohort_trace <- matrix(0, nrow = num_cycles + 1, ncol = 4)
colnames(cohort_trace) <- states

# Initial distribution: Everyone starts asymptomatic
cohort_trace[1, 1] <- 1.0

# Run Markov simulation
for (cycle in 2:(num_cycles + 1)) {
  cohort_trace[cycle, ] <- cohort_trace[cycle - 1, ] %*% trans_matrix_standard
}

# Create summary table
trace_df <- data.frame(
  Year = 0:num_cycles,
  Asymptomatic = round(cohort_trace[, 1] * 100, 1),
  Symptomatic = round(cohort_trace[, 2] * 100, 1),
  Heart_Failure = round(cohort_trace[, 3] * 100, 1),
  Death = round(cohort_trace[, 4] * 100, 1)
)

# Show key time points
key_years <- c(1, 6, 11, 16, 21)  # Row indices for years 0, 5, 10, 15, 20
cat("Population distribution over time:\n")
#> Population distribution over time:
print(trace_df[key_years, ])
#>    Year Asymptomatic Symptomatic Heart_Failure Death
#> 1     0        100.0         0.0           0.0   0.0
#> 6     5         54.0        23.8          11.5  10.6
#> 11   10         29.2        19.4          21.3  30.1
#> 16   15         15.8        12.3          20.7  51.2
#> 21   20          8.5         7.1          15.9  68.4
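
To compare strategies, the same loop has to be run once per transition matrix. One way to do that is to wrap it in a function, sketched below. The helper `run_cohort()` and the three-state matrix are hypothetical and only illustrate the mechanics.

```r
# Hypothetical helper: propagate a cohort through a transition matrix.
# Returns an (n_cycles + 1) x n_states trace; row 1 is the initial distribution.
run_cohort <- function(P, init, n_cycles) {
  trace <- matrix(0, nrow = n_cycles + 1, ncol = ncol(P),
                  dimnames = list(as.character(0:n_cycles), colnames(P)))
  trace[1, ] <- init
  for (cycle in seq_len(n_cycles)) {
    # Row vector times matrix gives the next cycle's state distribution
    trace[cycle + 1, ] <- trace[cycle, ] %*% P
  }
  trace
}

# Illustrative three-state model (assumed values)
demo_states <- c("Well", "Sick", "Dead")
P_demo <- matrix(c(0.90, 0.08, 0.02,
                   0.00, 0.80, 0.20,
                   0.00, 0.00, 1.00),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(demo_states, demo_states))

demo_trace <- run_cohort(P_demo, init = c(1, 0, 0), n_cycles = 10)
round(demo_trace[c("0", "5", "10"), ], 3)
```

Each row of the trace should still sum to 1, and the proportion in the absorbing state can only grow from one cycle to the next. Both properties make quick sanity checks on any new model.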

Cost-Effectiveness Calculation

# Calculate costs and utilities for Standard Care
state_costs <- c(mean(standard_data$cost_asymptomatic),
                mean(standard_data$cost_symptomatic),
                mean(standard_data$cost_heart_failure),
                0)  # Death

state_utilities <- c(mean(standard_data$utility_asymptomatic),
                    mean(standard_data$utility_symptomatic),
                    mean(standard_data$utility_heart_failure),
                    0)  # Death

names(state_costs) <- states
names(state_utilities) <- states

cat("Annual costs per state:\n")
#> Annual costs per state:
print(round(state_costs, 0))
#>  Asymptomatic   Symptomatic Heart Failure         Death 
#>          1962         12246         35336             0
cat("\nAnnual utilities per state:\n")
#> 
#> Annual utilities per state:
print(round(state_utilities, 3))
#>  Asymptomatic   Symptomatic Heart Failure         Death 
#>         0.925         0.758         0.519         0.000

# Calculate cumulative discounted costs and utilities
discount_rate <- 0.03
cumulative_costs <- rep(0, num_cycles + 1)
cumulative_utilities <- rep(0, num_cycles + 1)

for (cycle in 2:(num_cycles + 1)) {
  # Calculate cycle costs and utilities
  cycle_cost <- sum(cohort_trace[cycle, ] * state_costs)
  cycle_utility <- sum(cohort_trace[cycle, ] * state_utilities)

  # Apply discounting
  discount_factor <- (1 + discount_rate)^(-(cycle - 1))

  cumulative_costs[cycle] <- cumulative_costs[cycle - 1] + cycle_cost * discount_factor
  cumulative_utilities[cycle] <- cumulative_utilities[cycle - 1] + cycle_utility * discount_factor
}

# Final results
final_cost <- cumulative_costs[num_cycles + 1]
final_qalys <- cumulative_utilities[num_cycles + 1]

cat("\nFINAL 20-YEAR RESULTS (Standard Care):\n")
#> 
#> FINAL 20-YEAR RESULTS (Standard Care):
cat("Total 20-Year Cost: $", round(final_cost, 0), "\n")
#> Total 20-Year Cost: $ 120561
cat("Total 20-Year QALYs:", round(final_qalys, 2), "\n")
#> Total 20-Year QALYs: 8.39
cat("Cost per QALY: $", round(final_cost / final_qalys, 0), "\n")
#> Cost per QALY: $ 14370
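
The discounting loop above can also be written without an explicit loop. The sketch below follows the same convention as the loop: row 1 of the trace (time 0) contributes nothing, and cycle t is discounted by (1 + r)^(-t). The function name `discounted_total()` and the small two-state trace are assumptions for illustration.

```r
# Hypothetical helper: discounted total of a per-state value over a cohort trace.
# trace: (n_cycles + 1) x n_states matrix, row 1 = time 0
discounted_total <- function(trace, state_values, discount_rate) {
  cycles <- seq_len(nrow(trace)) - 1                 # 0, 1, ..., n_cycles
  per_cycle <- as.vector(trace %*% state_values)     # expected value in each cycle
  discount <- (1 + discount_rate)^(-cycles)
  # Drop time 0, matching the loop above, which starts at cycle 2
  sum((per_cycle * discount)[-1])
}

# Illustrative two-state trace (assumed values): Alive, Dead
toy_trace <- rbind(c(1.00, 0.00),
                   c(0.50, 0.50),
                   c(0.25, 0.75))
toy_costs <- c(1000, 0)  # annual cost while alive, none after death

undiscounted <- discounted_total(toy_trace, toy_costs, discount_rate = 0)
discounted <- discounted_total(toy_trace, toy_costs, discount_rate = 0.03)
c(undiscounted = undiscounted, discounted = discounted)
```

With no discounting the toy cohort accrues 500 + 250 = 750 in cost. At a 3% rate the total is lower, because the second cycle is divided by 1.03 and the third by 1.03 squared.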

Markov Chain Interpretation

Markov chains are ideal for:

- Chronic diseases with multiple stages
- Long-term cost-effectiveness analysis (years to lifetime)
- Disease progression modeling
- Comparing interventions with different timing effects
- Policy decisions affecting population health
- When disease states change over time
- Recurring decisions or ongoing treatments

When to Use Each Method

Comparison Table

comparison_table <- data.frame(
  Aspect = c("Time Horizon", "Disease Type", "Decision Complexity", "Outcomes",
             "Costs", "Best For", "Data Requirements", "Computational Needs"),
  Decision_Tree = c("Short-term (days-months)", "Acute conditions", "Simple (2-3 options)",
                   "One-time outcomes", "One-time costs", "Treatment selection",
                   "Probabilities, costs, utilities", "Low"),
  Markov_Chain = c("Long-term (years-lifetime)", "Chronic conditions", "Complex strategies",
                  "Recurring outcomes", "Ongoing costs", "Disease management",
                  "Transition probabilities, state costs", "Higher")
)

print(comparison_table)
#>                Aspect                   Decision_Tree
#> 1        Time Horizon        Short-term (days-months)
#> 2        Disease Type                Acute conditions
#> 3 Decision Complexity            Simple (2-3 options)
#> 4            Outcomes               One-time outcomes
#> 5               Costs                  One-time costs
#> 6            Best For             Treatment selection
#> 7   Data Requirements Probabilities, costs, utilities
#> 8 Computational Needs                             Low
#>                            Markov_Chain
#> 1            Long-term (years-lifetime)
#> 2                    Chronic conditions
#> 3                    Complex strategies
#> 4                    Recurring outcomes
#> 5                         Ongoing costs
#> 6                    Disease management
#> 7 Transition probabilities, state costs
#> 8                                Higher

Example Applications

DECISION TREES are best for:

- Emergency treatment decisions (appendicitis, trauma)
- Surgical vs non-surgical interventions
- Diagnostic test decisions
- Vaccination decisions
- One-time screening decisions

MARKOV CHAINS are best for:

- Chronic disease management (diabetes, heart disease)
- Cancer progression and treatment
- Addiction treatment programs
- Preventive intervention policies
- Healthcare resource planning
- Long-term pharmaceutical studies

Combined Approaches

Some complex problems benefit from both:

- Decision tree for initial treatment choice
- Markov chain for long-term disease progression

Example: Cancer treatment selection (tree) + survival modeling (Markov)
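
One way to combine the two methods is to let the decision tree determine the starting distribution of the Markov cohort. The sketch below is purely illustrative. The state names, response probability, and transition matrix are all assumed values and do not come from the ClinicoPath module or from the datasets in this vignette.

```r
# Assumption: a single chance node after treatment, with a hypothetical
# probability of initial response
p_response <- 0.6

# Tree output becomes the Markov model's initial state distribution
cancer_states <- c("Remission", "Progressive", "Death")
init <- c(p_response, 1 - p_response, 0)
names(init) <- cancer_states

# Illustrative annual transition matrix (assumed values)
P_cancer <- matrix(c(0.85, 0.10, 0.05,
                     0.00, 0.70, 0.30,
                     0.00, 0.00, 1.00),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(cancer_states, cancer_states))

# Propagate the cohort for five annual cycles
state <- init
for (year in 1:5) {
  state <- as.vector(state %*% P_cancer)
}
names(state) <- cancer_states
round(state, 3)
```

Running the same Markov model with a different `p_response` for each treatment option gives one long-term trajectory per branch of the tree, which can then be costed and compared.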

Summary

This vignette demonstrates practical applications of both decision tree and Markov chain methods for medical decision analysis and cost-effectiveness research. The generated datasets can be used with the ClinicoPath jamovi module to perform these analyses interactively.

Generated Datasets

The example code creates two key datasets:

- appendicitis_decision_data: Decision tree example
- heart_disease_markov_data: Markov chain example

These datasets demonstrate realistic medical scenarios and can be used to practice both analysis types in jamovi.

Next Steps

- Load these datasets into jamovi
- Use the ClinicoPath decision analysis modules
- Practice interpreting cost-effectiveness results
- Apply these methods to your own research questions

ClinicoPath Development Team

2025-06-30