Decision Tree Graph Analysis: Comprehensive Guide and Testing
ClinicoPath Package
2025-06-30
Source: vignettes/meddecide-30-decisiontree-analysis-vignette-legacy.Rmd
Introduction to Decision Tree Graph Analysis
This vignette is a guide to the Decision Tree Graph module in the ClinicoPath package. The module draws decision trees for cost-effectiveness analysis using the conventional node shapes: squares for decision nodes, circles for chance nodes, and triangles for terminal nodes.
Key Features
- Multiple Tree Types: Simple decision trees, Markov models, and cost-effectiveness trees
- Node Visualization: Customizable shapes and colors for different node types
- Economic Analysis: Expected value calculations, ICERs, and net benefit analysis
- Sensitivity Analysis: One-way sensitivity analysis with tornado diagrams
- Flexible Layouts: Horizontal, vertical, and radial tree orientations
Test Datasets
We’ll use several comprehensive test datasets to demonstrate all features:
# Load test datasets
data("basic_decision_data") # Basic treatment comparison
data("markov_decision_data") # Markov model data
data("pharma_decision_data") # Drug comparison study
data("screening_decision_data") # Screening program analysis
data("minimal_test_data") # Simple functionality test
data("edge_case_data") # Edge cases and error testing
# Display dataset summaries
cat("Basic Decision Data:", nrow(basic_decision_data), "rows,", ncol(basic_decision_data), "columns\n")
#> Basic Decision Data: 100 rows, 18 columns
cat("Markov Decision Data:", nrow(markov_decision_data), "rows,", ncol(markov_decision_data), "columns\n")
#> Markov Decision Data: 200 rows, 19 columns
cat("Pharma Decision Data:", nrow(pharma_decision_data), "rows,", ncol(pharma_decision_data), "columns\n")
#> Pharma Decision Data: 150 rows, 19 columns
cat("Screening Decision Data:", nrow(screening_decision_data), "rows,", ncol(screening_decision_data), "columns\n")
#> Screening Decision Data: 120 rows, 24 columns
Basic Usage Examples
Example 1: Simple Treatment Comparison
Let’s start with a basic treatment comparison using the minimal test dataset:
# Examine the minimal test data structure
head(minimal_test_data)
#> # A tibble: 6 × 9
#>      id treatment prob1 prob2 cost1 cost2 utility1 utility2 outcome
#>   <int> <chr>     <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl> <chr>
#> 1     1 A          0.7   0.3   1000  1500     0.8      0.6  Failure
#> 2     2 A          0.8   0.2   1200  1800     0.85     0.65 Failure
#> 3     3 A          0.6   0.4    800  1200     0.75     0.55 Success
#> 4     4 A          0.9   0.1   1500  2250     0.9      0.7  Failure
#> 5     5 A          0.75  0.25  1100  1650     0.82     0.62 Failure
#> 6     6 B          0.5   0.5   2000  3000     0.6      0.4  Success
# This dataset contains:
# - treatment: Decision variable (A vs B)
# - prob1, prob2: Probability variables
# - cost1, cost2: Cost variables
# - utility1, utility2: Utility variables
# - outcome: Outcome variable
Creating a Basic Decision Tree
# In jamovi, you would:
# 1. Load the minimal_test_data
# 2. Go to meddecide > Decision > Decision Tree Graph
# 3. Set variables:
# - Decisions: treatment
# - Probabilities: prob1, prob2
# - Costs: cost1, cost2
# - Utilities: utility1, utility2
# - Outcomes: outcome
# 4. Choose layout: horizontal
# 5. Enable: Show Probabilities, Show Costs, Show Utilities
Example 2: Pharmaceutical Cost-Effectiveness Analysis
Using the comprehensive pharmaceutical dataset:
# Examine pharmaceutical data structure
head(pharma_decision_data)
#> # A tibble: 6 × 19
#>   study_id drug_regimen  dosing_strategy   administration prob_response
#>      <int> <chr>         <chr>             <chr>                  <dbl>
#> 1        1 Drug B        High Dose         Subcutaneous           0.670
#> 2        2 Drug C        Personalized Dose Oral                   0.830
#> 3        3 Drug A        Standard Dose     Subcutaneous           0.632
#> 4        4 Drug B        High Dose         IV                     0.541
#> 5        5 Drug B        High Dose         Oral                   0.649
#> 6        6 Standard Care High Dose         Subcutaneous           0.426
#> # ℹ 14 more variables: prob_severe_ae <dbl>, prob_discontinuation <dbl>,
#> # cost_drug_per_cycle <dbl>, cost_administration <dbl>,
#> # cost_monitoring <dbl>, cost_ae_management <dbl>, cost_progression <dbl>,
#> # utility_response <dbl>, utility_stable <dbl>, utility_progression <dbl>,
#> # utility_severe_ae <dbl>, progression_free_survival <dbl>,
#> # overall_survival <dbl>, quality_of_life_score <dbl>
# Key variables:
# - drug_regimen: Main decision variable (4 treatment options)
# - dosing_strategy: Secondary decision variable
# - prob_response, prob_severe_ae: Probability variables
# - cost_drug_per_cycle, cost_administration: Cost variables
# - utility_response, utility_stable: Utility variables
Advanced Configuration
# Advanced jamovi configuration:
# 1. Decisions: drug_regimen, dosing_strategy
# 2. Probabilities: prob_response, prob_severe_ae, prob_discontinuation
# 3. Costs: cost_drug_per_cycle, cost_administration, cost_monitoring
# 4. Utilities: utility_response, utility_stable, utility_progression
# 5. Tree Type: Cost-Effectiveness Tree
# 6. Layout: Horizontal
# 7. Color Scheme: Medical Theme
# 8. Enable Expected Values calculation
# 9. Set discount rate: 3%
# 10. Time horizon: 5 years
Testing All Module Arguments
Tree Type Options
Simple Decision Tree
- Best for basic treatment comparisons
- Minimal probability calculations
- Focus on direct outcomes
Markov Model Tree
- For multi-state disease progression
- Time-dependent transitions
- Suitable for chronic diseases
Cost-Effectiveness Tree
- Comprehensive economic evaluation
- ICER calculations
- Net benefit analysis
# Test each tree type with markov_decision_data:
# 1. Simple Decision Tree
# - Focus on treatment_strategy decisions
# - Use basic probabilities
# 2. Markov Model Tree
# - Include transition probabilities
# - Multi-cycle analysis
# - State-specific costs and utilities
# 3. Cost-Effectiveness Tree
# - Full economic evaluation
# - All cost components
# - Utility measurements
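To make the Markov option more concrete, the following is a minimal base R sketch of a cohort trace with state-specific costs and utilities. It does not use the module or `markov_decision_data`; the three states, the transition matrix, and all cost and utility values are made-up assumptions, and end-of-cycle discounting is assumed.

```r
# Hypothetical three-state model: rows are "from" states, columns are "to" states
states <- c("Stable", "Progressed", "Dead")
transition <- matrix(
  c(0.85, 0.10, 0.05,
    0.00, 0.80, 0.20,
    0.00, 0.00, 1.00),
  nrow = 3, byrow = TRUE, dimnames = list(states, states)
)
# Each row of a transition matrix must sum to 1
stopifnot(isTRUE(all.equal(unname(rowSums(transition)), rep(1, 3))))

# Illustrative per-cycle cost and utility for each state
state_cost    <- c(Stable = 2000, Progressed = 6000, Dead = 0)
state_utility <- c(Stable = 0.85, Progressed = 0.55, Dead = 0)

n_cycles <- 5     # one cycle per year (assumed)
rate     <- 0.03  # annual discount rate

# Cohort trace: row k holds the state distribution after k cycles
trace <- matrix(0, nrow = n_cycles, ncol = 3, dimnames = list(NULL, states))
current <- c(Stable = 1, Progressed = 0, Dead = 0)
for (cycle in seq_len(n_cycles)) {
  current <- as.vector(current %*% transition)
  trace[cycle, ] <- current
}

# Discounted totals across all cycles
discount    <- 1 / (1 + rate)^seq_len(n_cycles)
total_cost  <- sum((trace %*% state_cost) * discount)
total_qalys <- sum((trace %*% state_utility) * discount)
```

Each row of `trace` sums to 1, which is a quick check that the transition matrix was entered correctly.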
Layout Options Testing
# Test all layout orientations:
# 1. Horizontal Layout (default)
# - Tree flows left to right
# - Decision node on left, outcomes on right
# - Best for simple trees
# 2. Vertical Layout
# - Tree flows top to bottom
# - Good for presentation slides
# - Compact for wide trees
# 3. Radial Layout
# - Tree radiates from center
# - Artistic presentation
# - Good for complex trees with many branches
Display Options Testing
Node Shape Configuration
# Test node shape options:
# 1. Show Node Shapes = TRUE (default)
# - Squares for decision nodes
# - Circles for chance nodes
# - Triangles for terminal nodes
# - Clear visual distinction
# 2. Show Node Shapes = FALSE
# - All nodes as circles
# - Color coding only
# - Simpler appearance
Label Display Options
# Test label configurations:
# 1. Show Probabilities = TRUE
# - Display "p=0.75" on branches
# - Help interpret chance outcomes
# - Essential for probability assessment
# 2. Show Costs = TRUE
# - Display "Cost: $15,000" on terminal nodes
# - Critical for cost-effectiveness
# - Currency formatting
# 3. Show Utilities = TRUE
# - Display "Utility: 0.85" on terminal nodes
# - Quality of life measures
# - QALY calculations
# 4. Show Node Labels = TRUE
# - Descriptive text on nodes
# - Treatment names, outcome descriptions
# - Improve interpretation
# 5. Show Branch Labels = TRUE
# - Text on connecting lines
# - Probability values, condition names
# - Decision pathway clarity
Color Scheme Testing
# Test all color schemes:
# 1. Default Theme
# - Green decisions, blue chance, orange terminals
# - Standard clinical colors
# - Good general purpose
# 2. Colorblind Friendly
# - Carefully selected colors
# - Accessible to colorblind users
# - High contrast options
# 3. Medical Theme
# - Professional medical colors
# - Suitable for clinical presentations
# - Conservative appearance
# 4. Economic Theme
# - Colors representing cost/benefit
# - Green for savings, red for costs
# - Financial analysis focus
Analysis Options Testing
Expected Value Calculations
# Test expected value features:
# 1. Calculate Expected Values = TRUE
# - Automatic cost and utility calculations
# - Probability-weighted outcomes
# - Decision tree rollback analysis
# Configuration with screening_decision_data:
# - Multiple screening strategies
# - Cost per test, diagnostic workup
# - Utilities for different health states
# - Life years gained calculations
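Rollback analysis can be illustrated outside the module with a small recursive function. This is only a sketch: the nested-list representation and the field names (`type`, `value`, `prob`, `children`) are assumptions made for the example, not the module's internal structure, and the decision rule shown maximizes utility (for costs you would minimize instead).

```r
# Recursively roll back a tree stored as nested lists.
# Node fields (type, value, prob, children) are illustrative, not module API.
rollback <- function(node) {
  if (node$type == "terminal") {
    return(node$value)
  }
  child_values <- vapply(node$children, rollback, numeric(1))
  if (node$type == "chance") {
    # Chance node: probability-weighted average of the child values
    probs <- vapply(node$children, function(ch) ch$prob, numeric(1))
    stopifnot(isTRUE(all.equal(sum(probs), 1)))
    return(sum(probs * child_values))
  }
  # Decision node: choose the branch with the highest expected value
  max(child_values)
}

# Helper to build a terminal node
terminal <- function(value, prob = NA_real_) {
  list(type = "terminal", value = value, prob = prob)
}

# Two strategies, each followed by a chance node with two outcomes
tree <- list(
  type = "decision",
  children = list(
    list(type = "chance", children = list(
      terminal(0.80, prob = 0.7),
      terminal(0.60, prob = 0.3)
    )),
    list(type = "chance", children = list(
      terminal(0.60, prob = 0.5),
      terminal(0.40, prob = 0.5)
    ))
  )
)

best_utility <- rollback(tree)
```

The first strategy rolls back to 0.7 × 0.80 + 0.3 × 0.60 = 0.74 and the second to 0.50, so the decision node takes the value 0.74.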
Sensitivity Analysis
# Test sensitivity analysis options:
# 1. Sensitivity Analysis = TRUE
# - One-way sensitivity analysis
# - Parameter variation testing
# - Robust decision making
# 2. Tornado Diagram = TRUE
# - Visual sensitivity results
# - Parameters ranked by impact
# - Range of outcomes displayed
# Using pharma_decision_data for sensitivity:
# - Vary drug efficacy (prob_response)
# - Vary cost parameters
# - Vary utility values
# - Assess decision robustness
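The ranking behind a tornado diagram can be reproduced by hand. The sketch below varies one parameter at a time and sorts parameters by the resulting swing in net monetary benefit. All parameter names, base-case values, ranges, and the willingness-to-pay threshold are hypothetical; they are not taken from `pharma_decision_data` or from the module's settings.

```r
# Hypothetical base-case parameters for a single strategy
base_case <- c(prob_response = 0.65, cost_drug = 5000,
               utility_response = 0.85, utility_no_response = 0.50)

# Low and high values for each parameter (illustrative ranges)
ranges <- list(
  prob_response       = c(0.50, 0.80),
  cost_drug           = c(4000, 6000),
  utility_response    = c(0.75, 0.95),
  utility_no_response = c(0.40, 0.60)
)

wtp <- 50000  # assumed willingness-to-pay per QALY

# Net monetary benefit of the strategy for a given parameter set
net_benefit <- function(p) {
  qaly <- p[["prob_response"]] * p[["utility_response"]] +
    (1 - p[["prob_response"]]) * p[["utility_no_response"]]
  wtp * qaly - p[["cost_drug"]]
}

# Vary one parameter at a time, holding the others at their base-case values
tornado <- do.call(rbind, lapply(names(ranges), function(name) {
  low  <- base_case; low[[name]]  <- ranges[[name]][1]
  high <- base_case; high[[name]] <- ranges[[name]][2]
  data.frame(parameter = name,
             nb_low  = net_benefit(low),
             nb_high = net_benefit(high))
}))

# Rank parameters by the width of their effect on net benefit
tornado$swing <- abs(tornado$nb_high - tornado$nb_low)
tornado <- tornado[order(-tornado$swing), ]
```

With these made-up ranges, `utility_response` produces the widest swing and `cost_drug` the narrowest, so they would appear at the top and bottom of the tornado diagram respectively.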
Economic Parameters
# Test economic parameter settings:
# 1. Discount Rate testing:
# - 0% (no discounting)
# - 3% (standard health economics)
# - 5% (conservative approach)
# - 7% (high discount rate)
# 2. Time Horizon testing:
# - 1 year (short-term analysis)
# - 5 years (medium-term)
# - 10 years (long-term)
# - Lifetime (maximum horizon)
# Impact on net present value calculations
# Future cost and benefit discounting
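The effect of the discount rate and time horizon can be checked with a short present-value calculation. This sketch assumes a constant annual amount discounted at the end of each year; the module's own discounting convention may differ, and `present_value` is a helper defined here for illustration only.

```r
# Present value of a constant annual amount received at the end of years 1..horizon
present_value <- function(annual_amount, rate, horizon) {
  years <- seq_len(horizon)
  sum(annual_amount / (1 + rate)^years)
}

rates    <- c(0, 0.03, 0.05, 0.07)
horizons <- c(1, 5, 10)

# Rows are discount rates, columns are time horizons (years)
pv_grid <- outer(
  rates, horizons,
  Vectorize(function(r, h) present_value(1000, r, h))
)
dimnames(pv_grid) <- list(rate = rates, horizon = horizons)
pv_grid
```

At a 0% rate the present value is simply the annual amount times the horizon; higher rates shrink long-horizon totals more than short-horizon ones.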
Output Tables Testing
Summary Table Features
# Test summary table with basic_decision_data:
# Summary Table includes:
# - Strategy names
# - Expected costs (discounted)
# - Expected utilities (QALYs)
# - Incremental Cost-Effectiveness Ratios (ICERs)
# - Net benefit at willingness-to-pay thresholds
# Currency formatting for costs
# Decimal precision for utilities
# ICER calculation accuracy
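The ICER and net benefit columns of a summary table can be cross-checked with a few lines of base R. The strategies, costs, QALYs, and willingness-to-pay thresholds below are invented for illustration. The sketch compares each strategy with the next cheapest one and does not screen for dominated or extendedly dominated strategies.

```r
# Hypothetical expected costs and QALYs for three strategies
strategies <- data.frame(
  strategy = c("Standard Care", "Drug A", "Drug B"),
  cost     = c(10000, 18000, 30000),
  qaly     = c(2.0, 2.4, 2.7)
)

# Order by cost, then compare each strategy with the next cheapest one
strategies <- strategies[order(strategies$cost), ]
strategies$icer <- c(NA, diff(strategies$cost) / diff(strategies$qaly))

# Net monetary benefit at two assumed willingness-to-pay thresholds
wtp_values <- c(nmb_20k = 20000, nmb_50k = 50000)
for (name in names(wtp_values)) {
  strategies[[name]] <- wtp_values[[name]] * strategies$qaly - strategies$cost
}

strategies
```

The strategy with the highest net monetary benefit at a given threshold is the preferred one at that threshold, which gives a second check on the ICER ordering.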
Node Details Table
# Test node details table:
# Node Table includes:
# - Node ID (unique identifier)
# - Node Type (decision/chance/terminal)
# - Node Label (descriptive text)
# - Probability values
# - Cost values
# - Utility values
# Useful for:
# - Debugging tree structure
# - Verifying calculations
# - Detailed documentation
Edge Cases and Error Handling
Missing Data Testing
# Test with edge_case_data containing missing values
head(edge_case_data)
#> # A tibble: 6 × 13
#>      id treatment_missing prob_zero prob_one prob_negative prob_over_one
#>   <int> <chr>                 <dbl>    <dbl>         <dbl>         <dbl>
#> 1     1 Treatment                 0        1           0.5           0.5
#> 2     2 Treatment                 0        1           0.5           0.5
#> 3     3 Treatment                 0        1           0.5           0.5
#> 4     4 Treatment                 0        1           0.5           0.5
#> 5     5 Treatment                 0        1           0.5           0.5
#> 6     6 Treatment                 0        1           0.5           0.5
#> # ℹ 7 more variables: cost_zero <dbl>, cost_negative <dbl>,
#> # cost_very_high <dbl>, utility_negative <dbl>, utility_over_one <dbl>,
#> # single_treatment <chr>, many_categories <chr>
# Edge cases include:
# - Missing treatment assignments
# - Zero probabilities
# - Negative costs
# - Utilities outside 0-1 range
# - Single category variables
# Test error handling:
# 1. Missing Required Variables
# - No decision variables specified
# - No cost or utility data
# - Expected: Informative error message
# 2. Invalid Probability Values
# - Probabilities < 0 or > 1
# - Expected: Data validation warning
# 3. Negative Costs
# - Cost values < 0
# - Expected: Warning or automatic correction
# 4. Invalid Utility Values
# - Utilities < 0 or > 1
# - Expected: Range validation
# 5. Insufficient Data
# - Fewer than 2 decision options
# - Expected: Minimum data requirement message
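Before running the module on edge-case data, the same conditions can be screened in plain R. The helper below is a hypothetical pre-check written for this vignette, not part of ClinicoPath; its name, arguments, and messages are assumptions, and the module's own messages will differ.

```r
# Return a character vector of problems found in decision-tree inputs.
# Column-name arguments are supplied by the caller; nothing here is module API.
check_tree_inputs <- function(data, decision, probs, costs, utilities) {
  problems <- character(0)

  if (length(unique(stats::na.omit(data[[decision]]))) < 2) {
    problems <- c(problems, "Fewer than 2 decision options")
  }
  for (col in probs) {
    x <- data[[col]]
    if (any(x < 0 | x > 1, na.rm = TRUE)) {
      problems <- c(problems, paste0(col, ": probability outside [0, 1]"))
    }
  }
  for (col in costs) {
    if (any(data[[col]] < 0, na.rm = TRUE)) {
      problems <- c(problems, paste0(col, ": negative cost"))
    }
  }
  for (col in utilities) {
    x <- data[[col]]
    if (any(x < 0 | x > 1, na.rm = TRUE)) {
      problems <- c(problems, paste0(col, ": utility outside [0, 1]"))
    }
  }
  if (any(is.na(data[c(decision, probs, costs, utilities)]))) {
    problems <- c(problems, "Missing values present")
  }
  problems
}

# Small made-up data frame that triggers every check
toy <- data.frame(
  treatment = c("A", "A", NA),
  prob      = c(0.5, 1.2, -0.1),
  cost      = c(100, -50, 200),
  utility   = c(0.8, 1.1, 0.4)
)
issues <- check_tree_inputs(toy, "treatment", "prob", "cost", "utility")
```

Returning all problems at once, rather than stopping at the first, makes it easier to clean a dataset in a single pass.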
Performance Testing
Large Dataset Handling
# Test with large datasets:
# 1. Markov data (200 scenarios)
# - Complex multi-state model
# - Many transition probabilities
# - State-specific costs/utilities
# 2. Screening data (120 scenarios)
# - Multiple screening strategies
# - Population-specific parameters
# - Test performance characteristics
# Performance metrics:
# - Tree generation time
# - Plot rendering speed
# - Memory usage
# - Table population speed
Complex Tree Structures
# Test complex tree configurations:
# 1. Multiple Decision Variables
# - Primary and secondary decisions
# - Nested decision structures
# - Interaction effects
# 2. Many Outcome Branches
# - Multiple chance nodes
# - Numerous terminal outcomes
# - Complex probability trees
# 3. Deep Tree Hierarchies
# - Multi-level decisions
# - Sequential choices
# - Time-dependent paths
Comparison with Other Methods
Validation Against Manual Calculations
# Validate expected value calculations:
# Manual calculation example with minimal_test_data:
# Treatment A:
# - Expected Cost = (prob1 * cost1) + (prob2 * cost2)
# - Expected Utility = (prob1 * utility1) + (prob2 * utility2)
# Treatment B:
# - Similar calculations
# - Compare with module output
# ICER calculation:
# - (Cost_B - Cost_A) / (Utility_B - Utility_A)
# - Verify against summary table
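The manual check described above can be sketched in base R. The values are the six rows shown in the `head(minimal_test_data)` output earlier in this vignette; averaging the row-level expected values within each treatment is an assumption about how results are aggregated and may not match what the module does internally.

```r
# Rows copied from the head(minimal_test_data) output shown earlier
scenarios <- data.frame(
  treatment = c("A", "A", "A", "A", "A", "B"),
  prob1     = c(0.70, 0.80, 0.60, 0.90, 0.75, 0.50),
  prob2     = c(0.30, 0.20, 0.40, 0.10, 0.25, 0.50),
  cost1     = c(1000, 1200, 800, 1500, 1100, 2000),
  cost2     = c(1500, 1800, 1200, 2250, 1650, 3000),
  utility1  = c(0.80, 0.85, 0.75, 0.90, 0.82, 0.60),
  utility2  = c(0.60, 0.65, 0.55, 0.70, 0.62, 0.40)
)

# Probability-weighted cost and utility for each row
scenarios$expected_cost    <- with(scenarios, prob1 * cost1 + prob2 * cost2)
scenarios$expected_utility <- with(scenarios, prob1 * utility1 + prob2 * utility2)

# Average per treatment (assumed aggregation rule)
by_treatment <- aggregate(
  cbind(expected_cost, expected_utility) ~ treatment,
  data = scenarios, FUN = mean
)

# ICER of B relative to A
a <- by_treatment[by_treatment$treatment == "A", ]
b <- by_treatment[by_treatment$treatment == "B", ]
icer <- (b$expected_cost - a$expected_cost) /
  (b$expected_utility - a$expected_utility)
```

In these six rows, Treatment B has a higher expected cost and a lower expected utility than Treatment A, so the ICER is negative and B is dominated. A negative ICER should be reported as dominance rather than interpreted as a cost per unit of benefit.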
Sensitivity Analysis Validation
# Validate sensitivity analysis:
# 1. One-way sensitivity
# - Manually vary single parameters
# - Compare impact on outcomes
# - Verify tornado diagram rankings
# 2. Two-way sensitivity
# - Vary two parameters simultaneously
# - Create sensitivity matrices
# - Identify interaction effects
# 3. Probabilistic sensitivity
# - Monte Carlo simulation
# - Parameter uncertainty modeling
# - Confidence interval generation
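Probabilistic sensitivity analysis is not listed among the module's features, but it can be approximated outside the module for validation. The sketch below is a minimal Monte Carlo example under stated assumptions: the Beta and Gamma distributions, their parameters, the fixed utilities, and the willingness-to-pay threshold are all hypothetical.

```r
set.seed(42)    # reproducible draws
n_sim <- 5000
wtp   <- 50000  # assumed willingness-to-pay per QALY

# Hypothetical parameter distributions for two strategies
draws <- data.frame(
  p_resp_a = rbeta(n_sim, 60, 40),                      # response probability, A
  p_resp_b = rbeta(n_sim, 75, 25),                      # response probability, B
  cost_a   = rgamma(n_sim, shape = 100, rate = 0.02),   # mean cost 5000
  cost_b   = rgamma(n_sim, shape = 100, rate = 0.0125)  # mean cost 8000
)

# Fixed utilities for responders and non-responders (illustrative)
u_resp <- 0.85
u_none <- 0.50
qaly <- function(p) p * u_resp + (1 - p) * u_none

# Incremental results of B versus A for every simulated parameter set
inc_cost <- draws$cost_b - draws$cost_a
inc_qaly <- qaly(draws$p_resp_b) - qaly(draws$p_resp_a)
inc_nmb  <- wtp * inc_qaly - inc_cost

# Share of simulations in which B has the higher net benefit
prob_b_cost_effective <- mean(inc_nmb > 0)

# 95% interval for the incremental net benefit
nmb_interval <- quantile(inc_nmb, c(0.025, 0.975))
```

Repeating the calculation of `prob_b_cost_effective` over a range of `wtp` values would give the points of a cost-effectiveness acceptability curve.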
Best Practices and Recommendations
Data Preparation
- Variable Naming: Use descriptive names for clarity
- Data Validation: Check ranges and missing values
- Probability Constraints: Ensure the branch probabilities at each chance node sum to 1
- Cost Standardization: Use consistent currency and time periods
- Utility Scales: Maintain 0-1 range for utilities
Troubleshooting Common Issues
Data Issues
# Common problems and solutions:
# 1. "No data provided for analysis"
# Solution: Check data loading and variable selection
# 2. "Missing required variables"
# Solution: Specify at least one decision, cost, or utility variable
# 3. Probabilities don't sum to 1
# Solution: Normalize probability variables
# 4. Negative costs or utilities
# Solution: Check data entry and transformations
# 5. Tree too complex to display
# Solution: Simplify structure or use subsets
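For problem 3, one way to normalize probability variables is to rescale them row by row. This is a sketch that assumes each row describes one chance node whose branches are the listed columns; `normalize_probs` is a helper defined here, not a ClinicoPath function.

```r
# Rescale each row of the probability columns so that the row sums to 1
normalize_probs <- function(data, prob_cols) {
  totals <- rowSums(data[prob_cols])
  if (any(totals <= 0, na.rm = TRUE)) {
    stop("Cannot normalize rows whose probabilities sum to zero or less")
  }
  data[prob_cols] <- data[prob_cols] / totals
  data
}

# Made-up rows: the first already sums to 1, the other two do not
toy <- data.frame(prob1 = c(0.7, 0.6, 2), prob2 = c(0.3, 0.6, 6))
fixed <- normalize_probs(toy, c("prob1", "prob2"))
rowSums(fixed[c("prob1", "prob2")])
```

Normalization can hide data-entry errors, so it is worth inspecting rows whose original totals are far from 1 before rescaling them.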
Visualization Issues
# Visualization problems:
# 1. Overlapping node labels
# Solution: Reduce label length or change layout
# 2. Tree too wide/tall
# Solution: Adjust layout orientation
# 3. Colors not distinguishable
# Solution: Change color scheme
# 4. Missing plot elements
# Solution: Check display option settings
# 5. Poor plot quality
# Solution: Adjust figure dimensions
Advanced Applications
Conclusion
The Decision Tree Graph module provides a comprehensive toolkit for cost-effectiveness analysis in healthcare. Key strengths include:
- Flexibility: Multiple tree types and configurations
- Visual Appeal: Professional, publication-ready graphics
- Economic Rigor: Standard health economics calculations
- User-Friendly: Intuitive jamovi interface
- Comprehensive Output: Tables, plots, and sensitivity analysis
This vignette has outlined test scenarios covering the module's tree types, layouts, display options, economic parameters, output tables, and edge cases. The module is intended to handle both simple and complex decision problems while keeping the resulting trees readable.
For additional support and examples, consult the ClinicoPath documentation and consider the specific requirements of your decision analysis context.
Note: This vignette demonstrates the capabilities and testing approaches for the Decision Tree Graph module. Actual analysis results will depend on your specific data and research questions. Always validate calculations and interpretations within your clinical and economic context.