Usage
survivalbart(
  data,
  time,
  event,
  predictors,
  strata,
  model_type = "aft",
  prior_distribution = "normal",
  n_trees = 200,
  alpha = 0.95,
  beta = 2,
  k = 2,
  q = 0.9,
  nu = 3,
  n_burn = 1000,
  n_post = 2000,
  n_thin = 1,
  n_chains = 1,
  variable_selection = TRUE,
  sparse_prior = FALSE,
  selection_alpha = 1,
  predict_times = "1, 2, 5",
  survival_quantiles = "0.25, 0.5, 0.75",
  credible_level = 0.95,
  cross_validation = FALSE,
  cv_folds = 5,
  convergence_diagnostics = TRUE,
  posterior_prediction = FALSE,
  show_model_summary = TRUE,
  show_variable_importance = TRUE,
  show_survival_summary = TRUE,
  show_convergence = TRUE,
  plot_survival_curves = TRUE,
  plot_variable_importance = TRUE,
  plot_trace = FALSE,
  plot_posterior_predictive = FALSE,
  plot_partial_dependence = FALSE,
  plot_interactions = FALSE,
  probit_link = FALSE,
  cure_fraction_prior = "uniform",
  adaptive_scaling = TRUE,
  parallel_chains = FALSE,
  memory_optimization = TRUE,
  random_seed = 123
)
Arguments
- data
- The data as a data frame. 
- time
- Time to event variable (numeric). For right-censored data, this is the time from study entry to event or censoring. 
- event
- Event indicator variable. For survival analysis: 0 = censored, 1 = event. Binary coding required for current BART implementation. 
- predictors
- Variables to include in the BART survival model. Can include numeric, ordinal, and nominal variables. BART automatically handles variable types and interactions without preprocessing. 
- strata
- Optional stratification variable for stratified survival analysis. Creates separate BART models for each stratum. 
- model_type
- Type of survival model. AFT models log-survival times directly, PH models log-hazard ratios, cure models handle long-term survivors with mixture of cured and susceptible populations. 
- prior_distribution
- Prior distribution for error terms in AFT model. Normal corresponds to log-normal survival times, extreme value to Weibull survival, logistic to log-logistic survival. 
- n_trees
- Number of trees in the BART ensemble. More trees provide greater flexibility but increase computation time; 100-300 is a typical range. 
- alpha
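- Base parameter of the tree-depth prior. Under the standard BART prior, a node at depth d splits with probability alpha * (1 + d)^(-beta), so higher values make splits more likely and allow deeper trees with more complex interactions, while lower values favor shallower trees. 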
- beta
- Controls tree depth through splitting probability formula. Higher values discourage deep splits, promoting simpler trees. 
- k
- Scale parameter for terminal node prior. Controls magnitude of individual tree contributions to overall ensemble. 
- q
- Quantile for error variance prior specification. Higher values assume less noise in the survival process. 
- nu
- Degrees of freedom for inverse chi-square prior on error variance. Lower values allow more uncertainty in variance estimation. 
- n_burn
- Number of burn-in MCMC iterations to discard. Should be sufficient for convergence to stationary distribution. 
- n_post
- Number of posterior MCMC iterations to keep for inference. More iterations provide more precise posterior estimates. 
- n_thin
- Keep every nth posterior sample. Higher values store fewer draws with lower autocorrelation between them; thinning reduces storage but does not improve the mixing of the chain itself. 
- n_chains
- Number of independent MCMC chains. Multiple chains enable convergence diagnostics and robust posterior inference. 
- variable_selection
- Include variable selection probabilities in BART model. Automatically identifies relevant predictors and reduces overfitting with irrelevant variables. 
- sparse_prior
- Use sparse Dirichlet prior for variable selection probabilities. Promotes sparser models with fewer active variables. 
- selection_alpha
- Concentration parameter for Dirichlet prior on variable selection probabilities. Higher values favor uniform selection. 
- predict_times
- Comma-separated list of time points for survival probability predictions. Empty string uses data-driven quantiles. 
- survival_quantiles
- Comma-separated list of quantiles for survival time predictions. Provides percentile-based survival time estimates. 
- credible_level
- Level for posterior credible intervals on all estimates. 0.95 provides 95% credible intervals. 
- cross_validation
- Perform k-fold cross-validation to assess out-of-sample performance and model generalization. 
- cv_folds
- Number of folds for cross-validation. More folds provide better performance estimates but increase computation. 
- convergence_diagnostics
- Compute MCMC convergence diagnostics including effective sample sizes, autocorrelation, and potential scale reduction. 
- posterior_prediction
- Perform posterior predictive checks to assess model adequacy by comparing observed data to posterior predictive distributions. 
- show_model_summary
- Display comprehensive model summary including BART parameters, convergence diagnostics, and posterior summaries. 
- show_variable_importance
- Display variable importance measures based on selection frequencies and splitting criteria across all trees. 
- show_survival_summary
- Display survival function summaries including median survival times and survival probabilities at key time points. 
- show_convergence
- Display MCMC convergence diagnostics and chain mixing assessment for model validation. 
- plot_survival_curves
- Plot individual and population-level survival curves with posterior credible bands showing uncertainty. 
- plot_variable_importance
- Create variable importance plot showing inclusion probabilities and splitting frequencies across the ensemble. 
- plot_trace
- Generate trace plots for key model parameters to assess MCMC convergence and mixing. 
- plot_posterior_predictive
- Create posterior predictive check plots comparing observed survival patterns to model predictions. 
- plot_partial_dependence
- Generate partial dependence plots showing marginal effects of individual variables on survival hazard. 
- plot_interactions
- Visualize detected interactions between variables with strength and confidence measures. 
- probit_link
- Use probit link for binary outcomes in cure models. Alternative to logit link for mixture component modeling. 
- cure_fraction_prior
- Prior distribution for cure fraction in cure models. Uniform is non-informative, Jeffreys is scale-invariant. 
- adaptive_scaling
- Use adaptive scaling of MCMC proposals to optimize acceptance rates and improve sampling efficiency. 
- parallel_chains
- Run MCMC chains in parallel for faster computation when multiple chains are specified. 
- memory_optimization
- Use memory-efficient storage for large ensembles and datasets to reduce memory footprint. 
- random_seed
- Random seed for reproducible MCMC sampling and tree ensemble generation. 
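To make the effect of alpha and beta concrete, the sketch below tabulates the prior probability that a node splits at each depth. It assumes the standard BART tree prior, P(split at depth d) = alpha * (1 + d)^(-beta); whether survivalbart uses exactly this form internally is an assumption, and split_prob is a hypothetical helper, not part of the package.

```r
# Assumed standard BART tree prior (not confirmed for survivalbart):
#   P(split at depth d) = alpha * (1 + d)^(-beta)
# split_prob() is a hypothetical helper written for this illustration.
split_prob <- function(depth, alpha = 0.95, beta = 2) {
  alpha * (1 + depth)^(-beta)
}

depth <- 0:4
prior_table <- data.frame(
  depth             = depth,
  p_split_default   = split_prob(depth),               # alpha = 0.95, beta = 2
  p_split_low_alpha = split_prob(depth, alpha = 0.5),  # splits less likely at every depth
  p_split_high_beta = split_prob(depth, beta = 3)      # probability decays faster with depth
)

print(round(prior_table, 3))
```

Under this assumption, lowering alpha shrinks the split probability uniformly, while raising beta mainly penalizes splits deep in the tree.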
Value
A results object containing:
| results$instructions | an html |
| results$todo | an html |
| results$modelSummary | a table |
| results$variableImportance | a table |
| results$survivalSummary | a table |
| results$convergenceDiagnostics | a table |
| results$crossValidationResults | a table |
| results$posteriorPredictive | a table |
| results$interactionEffects | a table |
| results$survivalCurvesPlot | an image |
| results$variableImportancePlot | an image |
| results$tracePlot | an image |
| results$posteriorPredictivePlot | an image |
| results$partialDependencePlot | an image |
| results$interactionsPlot | an image |
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$modelSummary$asDF
as.data.frame(results$modelSummary)
Details
Bayesian Additive Regression Trees (BART) for survival analysis provide
flexible nonparametric modeling with automatic variable selection and
interaction detection. BART combines an ensemble of weak learners (trees)
with regularizing Bayesian priors, so the fitted model captures nonlinear
relationships and interactions without requiring preprocessing. The method
yields full posterior distributions for predictions, and therefore credible
intervals for individualized survival estimates. It is particularly suited
to survival data with unknown functional forms, high-dimensional
predictors, and mixed variable types. The implementation supports
accelerated failure time, proportional hazards, and cure model formulations
(see model_type), with posterior inference and model diagnostics.
Examples
result <- survivalbart(
    data = mydata,
    time = "time_to_event",
    event = "event_indicator",
    predictors = c("age", "stage", "biomarker1", "biomarker2"),
    model_type = "aft",
    n_trees = 200,
    n_burn = 1000,
    n_post = 2000
)
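The call above refers to a data frame mydata that is not defined on this page. The following is a minimal sketch of one way to simulate a data set with matching column names in base R. The distributions and rates are arbitrary illustrative choices, not values from the package, and the model is fitted only if survivalbart() is available in the session.

```r
set.seed(123)
n <- 200

# Simulated predictors; column names mirror the example call above
mydata <- data.frame(
  age        = round(rnorm(n, mean = 60, sd = 10)),
  stage      = factor(sample(c("I", "II", "III"), n, replace = TRUE)),
  biomarker1 = rnorm(n),
  biomarker2 = rexp(n)
)

# Latent event and censoring times (arbitrary illustrative rates)
event_time     <- rexp(n, rate = 0.10 * exp(0.5 * mydata$biomarker1))
censoring_time <- rexp(n, rate = 0.05)

# Observed time is the earlier of the two; 1 = event, 0 = censored
mydata$time_to_event   <- pmin(event_time, censoring_time)
mydata$event_indicator <- as.integer(event_time <= censoring_time)

# Build the comma-separated strings expected by predict_times and survival_quantiles
predict_times      <- paste(c(1, 2, 5), collapse = ", ")
survival_quantiles <- paste(c(0.25, 0.5, 0.75), collapse = ", ")

# Fit only when the function is available in the current session
if (exists("survivalbart")) {
  result <- survivalbart(
    data = mydata,
    time = "time_to_event",
    event = "event_indicator",
    predictors = c("age", "stage", "biomarker1", "biomarker2"),
    predict_times = predict_times,
    survival_quantiles = survival_quantiles,
    random_seed = 123
  )
}
```

Building the strings with paste() keeps the prediction times in a numeric vector that can be reused elsewhere in an analysis script.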