Format
rpasurvival_test: Standard dataset with 200 observations and 11 variables:
- patient_id
Character. Patient identifier (PT001-PT200)
- time
Numeric. Survival time in months (range: 0.5-120, mean ~36)
- event
Factor. Event indicator (0 = censored, 1 = death/event). Event rate ~65%
- age
Numeric. Patient age in years (40-85, mean ~65)
- stage
Ordered factor. Tumor stage (I, II, III, IV)
- grade
Ordered factor. Tumor grade (G1, G2, G3)
- LVI
Factor. Lymphovascular invasion (Absent, Present)
- tumor_size
Numeric. Tumor size in centimeters (0.5-10)
- ki67
Numeric. Ki-67 proliferation index, percentage (0-100). ~3% missing
- performance_status
Ordered factor. ECOG performance status (0, 1, 2)
- treatment
Factor. Treatment modality (Surgery only, Surgery + Chemo, Surgery + Radio, Trimodal)
rpasurvival_small: Minimal dataset with 50 observations and 6 variables:
- patient_id
Character. Patient identifier (SM01-SM50)
- time
Numeric. Survival time in months
- event
Factor. Event indicator (0, 1)
- age
Numeric. Patient age in years
- stage
Factor. Tumor stage (Early, Advanced)
- grade
Factor. Tumor grade (Low, High)
rpasurvival_large: Large dataset with 500 observations and 12 variables:
- patient_id
Character. Patient identifier (LG0001-LG0500)
- time
Numeric. Survival time in months
- event
Factor. Event indicator (0, 1). Event rate ~70%
- age
Numeric. Patient age in years
- stage
Ordered factor. Detailed tumor stage (IA, IB, IIA, IIB, IIIA, IIIB, IV)
- grade
Ordered factor. Tumor grade (1, 2, 3)
- LVI
Factor. Lymphovascular invasion (No, Yes)
- PNI
Factor. Perineural invasion (No, Yes)
- tumor_size
Numeric. Tumor size in centimeters
- nodes_positive
Numeric. Number of positive lymph nodes
- biomarker1
Numeric. Continuous biomarker 1
- biomarker2
Numeric. Continuous biomarker 2
Edge case datasets (for testing different event/time coding):
Generated synthetically using data-raw/rpasurvival_test_data.R. Seed: 12345. Generation date: 2026-01-31.
Synthetic datasets for testing and demonstrating the rpasurvival function (Recursive Partitioning Analysis for Survival Data). These datasets were generated using a seeded random number generator to produce realistic survival data with the following characteristics:
Survival times follow an exponential distribution
Event rates are clinically realistic (60-70%)
Prognostic correlations built in (Stage IV → shorter survival)
Missing data pattern (~3%)
Events-per-variable (EPV) ratio > 10 for all datasets
Non-negative survival times
Proper factor level ordering (ordinal variables)
Realistic clinical distributions
Sufficient sample sizes for RPA analysis
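The characteristics listed above can be illustrated with a short base R simulation. This is only a sketch under stated assumptions: it is not the contents of data-raw/rpasurvival_test_data.R, and the stage probabilities, hazard rates, and censoring rate are made-up values chosen for illustration. Only the seed (12345), the sample size (200), and the 120-month upper bound come from the documentation.

```r
# Hypothetical sketch: NOT the actual generation script.
set.seed(12345)  # seed quoted in the documentation
n <- 200

# Ordered tumor stage; sampling probabilities are illustrative assumptions.
stage <- factor(
  sample(c("I", "II", "III", "IV"), n, replace = TRUE,
         prob = c(0.25, 0.30, 0.25, 0.20)),
  levels = c("I", "II", "III", "IV"), ordered = TRUE
)

# Exponential event times whose hazard rises with stage (assumed rates),
# so Stage IV has the shortest expected survival.
hazard <- c(I = 1 / 80, II = 1 / 50, III = 1 / 30, IV = 1 / 15)
event_time <- rexp(n, rate = hazard[as.character(stage)])

# Independent exponential censoring with an administrative cutoff at 120 months.
censor_time <- pmin(rexp(n, rate = 1 / 90), 120)

# Observed time is whichever comes first; event = 1 if death preceded censoring.
time  <- pmin(event_time, censor_time)
event <- factor(as.integer(event_time <= censor_time), levels = c(0, 1))

sim <- data.frame(time, event, stage)

# Observed event rate, to compare against the intended 60-70% range.
event_rate <- mean(sim$event == "1")
```

The resulting event rate depends on the assumed hazards and censoring rate, so in practice those values would be tuned until the rate falls in the intended range.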
RDA: Native R format (use data())
CSV: Comma-separated values
XLSX: Excel format
OMV: jamovi native format
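One practical difference between these formats is that the native R format preserves ordered factors while CSV does not. The sketch below demonstrates this with a made-up three-row data frame written to temporary files; the data values and file paths are illustrative and do not refer to the packaged files.

```r
# Toy data frame standing in for one of the datasets (values are made up).
toy <- data.frame(
  patient_id = c("PT001", "PT002", "PT003"),
  time = c(12.5, 40.0, 7.2),
  stage = factor(c("II", "IV", "I"),
                 levels = c("I", "II", "III", "IV"), ordered = TRUE),
  stringsAsFactors = FALSE
)

# Write the same data in native R format and as CSV.
rda_path <- tempfile(fileext = ".rda")
csv_path <- tempfile(fileext = ".csv")
save(toy, file = rda_path)
write.csv(toy, csv_path, row.names = FALSE)

# Load the RDA copy into its own environment to avoid overwriting `toy`.
restored <- new.env()
load(rda_path, envir = restored)
from_rda <- restored$toy

from_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

is.ordered(from_rda$stage)    # the RDA copy keeps the ordered factor
is.character(from_csv$stage)  # the CSV copy comes back as plain text

# Restore the level ordering explicitly after a CSV import.
from_csv$stage <- factor(from_csv$stage,
                         levels = c("I", "II", "III", "IV"), ordered = TRUE)
```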
Usage Examples
See vignette("rpasurvival-examples") for comprehensive examples.
Basic usage:
    library(ClinicoPath)
    data(rpasurvival_test)

    # Standard RPA analysis
    rpasurvival(
      data = rpasurvival_test,
      time = "time",
      event = "event",
      predictors = c("age", "stage", "grade", "LVI"),
      time_unit = "months"
    )

    # Test small sample warnings
    data(rpasurvival_small)
    rpasurvival(
      data = rpasurvival_small,
      time = "time",
      event = "event",
      predictors = c("stage", "grade")
    )

    # Test different event coding
    data(rpasurvival_edge_truefalse)
    rpasurvival(
      data = rpasurvival_edge_truefalse,
      time = "time",
      event = "event_tf",
      predictors = c("stage", "grade"),
      eventValue = "TRUE"
    )
Testing Scenarios
The datasets support testing of:
Standard analysis: Use rpasurvival_test with 4-6 predictors
Small samples: Use rpasurvival_small, expect warnings
Complex trees: Use rpasurvival_large with maxdepth=5
Event coding: Test TRUE/FALSE and 1/2 coding schemes
Time units: Test days, months, years with time_unit parameter
Missing data: Verify handling of ~3% missing values
Mixed predictors: Continuous, ordinal, and nominal variables
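For the event-coding and time-unit scenarios, it can help to see what normalizing these columns by hand looks like. The helpers below are hypothetical and not part of ClinicoPath; they only illustrate the idea of mapping any coding scheme to 0/1 given the value that marks an event, and of converting days to months with an assumed average month length of 365.25 / 12 days.

```r
# Hypothetical helper (not a ClinicoPath function): map an event column to 0/1,
# where `event_value` is the value that marks an event. NA stays NA.
normalize_event <- function(x, event_value) {
  as.integer(as.character(x) == as.character(event_value))
}

# TRUE/FALSE coding, with TRUE marking the event.
tf_coded <- normalize_event(c(TRUE, FALSE, TRUE), "TRUE")

# 1/2 coding, with 2 marking the event.
one_two_coded <- normalize_event(c(1, 2, 2, 1), 2)

# Factor with levels "0"/"1", with "1" marking the event.
factor_coded <- normalize_event(factor(c("0", "1", "1")), "1")

# Hypothetical helper: convert days to months, assuming 365.25 / 12 days per month.
days_to_months <- function(days) days / (365.25 / 12)

months <- days_to_months(c(365.25, 730.5))
```

Comparing on character representations keeps one code path for logical, numeric, and factor inputs, at the cost of requiring the event value to match the printed form exactly.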
Validation
All datasets have been validated for:
Non-negative survival times
Appropriate event rates
Stage-survival correlation (higher stage → worse prognosis)
Sufficient EPV (events per variable > 10)
Realistic clinical distributions
Proper factor level ordering
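A few of the checks listed above could be scripted roughly as follows. The function name and its arguments are hypothetical, not part of ClinicoPath, and the toy data are made up; the sketch assumes the event column uses "1" for events, as in the dataset descriptions.

```r
# Hypothetical checker (not a ClinicoPath function).
check_survival_data <- function(df, n_predictors, ordered_cols = character()) {
  n_events <- sum(as.character(df$event) == "1", na.rm = TRUE)
  list(
    nonnegative_times = all(df$time >= 0, na.rm = TRUE),
    event_rate        = n_events / sum(!is.na(df$event)),
    epv               = n_events / n_predictors,  # events per variable
    ordered_ok        = all(vapply(df[ordered_cols], is.ordered, logical(1)))
  )
}

# Made-up toy data: six patients, four events.
toy <- data.frame(
  time  = c(5, 12, 30, 48, 60, 2),
  event = factor(c(1, 1, 0, 0, 1, 1), levels = c(0, 1)),
  stage = factor(c("IV", "III", "II", "I", "I", "IV"),
                 levels = c("I", "II", "III", "IV"), ordered = TRUE)
)

res <- check_survival_data(toy, n_predictors = 2, ordered_cols = "stage")

# Crude look at the stage-survival relationship: median observed time per stage.
# This ignores censoring; a Kaplan-Meier estimate would be the proper check.
median_by_stage <- tapply(toy$time, toy$stage, median)
```

With only six rows the EPV here is far below 10; the toy data exist to exercise the function, not to pass the checks.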
    # Load standard test data
    data(rpasurvival_test)

    # Examine structure
    str(rpasurvival_test)

    # Summary statistics
    summary(rpasurvival_test)

    # Check event rate
    table(rpasurvival_test$event)
    prop.table(table(rpasurvival_test$event))

    # Check stage distribution
    table(rpasurvival_test$stage)

    # Basic RPA analysis
Liu Y, et al. (2026). Recursive partitioning analysis for survival data.
rpasurvival for the main analysis function
vignette("rpasurvival-examples") for comprehensive usage examples