Simple Tumor Marker and Cancer Progression Data
Source:R/joint_modeling_data.R
simple_cancer_data.RdA simplified longitudinal dataset containing tumor marker measurements and cancer progression for 100 cancer patients. Ideal for teaching and initial exploration of joint models.
Format
A data frame with 588 observations and 6 variables:
- patient_id
Character. Unique patient identifier (CA_001 to CA_100)
- age
Numeric. Patient age at baseline (years)
- treatment
Factor. Treatment group (Standard, Experimental)
- visit_time
Numeric. Time of tumor marker measurement (months from baseline)
- tumor_marker
Numeric. Tumor marker level (units/mL)
- survival_time
Numeric. Time to progression/death or last follow-up (months)
- progression_status
Numeric. Event indicator (0 = censored, 1 = progression/death)
Details
This simplified dataset is perfect for:
Learning joint modeling concepts
Quick algorithm testing
Demonstrating treatment effects
Smaller sample size for faster computation
Clear biomarker-survival relationship
Features:
Tumor marker levels generally increase over time
Experimental treatment slows marker increase
Higher marker levels increase progression hazard
10% event rate over 24 months follow-up
Examples
data(simple_cancer_data)
# Basic joint modeling analysis
library(ggplot2)
# Marker trajectories by treatment
ggplot(simple_cancer_data, aes(x = visit_time, y = tumor_marker, color = treatment)) +
geom_smooth(method = "loess") +
labs(title = "Tumor Marker by Treatment",
x = "Time (months)", y = "Tumor Marker")
# Individual trajectories for first 20 patients
first_20 <- subset(simple_cancer_data, patient_id %in% unique(patient_id)[1:20])
ggplot(first_20, aes(x = visit_time, y = tumor_marker, group = patient_id)) +
geom_line(alpha = 0.7) + geom_point(alpha = 0.5) +
facet_wrap(~treatment) +
labs(title = "Individual Tumor Marker Trajectories")