
Four simulated datasets for demonstrating and testing the features of the Advanced Raincloud Plot module. They model a randomized clinical trial with several outcome measures and missing-data patterns.

Usage

advancedraincloud_data

advancedraincloud_baseline

advancedraincloud_endpoint

advancedraincloud_change

Format

advancedraincloud_data

Longitudinal dataset with 900 rows (300 patients × 3 visits) and 15 variables

advancedraincloud_baseline

Cross-sectional baseline data with 300 rows and 13 variables

advancedraincloud_endpoint

Cross-sectional endpoint data with 300 rows and 13 variables

advancedraincloud_change

Change score data with 130 complete cases and 6 variables

Each dataset is an object of class tbl_df (inherits from tbl, data.frame).

Source

Generated using the script in data-raw/create_advancedraincloud_testdata.R

Variables

Patient Identifiers:

patient_id

Unique patient identifier (PT001-PT300)

treatment_arm

Treatment group - factor with levels "Placebo", "Drug A"

time_point

Visit timepoint - factor with levels "Baseline", "Week 4", "Week 12"

visit_number

Numeric visit number (1-3)

Demographics:

age

Patient age in years (normally distributed, mean=55, sd=12)

gender

Patient gender - factor with levels "Female", "Male"

disease_stage

Disease stage - factor with levels "Early", "Advanced"

biomarker_status

Biomarker status - factor with levels "Negative", "Positive"

age_group

Age stratification - factor with levels "< 65 years", "≥ 65 years"

baseline_biomarker_high

Baseline biomarker level - factor with levels "High", "Normal"

Primary Outcomes:

tumor_size_change

Percent change in tumor size from baseline (negative = shrinkage)

biomarker_level

Biomarker concentration (log-normally distributed, ~90 units baseline)

qol_score

Quality of life score (0-100 scale, higher = better)

pain_score

Pain intensity score - ordered factor (0-10 scale, higher = worse pain)

Derived Variables:

tumor_responder

Response classification - factor with levels:

  • "Progressive Disease" (>10% increase)

  • "Stable Disease" (-10% to +10%)

  • "Partial Response" (-30% to -10%)

  • "Complete Response" (≤-30%)
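The thresholds above can be expressed as a single `cut()` call. This is a minimal sketch, not the code from the generating script: the helper name `classify_response` is hypothetical, and the treatment of the shared boundary at -10% (counted here as a partial response) is an assumption, since the documented ranges overlap at that value.

```r
# Hypothetical helper: classify percent change in tumor size into the
# four documented response categories.
classify_response <- function(pct_change) {
  labels <- cut(
    pct_change,
    breaks = c(-Inf, -30, -10, 10, Inf),
    labels = c("Complete Response", "Partial Response",
               "Stable Disease", "Progressive Disease"),
    right = TRUE  # intervals are (lower, upper]; -10 counts as partial (assumption)
  )
  # Reorder the levels to match the order listed in this documentation
  factor(labels, levels = c("Progressive Disease", "Stable Disease",
                            "Partial Response", "Complete Response"))
}

# Missing values stay missing
classify_response(c(-45, -30, -20, -10, 0, 10, 15, NA))
```

With `right = TRUE`, a change of exactly -30 falls in the complete-response interval and exactly +10 in the stable interval, which agrees with the "≤-30%" and ">10% increase" wording above.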

Change Score Variables (advancedraincloud_change only):

tumor_change

Change in tumor size (Week 12 - Baseline)

biomarker_change

Change in biomarker (Baseline - Week 12, positive = reduction)

qol_change

Change in quality of life (Week 12 - Baseline)

pain_change

Change in pain score (Baseline - Week 12, positive = improvement)
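The sign conventions above can be reproduced from long-format data with base R. The sketch below uses a small made-up data frame with the documented column names. It is not the generating script, and restricting to rows with no missing change score is an assumption based on the "complete cases" description. Because pain_score is an ordered factor, it is converted to numeric before subtraction.

```r
# Toy long-format data (values are invented for illustration)
toy <- data.frame(
  patient_id = rep(c("PT001", "PT002", "PT003"), each = 2),
  time_point = rep(c("Baseline", "Week 12"), times = 3),
  biomarker_level = c(90, 60, 85, 88, 95, NA),
  qol_score = c(60, 70, 55, 52, 65, 66),
  pain_score = factor(c(6, 4, 5, 5, 7, 6), levels = 0:10, ordered = TRUE)
)

# Pair each patient's Baseline row with their Week 12 row
baseline <- toy[toy$time_point == "Baseline", ]
week12 <- toy[toy$time_point == "Week 12", ]
wide <- merge(baseline, week12, by = "patient_id", suffixes = c("_bl", "_w12"))

# Convert an ordered factor with numeric labels back to numbers
to_num <- function(f) as.numeric(as.character(f))

change <- data.frame(
  patient_id = wide$patient_id,
  # Baseline - Week 12: positive = reduction
  biomarker_change = wide$biomarker_level_bl - wide$biomarker_level_w12,
  # Week 12 - Baseline: positive = improvement
  qol_change = wide$qol_score_w12 - wide$qol_score_bl,
  # Baseline - Week 12: positive = improvement
  pain_change = to_num(wide$pain_score_bl) - to_num(wide$pain_score_w12)
)

# Keep only patients with every change score available
change <- change[complete.cases(change), ]
change
```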

Clinical Trial Design

This simulates a randomized, placebo-controlled trial with:

  • N = 300 patients (150 per arm)

  • 3 timepoints: Baseline, Week 4, Week 12

  • Primary endpoint: Tumor size reduction

  • Secondary endpoints: Biomarker levels, quality of life, pain scores

  • Realistic dropout: ~15% overall, higher in the placebo group

  • Missing data: Increases over time, varies by treatment

Treatment Effects

The simulated treatment effects are:

Placebo

Slight tumor progression (+5% at Week 4, +8% at Week 12)

Drug A

Tumor regression (-15% at Week 4, -25% at Week 12) with 30% non-responders

Biomarkers

Drug A reduces levels by ~40%; Placebo shows a slight increase

Quality of Life

Drug A improves scores (+8 points); Placebo shows a slight decline (-2 points)

Pain Scores

Drug A reduces pain (-1.5 points); Placebo shows a slight increase (+0.5 points)
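To make the tumor effects concrete, the following sketch simulates Week 12 tumor change for 150 patients per arm. It is an illustration only and does not reproduce the shipped data or the script in data-raw. Two details are assumptions not stated in this documentation: the between-patient standard deviation of 10 percentage points, and non-responders following the placebo trajectory.

```r
set.seed(42)  # documented seed value; the draws will not match the shipped data

n_per_arm <- 150
arm <- rep(c("Placebo", "Drug A"), each = n_per_arm)

# Assumption: about 30% of Drug A patients are non-responders who follow
# the placebo trajectory
non_responder <- arm == "Drug A" & runif(2 * n_per_arm) < 0.30

# Documented mean percent change at Week 12: +8 (Placebo), -25 (Drug A)
mean_w12 <- ifelse(arm == "Placebo" | non_responder, 8, -25)

# Assumption: between-patient SD of 10 percentage points
tumor_w12 <- rnorm(2 * n_per_arm, mean = mean_w12, sd = 10)

sim <- data.frame(
  treatment_arm = factor(arm, levels = c("Placebo", "Drug A")),
  non_responder = non_responder,
  tumor_size_change = tumor_w12
)

# Arm-level means; the Drug A mean is pulled toward zero by the non-responders
aggregate(tumor_size_change ~ treatment_arm, data = sim, FUN = mean)
```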

Testing Features

These datasets are designed to test all Advanced Raincloud Plot features:

Clinical Significance:

  • Clinical cutoffs (e.g., tumor_size_change = -30 for response)

  • Reference ranges (e.g., biomarker_level: 10-50 normal range)

  • MCID values (e.g., qol_score MCID = 10 points)

Effect Size Analysis:

  • Between-group comparisons (Drug A vs Placebo)

  • Cohen's d, Hedges' g, Glass's delta calculations

  • Multiple timepoint comparisons
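For reference, the three effect sizes named above can be computed by hand with the standard textbook formulas. The function names below are hypothetical and the module's own implementation may differ in details such as missing-data handling. Treat this as a sketch for cross-checking results.

```r
# Cohen's d: mean difference divided by the pooled standard deviation
cohens_d <- function(x, y) {
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Hedges' g: Cohen's d with the usual small-sample bias correction
hedges_g <- function(x, y) {
  n <- sum(!is.na(x)) + sum(!is.na(y))
  cohens_d(x, y) * (1 - 3 / (4 * n - 9))
}

# Glass's delta: mean difference scaled by the control group's SD only
glass_delta <- function(x, control) {
  (mean(x, na.rm = TRUE) - mean(control, na.rm = TRUE)) / sd(control, na.rm = TRUE)
}

# Made-up Week 12 tumor changes for a handful of patients
drug <- c(-28, -31, -12, -35, NA, -22)
placebo <- c(6, 10, 3, 9, 12, 7)

cohens_d(drug, placebo)
hedges_g(drug, placebo)
glass_delta(drug, control = placebo)
```

Glass's delta is the natural choice when the treatment is expected to change the variance as well as the mean, as with a mixture of responders and non-responders in the Drug A arm.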

Change Score Analysis:

  • Longitudinal change from baseline

  • Responder classifications (20% threshold default)

  • Missing data handling

Biomarker Features:

  • Log-normal distribution requiring transformation

  • Outliers (10 extreme values) for outlier handling tests

  • CV bands for assay variability assessment
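The preprocessing steps these biomarker features exercise can be sketched in base R. The helpers `winsorize` and `cv_percent` are hypothetical names, and the 5th/95th percentile limits are an assumption. The module's "winsorize" outlier method may use different limits.

```r
# Clamp values outside the chosen quantiles to the quantile limits
winsorize <- function(x, probs = c(0.05, 0.95)) {
  limits <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, limits[1]), limits[2])
}

# Coefficient of variation as a percentage
cv_percent <- function(x) {
  100 * sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)
}

# Made-up biomarker values with one extreme outlier and one missing value
biomarker <- c(62, 75, 81, 88, 90, 94, 103, 110, 127, 900, NA)

# A log transform pulls in the right tail of a log-normal variable
log_biomarker <- log(biomarker)

# Winsorizing caps the outlier while keeping the observation (NA stays NA)
biomarker_w <- winsorize(biomarker)

cv_percent(biomarker)
cv_percent(biomarker_w)
```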

Publication Features:

  • Sample size annotations

  • Missing data reporting

  • Statistical comparisons

  • Journal-specific formatting

Usage Examples


# Basic raincloud plot with clinical cutoff
advancedraincloud(
  data = advancedraincloud_baseline,
  y_var = "biomarker_level",
  x_var = "treatment_arm",
  clinical_cutoff = 100,
  show_sample_size = TRUE
)

# Longitudinal analysis with change scores
advancedraincloud(
  data = advancedraincloud_data,
  y_var = "tumor_size_change",
  x_var = "time_point",
  id_var = "patient_id",
  fill_var = "treatment_arm",
  show_longitudinal = TRUE,
  show_change_scores = TRUE,
  baseline_group = "Baseline",
  clinical_cutoff = -30,
  show_effect_size = TRUE
)

# Biomarker analysis with log transformation
advancedraincloud(
  data = advancedraincloud_endpoint,
  y_var = "biomarker_level",
  x_var = "treatment_arm",
  log_transform = TRUE,
  outlier_method = "winsorize",
  reference_range_min = 10,
  reference_range_max = 50,
  show_cv_bands = TRUE,
  journal_style = "nature"
)

# Quality of life with MCID analysis
advancedraincloud(
  data = advancedraincloud_data,
  y_var = "qol_score",
  x_var = "time_point",
  fill_var = "treatment_arm",
  show_mcid = TRUE,
  mcid_value = 10,
  show_change_scores = TRUE,
  generate_report = TRUE
)

# Likert scale analysis for pain scores
advancedraincloud(
  data = advancedraincloud_data,
  y_var = "pain_score",
  x_var = "treatment_arm",
  likert_mode = TRUE,
  show_comparisons = TRUE,
  p_value_position = "above"
)

Data Generation

These datasets were generated using realistic assumptions:

  • Reproducible random seed (42) for consistent results

  • Clinically plausible effect sizes and variability

  • Realistic missing data patterns typical of clinical trials

  • Appropriate correlation structures for repeated measures

  • Standard clinical trial demographic distributions

See also

  • advancedraincloud for the analysis function

  • vignette("advancedraincloud_examples") for detailed examples

Examples

# Load the datasets
data("advancedraincloud_data")
data("advancedraincloud_baseline") 
data("advancedraincloud_endpoint")
data("advancedraincloud_change")

# Explore the structure
str(advancedraincloud_data)
summary(advancedraincloud_baseline)

# Check missing data patterns
table(is.na(advancedraincloud_data$tumor_size_change), 
      advancedraincloud_data$time_point,
      advancedraincloud_data$treatment_arm)

# View response rates by treatment
with(advancedraincloud_endpoint, 
     table(tumor_responder, treatment_arm, useNA = "ifany"))

# Check change score completeness
nrow(advancedraincloud_change)  # Complete cases for change analysis