Simulated dataset for a prospective cohort study (non-randomized observational design) with multiple exclusion stages. Designed for testing the studydiagram function and demonstrating participant flow in observational research.

Usage

observational_study_flow_data

Format

A data frame with 350 rows and 9 columns:

participant_id

Unique participant identifier (OBS-0001 to OBS-0350)

enrollment_date

Date of enrollment (2023-01-01 onwards)

age

Age in years (mean=58, sd=15)

diagnosis

Disease stage (Stage I/II/III/IV)

initial_exclusion

Initial screening exclusion reasons

consent_status

Consent/enrollment exclusion reasons

followup_status

Follow-up completion status and reasons for loss

in_final_analysis

Logical flag indicating inclusion in final analysis

study_group

Observational cohort assignment (Surgery/Radiation/Combination)
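
A quick structural check can confirm that a data frame carries the columns documented above. The sketch below is a minimal example and not part of the package: `missing_flow_columns()` is a hypothetical helper, and `toy` is a small hand-made stand-in rather than the real dataset.

```r
# Column names as documented in the Format section
expected_cols <- c(
  "participant_id", "enrollment_date", "age", "diagnosis",
  "initial_exclusion", "consent_status", "followup_status",
  "in_final_analysis", "study_group"
)

# Hypothetical helper: returns the documented columns absent from `df`
missing_flow_columns <- function(df) {
  setdiff(expected_cols, names(df))
}

# Toy stand-in (NOT the real data) carrying only three of the columns
toy <- data.frame(
  participant_id = sprintf("OBS-%04d", 1:3),
  age = c(61, 47, 70),
  in_final_analysis = c(TRUE, FALSE, TRUE)
)

# Lists the six documented columns that `toy` lacks
missing_flow_columns(toy)
```

With the package loaded, `missing_flow_columns(observational_study_flow_data)` should return `character(0)` if the data match the description on this page.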

Source

Generated using data-raw/create_clinical_trial_flow_data.R (seed: 20251005)

Details

This dataset simulates a realistic observational cohort study with:

  • 350 participants initially enrolled

  • 15%

  • 8%

  • 10%

  • Final retention: 70.6%

Unlike the RCT dataset (clinical_trial_consort_data), this represents observational allocation to treatment groups (not randomized).
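
The retention figure can be cross-checked from `in_final_analysis`, since retention is the share of enrolled participants flagged for the final analysis. A minimal sketch follows. `retention_pct()` is a hypothetical helper, and the flag vector is built for illustration on the assumption that 70.6% of 350 corresponds to 247 retained participants (247/350 ≈ 0.7057).

```r
# Hypothetical helper: percentage of participants retained in the final analysis
retention_pct <- function(flag) {
  round(100 * mean(flag), 1)
}

# Illustrative flag vector (NOT the real data): 247 TRUE out of 350
flag <- c(rep(TRUE, 247), rep(FALSE, 103))

retention_pct(flag)
```

On the real data, the equivalent call would be `retention_pct(observational_study_flow_data$in_final_analysis)`.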

Usage

This dataset demonstrates:

  • Non-randomized study flow visualization

  • Observational cohort allocation

  • Temporal enrollment patterns (enrollment_date)

  • Multiple exclusion pathways without randomization
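
Flow through several exclusion stages can be tabulated by counting who remains after each stage. The sketch below is illustrative only: `flow_counts()` is a hypothetical helper, the toy data are hand-made, and it assumes that `NA` in a stage column means the participant was not excluded at that stage. The actual coding of these columns may differ, so inspect each one with `table(..., useNA = "ifany")` before relying on this.

```r
# Hypothetical helper: number of participants remaining after each stage.
# Assumption: NA in a stage column means "not excluded at this stage".
flow_counts <- function(df, stage_cols) {
  remaining <- rep(TRUE, nrow(df))
  counts <- c(enrolled = nrow(df))
  for (col in stage_cols) {
    remaining <- remaining & is.na(df[[col]])
    counts[paste0("after_", col)] <- sum(remaining)
  }
  counts
}

# Toy stand-in (NOT the real data); the exclusion reasons are made up
toy <- data.frame(
  initial_exclusion = c(NA, "Ineligible", NA, NA, NA),
  consent_status    = c(NA, NA, "Declined", NA, NA),
  followup_status   = c(NA, NA, NA, "Lost to follow-up", NA)
)

flow_counts(toy, c("initial_exclusion", "consent_status", "followup_status"))
```

Applying the stages cumulatively, rather than counting each column on its own, keeps a participant from being counted as lost at more than one stage.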

Examples

if (FALSE) { # \dontrun{
# Load data
data(observational_study_flow_data)

# View structure
str(observational_study_flow_data)

# Enrollment over time
table(format(observational_study_flow_data$enrollment_date, "%Y-%m"))

# Final analysis by study group
table(observational_study_flow_data$study_group,
      observational_study_flow_data$in_final_analysis)
} # }