A dataset for testing and developing predictive models, particularly for cardiovascular events. It contains patient demographics, clinical risk factors, lab values, and an outcome variable.

Usage

data(modelbuilder_test_data)

Format

A data frame with 600 rows and 16 variables:

patient_id

Character. Unique patient identifier.

hospital

Character. Hospital or study center identifier.

age

Numeric. Patient's age in years.

sex

Character. Patient's sex (e.g., "Male", "Female").

diabetes

Character. Diabetes status (e.g., "Yes", "No").

hypertension

Character. Hypertension status (e.g., "Yes", "No").

smoking

Character. Smoking status (e.g., "Never", "Current", "Former").

cholesterol

Numeric. Total cholesterol level.

bmi

Numeric. Body Mass Index.

systolic_bp

Numeric. Systolic blood pressure.

family_history

Character. Family history of cardiovascular disease (e.g., "Yes", "No").

troponin

Numeric. Cardiac troponin level. May contain missing values (NA).

creatinine

Numeric. Serum creatinine level.

cardiovascular_event

Factor with levels "No" and "Yes". Outcome variable indicating whether a cardiovascular event occurred. May contain missing values (NA).

true_risk

Numeric. A simulated true underlying risk score for the patient. May contain missing values (NA).

risk_category

Factor with 4 levels ("Low", "Moderate", "High", "Very High"). A pre-calculated risk category based on certain criteria. May contain missing values (NA).

Examples

data(modelbuilder_test_data)
str(modelbuilder_test_data)
#> 'data.frame':	600 obs. of  16 variables:
#>  $ patient_id          : chr  "PT0001" "PT0002" "PT0003" "PT0004" ...
#>  $ hospital            : chr  "University Medical Center" "General Hospital" "Community Hospital" "Community Hospital" ...
#>  $ age                 : num  81 58 69 73 70 64 83 64 85 64 ...
#>  $ sex                 : chr  "Male" "Male" "Female" "Male" ...
#>  $ diabetes            : chr  "Yes" "No" "No" "Yes" ...
#>  $ hypertension        : chr  "No" "No" "No" "Yes" ...
#>  $ smoking             : chr  "Never" "Current" "Never" "Former" ...
#>  $ cholesterol         : num  167 194 185 186 129 205 243 250 283 219 ...
#>  $ bmi                 : num  20.2 28.2 21.3 25.1 21.8 34.3 29.6 28.1 30.6 32.2 ...
#>  $ systolic_bp         : num  132 121 134 131 130 170 107 148 132 130 ...
#>  $ family_history      : chr  "No" "Yes" "No" "Yes" ...
#>  $ troponin            : num  7.36 2.32 3.43 2.41 NA 3.66 3.12 2.68 2.5 3.39 ...
#>  $ creatinine          : num  1.19 1.19 1.09 1.29 1.15 0.93 1.32 0.9 1.07 0.96 ...
#>  $ cardiovascular_event: Factor w/ 2 levels "No","Yes": 1 2 1 2 NA 2 2 2 2 1 ...
#>  $ true_risk           : num  0.823 0.901 0.212 0.976 NA 0.875 0.764 0.792 0.987 0.811 ...
#>  $ risk_category       : Factor w/ 4 levels "Low","Moderate",..: 4 4 3 4 NA 4 4 4 4 4 ...
head(modelbuilder_test_data)
#>   patient_id                  hospital age    sex diabetes hypertension smoking
#> 1     PT0001 University Medical Center  81   Male      Yes           No   Never
#> 2     PT0002          General Hospital  58   Male       No           No Current
#> 3     PT0003        Community Hospital  69 Female       No           No   Never
#> 4     PT0004        Community Hospital  73   Male      Yes          Yes  Former
#> 5     PT0005 University Medical Center  70 Female       No          Yes   Never
#> 6     PT0006        Community Hospital  64 Female       No          Yes  Former
#>   cholesterol  bmi systolic_bp family_history troponin creatinine
#> 1         167 20.2         132             No     7.36       1.19
#> 2         194 28.2         121            Yes     2.32       1.19
#> 3         185 21.3         134             No     3.43       1.09
#> 4         186 25.1         131            Yes     2.41       1.29
#> 5         129 21.8         130             No       NA       1.15
#> 6         205 34.3         170            Yes     3.66       0.93
#>   cardiovascular_event true_risk risk_category
#> 1                   No     0.823     Very High
#> 2                  Yes     0.901     Very High
#> 3                   No     0.212          High
#> 4                  Yes     0.976     Very High
#> 5                 <NA>        NA          <NA>
#> 6                  Yes     0.875     Very High
summary(modelbuilder_test_data$bmi)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   18.00   23.80   26.70   26.72   29.40   40.20 
table(modelbuilder_test_data$cardiovascular_event)
#> 
#>  No Yes 
#> 169 358
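
The outcome counts above sum to 527, not 600, so 73 rows have a missing cardiovascular_event, and table() drops them by default. The sketch below shows one way to surface and filter such rows before modelling. It runs on a small stand-in data frame (named toy here, a hypothetical name) copied from the first six rows printed by head() above, so it does not need the package installed; the same calls should apply to modelbuilder_test_data, though that is an assumption and was not run against the full dataset.

```r
# Stand-in data: selected columns from the first six rows shown by head().
# 'toy' is a hypothetical name used for illustration only.
toy <- data.frame(
  patient_id = c("PT0001", "PT0002", "PT0003", "PT0004", "PT0005", "PT0006"),
  age = c(81, 58, 69, 73, 70, 64),
  troponin = c(7.36, 2.32, 3.43, 2.41, NA, 3.66),
  cardiovascular_event = factor(
    c("No", "Yes", "No", "Yes", NA, "Yes"),
    levels = c("No", "Yes")
  ),
  stringsAsFactors = FALSE
)

# Count outcome values, keeping missing values as their own category
outcome_counts <- table(toy$cardiovascular_event, useNA = "ifany")
print(outcome_counts)

# Keep only rows where both the predictor and the outcome are observed
model_cols <- c("troponin", "cardiovascular_event")
complete_rows <- toy[complete.cases(toy[, model_cols]), ]
print(nrow(complete_rows))
```

Whether to drop incomplete rows or impute them depends on the analysis; complete-case filtering is shown only because it is the simplest option.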