
A simulated dataset containing patient demographics and levels of various cancer biomarkers, along with cancer status and stage. Useful for evaluating diagnostic or prognostic performance of biomarkers.

Usage

data(cancer_biomarker_data)

Format

A data frame with 500 rows and 11 variables:

patient_id

Integer. Unique patient identifier.

age

Integer. Patient's age in years.

age_group

Character. Age group of the patient (e.g., "Young", "Middle", "Older").

sex

Character. Sex of the patient (e.g., "Male", "Female").

ca125

Numeric. Level of Cancer Antigen 125 biomarker. Contains missing values (10 NA's in the summary under Examples).

he4

Numeric. Level of Human Epididymis Protein 4 biomarker.

cea

Numeric. Level of Carcinoembryonic Antigen biomarker.

ca199

Numeric. Level of Carbohydrate Antigen 19-9 biomarker.

roma_score

Numeric. Risk of Ovarian Malignancy Algorithm score.

cancer_status

Character. Diagnosis of cancer ("Cancer" or "No Cancer").

stage

Character. Cancer stage if applicable (e.g., "Early", "Late", or "N/A" when no stage applies).
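Because the grouping columns are stored as character, it can help to convert them to factors before tabulating or modelling. The following is a minimal sketch, not part of the package: `toy` is a hypothetical stand-in built from the first six rows printed under Examples, and the level orderings are assumptions based only on the values visible in that output.

```r
# Hypothetical stand-in for cancer_biomarker_data, copied from the first
# six rows shown by head(); with the package attached you would call
# data(cancer_biomarker_data) instead.
toy <- data.frame(
  patient_id    = 1:6,
  age           = c(63L, 34L, 30L, 54L, 65L, 57L),
  age_group     = c("Older", "Young", "Young", "Middle", "Older", "Older"),
  sex           = c("Female", "Female", "Male", "Male", "Male", "Female"),
  cancer_status = c("Cancer", "Cancer", "No Cancer",
                    "Cancer", "No Cancer", "No Cancer"),
  stage         = c("Early", "Late", "N/A", "Late", "N/A", "N/A"),
  stringsAsFactors = FALSE
)

# Assumed orderings: only these labels appear in the printed rows.
toy$age_group <- factor(toy$age_group,
                        levels = c("Young", "Middle", "Older"),
                        ordered = TRUE)
toy$cancer_status <- factor(toy$cancer_status,
                            levels = c("No Cancer", "Cancer"))

# Recode the literal "N/A" label as a real missing value before
# turning stage into an ordered factor.
toy$stage[toy$stage == "N/A"] <- NA
toy$stage <- factor(toy$stage, levels = c("Early", "Late"), ordered = TRUE)

# Cross-tabulate status by stage, keeping the missing-stage column.
table(toy$cancer_status, toy$stage, useNA = "ifany")
```

Putting "No Cancer" first makes it the reference level in models such as `glm()`, which is a common but optional choice.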

Examples

data(cancer_biomarker_data)
str(cancer_biomarker_data)
#> 'data.frame':	500 obs. of  11 variables:
#>  $ patient_id   : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ age          : int  63 34 30 54 65 57 67 60 60 75 ...
#>  $ age_group    : chr  "Older" "Young" "Young" "Middle" ...
#>  $ sex          : chr  "Female" "Female" "Male" "Male" ...
#>  $ ca125        : num  106.2 152 40 157.6 20.8 ...
#>  $ he4          : num  99.6 82.5 88.3 188.7 62.9 ...
#>  $ cea          : num  2.59 11.89 4.75 11.67 2.91 ...
#>  $ ca199        : num  24.7 21.1 32.4 86.4 54.8 65.4 30.3 44.5 21 32.9 ...
#>  $ roma_score   : num  60.6 79.2 40.6 91.2 28.7 25.5 93.5 17.9 26.1 91.8 ...
#>  $ cancer_status: chr  "Cancer" "Cancer" "No Cancer" "Cancer" ...
#>  $ stage        : chr  "Early" "Late" "N/A" "Late" ...
head(cancer_biomarker_data)
#>   patient_id age age_group    sex ca125   he4   cea ca199 roma_score
#> 1          1  63     Older Female 106.2  99.6  2.59  24.7       60.6
#> 2          2  34     Young Female 152.0  82.5 11.89  21.1       79.2
#> 3          3  30     Young   Male  40.0  88.3  4.75  32.4       40.6
#> 4          4  54    Middle   Male 157.6 188.7 11.67  86.4       91.2
#> 5          5  65     Older   Male  20.8  62.9  2.91  54.8       28.7
#> 6          6  57     Older Female  20.7  47.0  2.21  65.4       25.5
#>   cancer_status stage
#> 1        Cancer Early
#> 2        Cancer  Late
#> 3     No Cancer   N/A
#> 4        Cancer  Late
#> 5     No Cancer   N/A
#> 6     No Cancer   N/A
summary(cancer_biomarker_data$ca125)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
#>    0.00   13.80   27.70   58.71  109.12  248.40      10 
table(cancer_biomarker_data$cancer_status)
#> 
#>    Cancer No Cancer 
#>       148       352
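
As a rough illustration of the diagnostic-performance use mentioned in the description, the sketch below computes the area under the ROC curve for a few biomarkers with the rank-based (Mann-Whitney) formula in base R. It is an assumption-laden example rather than the package's own method: `auc_rank()` is a hypothetical helper, and `toy` holds only the six rows printed by `head()` above, so the resulting values are not meaningful estimates. Missing marker values, such as the NA's reported for `ca125`, are dropped per marker.

```r
# Hypothetical helper: rank-based AUC, i.e. the probability that a randomly
# chosen case has a higher marker value than a randomly chosen non-case
# (ties count one half).
auc_rank <- function(marker, is_case) {
  keep    <- !is.na(marker) & !is.na(is_case)   # drop incomplete pairs
  marker  <- marker[keep]
  is_case <- is_case[keep]
  n1 <- sum(is_case)                            # number of cases
  n0 <- sum(!is_case)                           # number of non-cases
  r  <- rank(marker)                            # average ranks for ties
  (sum(r[is_case]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Stand-in data copied from the first six rows of the head() output.
toy <- data.frame(
  ca125         = c(106.2, 152.0, 40.0, 157.6, 20.8, 20.7),
  he4           = c(99.6, 82.5, 88.3, 188.7, 62.9, 47.0),
  roma_score    = c(60.6, 79.2, 40.6, 91.2, 28.7, 25.5),
  cancer_status = c("Cancer", "Cancer", "No Cancer",
                    "Cancer", "No Cancer", "No Cancer"),
  stringsAsFactors = FALSE
)

is_case <- toy$cancer_status == "Cancer"
markers <- c("ca125", "he4", "roma_score")

# One AUC per marker, returned as a named numeric vector.
aucs <- sapply(toy[markers], auc_rank, is_case = is_case)
round(aucs, 3)
```

With the full dataset loaded, the same call would run on `cancer_biomarker_data[markers]`; a dedicated ROC package would additionally provide confidence intervals and curve plots.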