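Clinical measurements of breast tumor cell samples, originally from the UCI Machine Learning Repository (Wisconsin Breast Cancer Database). The dataset is commonly used as a benchmark for binary classification: predicting whether a tumor is benign or malignant from nine cytological features, each scored from 1 to 10.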
Usage
data(BreastCancer)
Format
A tibble (data frame) with 699 rows and 11 variables:
- Id
Numeric. Sample code number.
- Cl.thickness
Numeric. Clump thickness (1-10).
- Cell.size
Numeric. Uniformity of cell size (1-10).
- Cell.shape
Numeric. Uniformity of cell shape (1-10).
- Marg.adhesion
Numeric. Marginal adhesion (1-10).
- Epith.c.size
Numeric. Single epithelial cell size (1-10).
- Bare.nuclei
Numeric. Bare nuclei (1-10). The original UCI data contain missing values in this variable; they may have been handled during preprocessing here, so check with anyNA() before modeling.
- Bl.cromatin
Numeric. Bland chromatin (1-10).
- Normal.nucleoli
Numeric. Normal nucleoli (1-10).
- Mitoses
Numeric. Mitoses (1-10).
- Class
Character. Class of the tumor ("benign" or "malignant").
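Because Class is stored as character and Bare.nuclei may contain missing values, a short preparation step is usually needed before fitting a classifier. The following is a minimal sketch, not part of the package: it uses a small stand-in data frame `bc` (a hypothetical name) whose first four rows are copied from the Examples output, plus one entirely made-up fifth row with a missing Bare.nuclei value so that the NA handling has something to act on. With the real data, replace `bc` with `BreastCancer`.

```r
# Stand-in for BreastCancer. Rows 1-4 copy values shown in the Examples
# output; row 5 is hypothetical and exists only to illustrate NA handling.
bc <- data.frame(
  Cl.thickness = c(5, 5, 3, 6, 8),
  Bare.nuclei  = c(1, 10, 2, 4, NA),
  Class        = c("benign", "benign", "benign", "benign", "malignant"),
  stringsAsFactors = FALSE
)

# Count missing values in each column.
na_counts <- colSums(is.na(bc))

# Keep only complete rows, then convert the character label to a factor
# with "benign" as the reference level.
bc_clean <- bc[complete.cases(bc), ]
bc_clean$Class <- factor(bc_clean$Class, levels = c("benign", "malignant"))
```

Setting the factor levels explicitly keeps "benign" as the reference level even when a subset happens to contain only one class.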
Examples
data(BreastCancer)
str(BreastCancer)
#> spc_tbl_ [699 × 11] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
#> $ Id : num [1:699] 1000025 1002945 1015425 1016277 1017023 ...
#> $ Cl.thickness : num [1:699] 5 5 3 6 4 8 1 2 2 4 ...
#> $ Cell.size : num [1:699] 1 4 1 8 1 10 1 1 1 2 ...
#> $ Cell.shape : num [1:699] 1 4 1 8 1 10 1 2 1 1 ...
#> $ Marg.adhesion : num [1:699] 1 5 1 1 3 8 1 1 1 1 ...
#> $ Epith.c.size : num [1:699] 2 7 2 3 2 7 2 2 2 2 ...
#> $ Bare.nuclei : num [1:699] 1 10 2 4 1 10 10 1 1 1 ...
#> $ Bl.cromatin : num [1:699] 3 3 3 3 3 9 3 3 1 2 ...
#> $ Normal.nucleoli: num [1:699] 1 2 1 7 1 7 1 1 1 1 ...
#> $ Mitoses : num [1:699] 1 1 1 1 1 1 1 1 5 1 ...
#> $ Class : chr [1:699] "benign" "benign" "benign" "benign" ...
#> - attr(*, "spec")=List of 3
#> ..$ cols :List of 11
#> .. ..$ Id : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Cl.thickness : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Cell.size : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Cell.shape : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Marg.adhesion : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Epith.c.size : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Bare.nuclei : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Bl.cromatin : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Normal.nucleoli: list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Mitoses : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Class : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> ..$ default: list()
#> .. ..- attr(*, "class")= chr [1:2] "collector_guess" "collector"
#> ..$ skip : num 1
#> ..- attr(*, "class")= chr "col_spec"
head(BreastCancer)
#> # A tibble: 6 × 11
#> Id Cl.thickness Cell.size Cell.shape Marg.adhesion Epith.c.size
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1000025 5 1 1 1 2
#> 2 1002945 5 4 4 5 7
#> 3 1015425 3 1 1 1 2
#> 4 1016277 6 8 8 1 3
#> 5 1017023 4 1 1 3 2
#> 6 1017122 8 10 10 8 7
#> # ℹ 5 more variables: Bare.nuclei <dbl>, Bl.cromatin <dbl>,
#> # Normal.nucleoli <dbl>, Mitoses <dbl>, Class <chr>
summary(BreastCancer$Cl.thickness)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 1.000 2.000 4.000 4.418 6.000 10.000
table(BreastCancer$Class)
#>
#> benign malignant
#> 458 241
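The class counts above also give a quick reference point for classification work. As a rough sketch, the snippet below re-enters the two counts from the table by hand (so it runs without the package) and computes the class proportions and the accuracy of a trivial classifier that always predicts the majority class. The names `counts`, `props`, and `baseline_accuracy` are illustrative only; with the data loaded, `prop.table(table(BreastCancer$Class))` should give the same proportions.

```r
# Class counts copied from the table() output above.
counts <- c(benign = 458, malignant = 241)

# Proportion of samples in each class.
props <- counts / sum(counts)

# Accuracy of always predicting the most frequent class.
majority_class <- names(which.max(counts))
baseline_accuracy <- max(props)

round(props, 3)
```

Any model fitted to this data should be compared against that majority-class baseline rather than against 50%, since the classes are not balanced.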