Diagnostic Meta-Analysis Test Datasets
Source: R/data_diagnosticmeta_docs.R
diagnosticmeta_test_datasets.Rd
Comprehensive collection of test datasets for the diagnosticmeta function, covering various meta-analysis scenarios including bivariate analysis, HSROC, meta-regression, and publication bias assessment for diagnostic test accuracy studies.
Standard diagnostic test accuracy meta-analysis data with 20 studies, realistic sensitivity/specificity values, moderate heterogeneity, and continuous covariates for meta-regression.
Minimal diagnostic test accuracy meta-analysis data with only 5 studies, designed for testing edge cases with small sample sizes and convergence behavior.
Diagnostic test accuracy meta-analysis data with categorical covariate (imaging modality) for testing categorical meta-regression and subgroup analysis.
Diagnostic test accuracy meta-analysis data with intentional zero cells in several studies, designed for testing zero-cell correction methods (none, constant, treatment_arm, empirical).
Large diagnostic test accuracy meta-analysis data with 50 studies, designed for testing computational efficiency, scalability, and performance with large-scale meta-analyses.
Usage
diagnosticmeta_test
diagnosticmeta_test_small
diagnosticmeta_test_categorical
diagnosticmeta_test_zeros
diagnosticmeta_test_large
Format
Various data frames with 2x2 contingency table data optimized for diagnostic meta-analysis
diagnosticmeta_test: a data frame with 20 observations and 7 variables:
- study
Character. Unique study identifier (Study_1 to Study_20)
- true_positives
Numeric. Number of true positive results (1-100)
- false_positives
Numeric. Number of false positive results (1-120)
- false_negatives
Numeric. Number of false negative results (1-40)
- true_negatives
Numeric. Number of true negative results (1-140)
- year
Numeric. Publication year (2015-2024)
- quality_score
Numeric. Study quality score (1-10)
diagnosticmeta_test_small: a data frame with 5 observations and 7 variables:
- study
Character. Unique study identifier (Study_1 to Study_5)
- true_positives
Numeric. Number of true positive results
- false_positives
Numeric. Number of false positive results
- false_negatives
Numeric. Number of false negative results
- true_negatives
Numeric. Number of true negative results
- year
Numeric. Publication year (2015-2024)
- quality_score
Numeric. Study quality score (1-10)
diagnosticmeta_test_categorical: a data frame with 20 observations and 8 variables:
- study
Character. Unique study identifier (Study_1 to Study_20)
- true_positives
Numeric. Number of true positive results
- false_positives
Numeric. Number of false positive results
- false_negatives
Numeric. Number of false negative results
- true_negatives
Numeric. Number of true negative results
- year
Numeric. Publication year (2015-2024)
- quality_score
Numeric. Study quality score (1-10)
- imaging_modality
Character. Imaging type: "MRI", "CT", "Ultrasound"
diagnosticmeta_test_zeros: a data frame with 20 observations and 7 variables:
- study
Character. Unique study identifier (Study_1 to Study_20)
- true_positives
Numeric. Number of true positive results
- false_positives
Numeric. Number of false positive results (zero in studies 1, 5, 10)
- false_negatives
Numeric. Number of false negative results (zero in studies 2, 7)
- true_negatives
Numeric. Number of true negative results
- year
Numeric. Publication year (2015-2024)
- quality_score
Numeric. Study quality score (1-10)
diagnosticmeta_test_large: a data frame with 50 observations and 7 variables:
- study
Character. Unique study identifier (Study_1 to Study_50)
- true_positives
Numeric. Number of true positive results
- false_positives
Numeric. Number of false positive results
- false_negatives
Numeric. Number of false negative results
- true_negatives
Numeric. Number of true negative results
- year
Numeric. Publication year (2010-2024, extended range)
- quality_score
Numeric. Study quality score (1-10)
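Because all five data frames share the same core columns, a quick structural check before analysis can catch renamed or malformed inputs. The sketch below is only an illustration: `check_dta_data` is a hypothetical helper, not a function from the package, and the toy rows reuse the first three studies shown in the `str()` output of the Examples.

```r
# Hypothetical helper (not part of the package): verify that a data frame
# has the documented columns and that the 2x2 counts are usable.
check_dta_data <- function(df) {
  required <- c("study", "true_positives", "false_positives",
                "false_negatives", "true_negatives")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # Drop the identifier and inspect the four count columns together
  counts <- as.matrix(df[, required[-1]])
  if (!is.numeric(counts) || anyNA(counts) || any(counts < 0)) {
    stop("2x2 counts must be non-missing, non-negative numbers")
  }
  if (anyDuplicated(df$study) > 0) {
    stop("Study identifiers must be unique")
  }
  invisible(TRUE)
}

# Toy data in the documented layout (first three studies of diagnosticmeta_test)
toy <- data.frame(
  study           = c("Study_1", "Study_2", "Study_3"),
  true_positives  = c(81, 56, 84),
  false_positives = c(10, 6, 4),
  false_negatives = c(30, 6, 11),
  true_negatives  = c(153, 120, 100),
  stringsAsFactors = FALSE
)

check_dta_data(toy)
```

The same call should work on any of the shipped datasets once loaded with `data()`, since the check relies only on the column names documented above.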
Details
This collection includes five specialized datasets that together test the following aspects of the diagnosticmeta function:
Standard bivariate meta-analysis with 20 studies
Small sample size testing (5 studies)
Categorical meta-regression (imaging modality)
Zero-cell correction methods
Large-scale meta-analysis (50 studies)
Continuous covariates (year, quality score)
All datasets represent realistic diagnostic accuracy data from meta-analyses of AI algorithms, biomarkers, or IHC markers in pathology.
diagnosticmeta_test simulates a meta-analysis of diagnostic test accuracy studies with:
Realistic sensitivity: 60-99% (mean ~85%)
Realistic specificity: 70-99% (mean ~90%)
Moderate between-study heterogeneity
Realistic sample size variation (50-240 per study)
Correlation: Higher quality studies have larger sample sizes
Minimal zero cells (2 studies) for realistic modeling
Typical use: AI algorithm validation, biomarker diagnostic accuracy, IHC marker validation in pathology.
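The sensitivity and specificity ranges quoted above are derived from the four count columns. As a minimal sketch in base R, per-study values can be computed as follows; the rows are the first three studies shown in the `str()` output of the Examples, and the logit columns are added only because bivariate models are usually described on that scale, not because the package is known to expose them.

```r
# First three studies of diagnosticmeta_test, copied from the str() output
toy <- data.frame(
  study           = c("Study_1", "Study_2", "Study_3"),
  true_positives  = c(81, 56, 84),
  false_positives = c(10, 6, 4),
  false_negatives = c(30, 6, 11),
  true_negatives  = c(153, 120, 100)
)

# Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)
toy$sensitivity <- with(toy, true_positives / (true_positives + false_negatives))
toy$specificity <- with(toy, true_negatives / (true_negatives + false_positives))

# Logit scale (finite here because no study has a zero cell)
toy$logit_sens <- qlogis(toy$sensitivity)
toy$logit_spec <- qlogis(toy$specificity)

print(round(toy[, c("sensitivity", "specificity")], 3))
```

These are raw per-study proportions; pooled estimates from a bivariate model will generally differ because they account for between-study variation.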
diagnosticmeta_test_small tests:
Minimum study requirements for meta-analysis
Convergence with minimal data
Warning message generation for small samples
Robustness of estimation procedures
diagnosticmeta_test_categorical includes an additional categorical covariate for:
Categorical meta-regression testing
Subgroup analysis by imaging modality
Exploration of heterogeneity sources
Testing interaction between categorical and continuous covariates
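Before fitting a categorical meta-regression, it can help to look at crude accuracy by subgroup. The sketch below uses base R `aggregate()` on a small made-up data frame (the counts are hypothetical, not taken from diagnosticmeta_test_categorical) and simply sums the 2x2 cells within each imaging modality. This naive pooling ignores between-study heterogeneity and the correlation between sensitivity and specificity, so it is a descriptive check only.

```r
# Hypothetical toy data in the documented column layout
toy <- data.frame(
  study            = paste0("Study_", 1:6),
  true_positives   = c(40, 35, 50, 28, 45, 30),
  false_negatives  = c(10, 5, 10, 12, 5, 10),
  true_negatives   = c(80, 90, 70, 60, 95, 75),
  false_positives  = c(20, 10, 10, 20, 5, 25),
  imaging_modality = c("MRI", "MRI", "CT", "CT", "Ultrasound", "Ultrasound")
)

# Sum the four cells within each modality
pooled <- aggregate(
  cbind(true_positives, false_negatives, true_negatives, false_positives) ~
    imaging_modality,
  data = toy,
  FUN = sum
)

# Crude subgroup sensitivity and specificity from the summed cells
pooled$sensitivity <- with(pooled, true_positives / (true_positives + false_negatives))
pooled$specificity <- with(pooled, true_negatives / (true_negatives + false_positives))

print(pooled)
```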
diagnosticmeta_test_zeros includes intentional zero cells to test:
Model-based zero-cell handling (bivariate model, recommended)
Constant correction (+0.5 to all cells)
Treatment-arm correction (add only to zero cells)
Empirical correction (study-specific)
Zero cells are common in diagnostic meta-analysis when:
Studies have perfect sensitivity or specificity
Sample sizes are small
Test threshold is extreme
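To see why zero cells need handling at all, note that a study with no false positives has a specificity of exactly 1, whose logit is infinite. The constant correction listed above can be sketched in base R as follows. `add_constant_correction` is a hypothetical helper written for illustration, not the package's implementation, and whether the constant is applied to every study or only to studies containing a zero is an assumption exposed here as the `only_zero_studies` argument.

```r
# Hypothetical helper: add a constant k to all four cells of the selected studies
add_constant_correction <- function(df, k = 0.5, only_zero_studies = TRUE) {
  cells <- c("true_positives", "false_positives",
             "false_negatives", "true_negatives")
  # Flag studies that contain at least one zero cell
  has_zero <- rowSums(df[, cells] == 0) > 0
  rows <- if (only_zero_studies) has_zero else rep(TRUE, nrow(df))
  df[rows, cells] <- df[rows, cells] + k
  df
}

# Made-up counts: Study_1 has zero false positives, Study_2 zero false negatives
toy <- data.frame(
  study           = c("Study_1", "Study_2", "Study_3"),
  true_positives  = c(20, 15, 30),
  false_positives = c(0, 4, 6),
  false_negatives = c(5, 0, 10),
  true_negatives  = c(40, 35, 50)
)

# Without correction the logit specificity of Study_1 is infinite
spec_raw <- with(toy, true_negatives / (true_negatives + false_positives))
print(qlogis(spec_raw))

# After the correction every logit is finite
corrected <- add_constant_correction(toy)
spec_corrected <- with(corrected, true_negatives / (true_negatives + false_positives))
print(qlogis(spec_corrected))
```

With `zero_cell_correction = "none"`, the model-based approach described above avoids this step by working with the counts directly rather than with per-study logits.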
diagnosticmeta_test_large tests:
Computational efficiency with many studies
Scalability of bivariate models
Performance of meta-regression algorithms
Publication bias assessment with adequate power
Memory usage and optimization
Realistic scenario: Comprehensive meta-analysis of well-studied diagnostic tests (e.g., CA-125 for ovarian cancer, PSA for prostate cancer).
Examples
# Load the data
data(diagnosticmeta_test)
# View structure
str(diagnosticmeta_test)
#> tibble [20 × 7] (S3: tbl_df/tbl/data.frame)
#> $ study : chr [1:20] "Study_1" "Study_2" "Study_3" "Study_4" ...
#> $ true_positives : num [1:20] 81 56 84 23 36 44 19 75 65 58 ...
#> $ false_positives: num [1:20] 10 6 4 13 32 28 3 6 18 14 ...
#> $ false_negatives: num [1:20] 30 6 11 5 20 10 1 6 21 11 ...
#> $ true_negatives : num [1:20] 153 120 100 98 95 76 27 105 67 93 ...
#> $ year : int [1:20] 2015 2019 2015 2023 2024 2018 2016 2024 2015 2022 ...
#> $ quality_score : num [1:20] 5 7 7 8 7 3 3 9 7 4 ...
# Basic meta-analysis
result <- diagnosticmeta(
  data = diagnosticmeta_test,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives"
)
#> Error in diagnosticmeta(data = diagnosticmeta_test, study = "study", true_positives = "true_positives", false_positives = "false_positives", false_negatives = "false_negatives", true_negatives = "true_negatives"): argument "covariate" is missing, with no default
# Meta-regression with year as covariate
result_year <- diagnosticmeta(
  data = diagnosticmeta_test,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  covariate = "year",
  meta_regression = TRUE
)
# Load the data
data(diagnosticmeta_test_small)
# Test with minimal studies
result_small <- diagnosticmeta(
  data = diagnosticmeta_test_small,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives"
)
#> Error in diagnosticmeta(data = diagnosticmeta_test_small, study = "study", true_positives = "true_positives", false_positives = "false_positives", false_negatives = "false_negatives", true_negatives = "true_negatives"): argument "covariate" is missing, with no default
# Load the data
data(diagnosticmeta_test_categorical)
# Meta-regression with categorical covariate
result_categorical <- diagnosticmeta(
  data = diagnosticmeta_test_categorical,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  covariate = "imaging_modality",
  meta_regression = TRUE
)
# Load the data
data(diagnosticmeta_test_zeros)
# Test model-based approach (recommended)
result_none <- diagnosticmeta(
  data = diagnosticmeta_test_zeros,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  zero_cell_correction = "none"
)
#> Error in diagnosticmeta(data = diagnosticmeta_test_zeros, study = "study", true_positives = "true_positives", false_positives = "false_positives", false_negatives = "false_negatives", true_negatives = "true_negatives", zero_cell_correction = "none"): argument "covariate" is missing, with no default
# Test constant correction
result_constant <- diagnosticmeta(
  data = diagnosticmeta_test_zeros,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  zero_cell_correction = "constant"
)
#> Error in diagnosticmeta(data = diagnosticmeta_test_zeros, study = "study", true_positives = "true_positives", false_positives = "false_positives", false_negatives = "false_negatives", true_negatives = "true_negatives", zero_cell_correction = "constant"): argument "covariate" is missing, with no default
# Load the data
data(diagnosticmeta_test_large)
# Test performance with large dataset
system.time({
  result_large <- diagnosticmeta(
    data = diagnosticmeta_test_large,
    study = "study",
    true_positives = "true_positives",
    false_positives = "false_positives",
    false_negatives = "false_negatives",
    true_negatives = "true_negatives",
    bivariate_analysis = TRUE,
    heterogeneity_analysis = TRUE
  )
})
#> Error in diagnosticmeta(data = diagnosticmeta_test_large, study = "study", true_positives = "true_positives", false_positives = "false_positives", false_negatives = "false_negatives", true_negatives = "true_negatives", bivariate_analysis = TRUE, heterogeneity_analysis = TRUE): argument "covariate" is missing, with no default
#> Timing stopped at: 0.004 0.001 0.006