Diagnostic Meta-Analysis Test Datasets — diagnosticmeta_test

Comprehensive collection of test datasets for the diagnosticmeta function, covering various meta-analysis scenarios including bivariate analysis, HSROC, meta-regression, and publication bias assessment for diagnostic test accuracy studies.

Standard diagnostic test accuracy meta-analysis data with 20 studies, realistic sensitivity/specificity values, moderate heterogeneity, and continuous covariates for meta-regression.

Minimal diagnostic test accuracy meta-analysis data with only 5 studies, designed for testing edge cases with small sample sizes and convergence behavior.

Diagnostic test accuracy meta-analysis data with categorical covariate (imaging modality) for testing categorical meta-regression and subgroup analysis.

Diagnostic test accuracy meta-analysis data with intentional zero cells in several studies, designed for testing zero-cell correction methods (none, constant, treatment_arm, empirical).

Large diagnostic test accuracy meta-analysis data with 50 studies, designed for testing computational efficiency, scalability, and performance with large-scale meta-analyses.

Usage

diagnosticmeta_test

diagnosticmeta_test_small

diagnosticmeta_test_categorical

diagnosticmeta_test_zeros

diagnosticmeta_test_large

Format

Various data frames with 2x2 contingency table data optimized for diagnostic meta-analysis

diagnosticmeta_test: a data frame with 20 observations and 7 variables:

study: Character. Unique study identifier (Study_1 to Study_20)
true_positives: Numeric. Number of true positive results (1-100)
false_positives: Numeric. Number of false positive results (1-120)
false_negatives: Numeric. Number of false negative results (1-40)
true_negatives: Numeric. Number of true negative results
year: Integer. Publication year (2015-2024)
quality_score: Numeric. Study quality score (1-10)

diagnosticmeta_test_small: a data frame with 5 observations and 7 variables:

study: Character. Unique study identifier (Study_1 to Study_5)
true_positives: Numeric. Number of true positive results
false_positives: Numeric. Number of false positive results
false_negatives: Numeric. Number of false negative results
true_negatives: Numeric. Number of true negative results
year: Numeric. Publication year (2015-2024)
quality_score: Numeric. Study quality score (1-10)

diagnosticmeta_test_categorical: a data frame with 20 observations and 8 variables:

study: Character. Unique study identifier (Study_1 to Study_20)
true_positives: Numeric. Number of true positive results
false_positives: Numeric. Number of false positive results
false_negatives: Numeric. Number of false negative results
true_negatives: Numeric. Number of true negative results
year: Numeric. Publication year (2015-2024)
quality_score: Numeric. Study quality score (1-10)
imaging_modality: Character. Imaging type: "MRI", "CT", "Ultrasound"

diagnosticmeta_test_zeros: a data frame with 20 observations and 7 variables:

study: Character. Unique study identifier (Study_1 to Study_20)
true_positives: Numeric. Number of true positive results
false_positives: Numeric. Number of false positive results (zero in studies 1, 5, 10)
false_negatives: Numeric. Number of false negative results (zero in studies 2, 7)
true_negatives: Numeric. Number of true negative results
year: Numeric. Publication year (2015-2024)
quality_score: Numeric. Study quality score (1-10)

diagnosticmeta_test_large: a data frame with 50 observations and 7 variables:

study: Character. Unique study identifier (Study_1 to Study_50)
true_positives: Numeric. Number of true positive results
false_positives: Numeric. Number of false positive results
false_negatives: Numeric. Number of false negative results
true_negatives: Numeric. Number of true negative results
year: Integer. Publication year (2010-2024, a wider range than in the other datasets)
quality_score: Numeric. Study quality score (1-10)

Source

Generated by ClinicoPath development team for comprehensive diagnostic meta-analysis testing

Details

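This collection includes five specialized datasets, each designed to test a different aspect of the diagnosticmeta function:

diagnosticmeta_test: standard bivariate meta-analysis with 20 studies
diagnosticmeta_test_small: small sample size testing (5 studies)
diagnosticmeta_test_categorical: categorical meta-regression (imaging modality)
diagnosticmeta_test_zeros: zero-cell correction methods
diagnosticmeta_test_large: large-scale meta-analysis (50 studies)

Every dataset also carries two continuous covariates (year, quality_score) for meta-regression.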

All datasets are simulated to resemble diagnostic accuracy data from meta-analyses of AI algorithms, biomarkers, or IHC markers in pathology.

diagnosticmeta_test simulates a meta-analysis of diagnostic test accuracy studies with:

Realistic sensitivity: 60-99% (mean ~85%)
Realistic specificity: 70-99% (mean ~90%)
Moderate between-study heterogeneity
Realistic sample size variation (from about 50 to over 250 per study; Study_1 in the Examples output totals 274)
Correlation: Higher quality studies have larger sample sizes
Minimal zero cells (2 studies) for realistic modeling

Typical use: AI algorithm validation, biomarker diagnostic accuracy, IHC marker validation in pathology.
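
As a rough illustration of the ranges listed above, per-study sensitivity and specificity follow directly from the four count columns. The sketch below rebuilds the first five rows shown in the Examples output by hand, so it runs without the package installed. The helper name accuracy_by_study is hypothetical and is not part of ClinicoPath.

```r
# First five studies, copied from the str() output in the Examples section
dta <- data.frame(
  study           = paste0("Study_", 1:5),
  true_positives  = c(81, 56, 84, 23, 36),
  false_positives = c(10, 6, 4, 13, 32),
  false_negatives = c(30, 6, 11, 5, 20),
  true_negatives  = c(153, 120, 100, 98, 95)
)

# Hypothetical helper (not part of ClinicoPath): per-study accuracy measures
accuracy_by_study <- function(d) {
  data.frame(
    study       = d$study,
    # Total sample size is the sum of the four 2x2 cells
    n           = d$true_positives + d$false_positives +
                  d$false_negatives + d$true_negatives,
    # Sensitivity = TP / (TP + FN)
    sensitivity = d$true_positives / (d$true_positives + d$false_negatives),
    # Specificity = TN / (TN + FP)
    specificity = d$true_negatives / (d$true_negatives + d$false_positives)
  )
}

acc <- accuracy_by_study(dta)
print(acc, digits = 3)
```

These are unpooled, study-level proportions. They are useful for sanity-checking the data before fitting a model, but they are not a substitute for the pooled estimates a bivariate model produces.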

diagnosticmeta_test_small tests:

Minimum study requirements for meta-analysis
Convergence with minimal data
Warning message generation for small samples
Robustness of estimation procedures

diagnosticmeta_test_categorical includes an additional categorical covariate for:

Categorical meta-regression testing
Subgroup analysis by imaging modality
Exploration of heterogeneity sources
Testing interaction between categorical and continuous covariates
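
To make the subgroup idea concrete, the sketch below sums the 2x2 cells within each imaging modality and computes a crude sensitivity and specificity per subgroup. The counts are invented for illustration and are not the packaged data; only the column names follow the Format section.

```r
# Hypothetical counts for illustration only (not the packaged data)
dta <- data.frame(
  study            = paste0("Study_", 1:6),
  true_positives   = c(40, 55, 30, 62, 25, 48),
  false_positives  = c(5, 8, 10, 6, 12, 9),
  false_negatives  = c(10, 5, 15, 8, 20, 12),
  true_negatives   = c(95, 82, 70, 104, 63, 91),
  imaging_modality = c("MRI", "MRI", "CT", "CT", "Ultrasound", "Ultrasound")
)

# Sum the four 2x2 cells within each modality
cells <- c("true_positives", "false_positives",
           "false_negatives", "true_negatives")
by_modality <- aggregate(
  dta[cells],
  by  = list(imaging_modality = dta$imaging_modality),
  FUN = sum
)

# Crude (naively pooled) accuracy per subgroup
by_modality$sensitivity <- with(by_modality,
  true_positives / (true_positives + false_negatives))
by_modality$specificity <- with(by_modality,
  true_negatives / (true_negatives + false_positives))

print(by_modality, digits = 3)
```

Naive pooling of this kind ignores between-study heterogeneity and the correlation between sensitivity and specificity, so it is only a descriptive check. A meta-regression with the modality as covariate is the appropriate way to compare subgroups formally.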

diagnosticmeta_test_zeros includes intentional zero cells to test:

Model-based zero-cell handling (bivariate model, recommended)
Constant correction (+0.5 to all cells)
Treatment-arm correction (add only to zero cells)
Empirical correction (study-specific)

Zero cells are common in diagnostic meta-analysis when:

Studies have perfect sensitivity or specificity
Sample sizes are small
Test threshold is extreme
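
The following sketch shows how zero cells can be flagged and what a generic constant correction does to the counts. It uses invented data and assumes the correction adds 0.5 to every cell of every study; how diagnosticmeta applies its "constant" option internally is not confirmed here.

```r
# Hypothetical counts for illustration only (not the packaged data)
dta <- data.frame(
  study           = paste0("Study_", 1:4),
  true_positives  = c(45, 30, 52, 18),
  false_positives = c(0, 7, 4, 0),
  false_negatives = c(6, 0, 9, 3),
  true_negatives  = c(80, 64, 71, 40)
)
cells <- c("true_positives", "false_positives",
           "false_negatives", "true_negatives")

# Flag studies with at least one zero cell
has_zero <- rowSums(dta[cells] == 0) > 0
dta$study[has_zero]

# Generic constant correction (assumption: applied to all cells of all studies)
corrected <- dta
corrected[cells] <- dta[cells] + 0.5

# Specificity of Study_1 before and after the correction
spec_raw  <- dta$true_negatives[1] /
  (dta$true_negatives[1] + dta$false_positives[1])
spec_corr <- corrected$true_negatives[1] /
  (corrected$true_negatives[1] + corrected$false_positives[1])
c(raw = spec_raw, corrected = spec_corr)
```

Without correction, Study_1 has an observed specificity of exactly 1, whose logit is undefined; the correction pulls it slightly below 1. The corrected counts are no longer integers, which is consistent with the rounding warning shown in the Examples.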

diagnosticmeta_test_large tests:

Computational efficiency with many studies
Scalability of bivariate models
Performance of meta-regression algorithms
Publication bias assessment with adequate power
Memory usage and optimization

Realistic scenario: Comprehensive meta-analysis of well-studied diagnostic tests (e.g., CA-125 for ovarian cancer, PSA for prostate cancer).

Examples

# Load the data
data(diagnosticmeta_test)

# View structure
str(diagnosticmeta_test)
#> tibble [20 × 7] (S3: tbl_df/tbl/data.frame)
#>  $ study          : chr [1:20] "Study_1" "Study_2" "Study_3" "Study_4" ...
#>  $ true_positives : num [1:20] 81 56 84 23 36 44 19 75 65 58 ...
#>  $ false_positives: num [1:20] 10 6 4 13 32 28 3 6 18 14 ...
#>  $ false_negatives: num [1:20] 30 6 11 5 20 10 1 6 21 11 ...
#>  $ true_negatives : num [1:20] 153 120 100 98 95 76 27 105 67 93 ...
#>  $ year           : int [1:20] 2015 2019 2015 2023 2024 2018 2016 2024 2015 2022 ...
#>  $ quality_score  : num [1:20] 5 7 7 8 7 3 3 9 7 4 ...

# Basic meta-analysis
result <- diagnosticmeta(
  data = diagnosticmeta_test,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  covariate = NULL
)

# Meta-regression with year as covariate
result_year <- diagnosticmeta(
  data = diagnosticmeta_test,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  covariate = "year",
  meta_regression = TRUE
)

# Load the small-sample dataset
data(diagnosticmeta_test_small)

# Test with minimal studies
result_small <- diagnosticmeta(
  data = diagnosticmeta_test_small,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  covariate = NULL
)

# Load the categorical-covariate dataset
data(diagnosticmeta_test_categorical)

# Meta-regression with categorical covariate
result_categorical <- diagnosticmeta(
  data = diagnosticmeta_test_categorical,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  covariate = "imaging_modality",
  meta_regression = TRUE
)

# Load the zero-cell dataset
data(diagnosticmeta_test_zeros)

# Test model-based approach (recommended)
result_none <- diagnosticmeta(
  data = diagnosticmeta_test_zeros,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  covariate = NULL,
  zero_cell_correction = "none"
)

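# Test constant correction (adding 0.5 makes the counts non-integer,
# which is the likely cause of the rounding warning printed below)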
result_constant <- diagnosticmeta(
  data = diagnosticmeta_test_zeros,
  study = "study",
  true_positives = "true_positives",
  false_positives = "false_positives",
  false_negatives = "false_negatives",
  true_negatives = "true_negatives",
  covariate = NULL,
  zero_cell_correction = "constant"
)
#> Warning: Some of the values of TP,FN,FP or TN do have non zero decimal places. Did you forget to round?

# Load the large dataset
data(diagnosticmeta_test_large)

# View structure
str(diagnosticmeta_test_large)
#> tibble [50 × 7] (S3: tbl_df/tbl/data.frame)
#>  $ study          : chr [1:50] "Study_1" "Study_2" "Study_3" "Study_4" ...
#>  $ true_positives : num [1:50] 114 92 117 57 39 72 18 102 99 53 ...
#>  $ false_positives: num [1:50] 4 6 14 16 3 5 18 5 10 6 ...
#>  $ false_negatives: num [1:50] 46 25 18 17 22 31 2 28 9 20 ...
#>  $ true_negatives : num [1:50] 65 24 109 192 143 46 111 122 154 80 ...
#>  $ year           : int [1:50] 2021 2017 2013 2014 2011 2014 2019 2017 2017 2022 ...
#>  $ quality_score  : num [1:50] 8 9 7 7 8 7 7 7 8 7 ...

if (FALSE) { # \dontrun{
# Test performance with large dataset
system.time({
  result_large <- diagnosticmeta(
    data = diagnosticmeta_test_large,
    study = "study",
    true_positives = "true_positives",
    false_positives = "false_positives",
    false_negatives = "false_negatives",
    true_negatives = "true_negatives",
    covariate = NULL,
    bivariate_analysis = TRUE,
    heterogeneity_analysis = TRUE
  )
})
} # }