Missing Data Analysis and Multiple Imputation
Source: R/missingdata.b.R, R/modelbuilder.b.R
grapes-or-or-grapes.Rd
Missing data analysis and multiple imputation using the mice and ggmice packages. This function provides a complete workflow for analyzing missing data patterns, performing multiple imputation by chained equations (MICE), and evaluating imputation quality. It is designed for clinical research applications, where missing data is common and proper handling is critical for valid statistical inference.
Clinical prediction model builder with validation and performance assessment. It creates multiple logistic regression models intended for use with Decision Curve Analysis, and provides error handling, input validation, and clinical interpretation guidance.
Details
The missing data analysis function provides three main analysis types:
Pattern Analysis: Explores missing data structure and patterns
Multiple Imputation: Performs MICE imputation with convergence diagnostics
Complete Analysis: Combines pattern analysis and imputation
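The three analysis types can be approximated by calling mice directly. The following is a minimal sketch, not the function's internal implementation. It assumes the mice package is installed and uses its bundled nhanes data set as a stand-in for clinical data.

```r
library(mice)

# nhanes ships with mice: 25 rows with missing values in bmi, hyp, and chl
data(nhanes, package = "mice")

# Pattern analysis: count missing values per variable and tabulate patterns
missing_counts <- colSums(is.na(nhanes))
pattern <- md.pattern(nhanes, plot = FALSE)

# Multiple imputation: 5 imputed data sets via predictive mean matching
imp <- mice(nhanes, m = 5, method = "pmm", maxit = 10,
            seed = 123, printFlag = FALSE)

# "Complete" analysis here simply means running both steps and then
# extracting one filled-in data set for inspection
completed <- complete(imp, 1)
```

Each row of `pattern` is one distinct missingness pattern (1 = observed, 0 = missing), and `completed` is the first of the five imputed data sets.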
Key features of the missing data analysis include:
Visual and tabular missing data pattern analysis
Multiple imputation methods (PMM, Bayesian regression, logistic regression)
Convergence diagnostics with trace plots
Quality evaluation comparing observed vs imputed data
Flexible parameter customization
Interpretations focused on clinical research
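Convergence diagnostics and quality evaluation can be illustrated with objects that mice returns. This is a hedged sketch under the same assumptions as before (mice installed, nhanes as example data). How the function itself computes or displays these diagnostics is not specified here.

```r
library(mice)

data(nhanes, package = "mice")
imp <- mice(nhanes, m = 5, method = "pmm", maxit = 10,
            seed = 123, printFlag = FALSE)

# Convergence: chainMean holds the mean of the imputed values per variable,
# iteration, and chain; well-mixed chains show no trend across iterations
bmi_chain_means <- imp$chainMean["bmi", , ]

# Trace plots of chain means and standard deviations (print to display)
trace_plot <- plot(imp)

# Quality: compare observed values with imputed values for bmi
observed_bmi <- nhanes$bmi[!is.na(nhanes$bmi)]
imputed_bmi <- unlist(imp$imp$bmi)
quality <- c(observed_mean = mean(observed_bmi),
             imputed_mean = mean(imputed_bmi))
```

Because predictive mean matching borrows donor values from observed cases, every imputed bmi value also occurs among the observed values. Large gaps between the two means would be a reason to inspect the imputation model more closely.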
Common clinical applications:
Data quality assessment for clinical trials
Missing data handling in observational studies
Regulatory compliance for pharmaceutical research
Sensitivity analysis for missing data assumptions
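In each of these applications, valid inference after imputation usually means fitting the analysis model in every imputed data set and pooling the results. The sketch below shows that pattern with mice's own `with()` and `pool()` functions. The linear model and the nhanes variables are illustrative assumptions, not part of this function's interface.

```r
library(mice)

data(nhanes, package = "mice")
imp <- mice(nhanes, m = 5, method = "pmm", seed = 123, printFlag = FALSE)

# Fit the same linear model in each imputed data set
fits <- with(imp, lm(chl ~ age + bmi))

# Combine estimates and standard errors across imputations (Rubin's rules)
pooled <- pool(fits)
pooled_summary <- summary(pooled)
```

`pooled_summary` has one row per model term, with pooled estimates and standard errors that reflect both within-imputation and between-imputation variability.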
The Prediction Model Builder supports multiple modeling approaches:
Basic Clinical Models: Core demographic and primary risk factors
Enhanced Clinical Models: Extended clinical variables and interactions
Biomarker Models: Integration of laboratory values and advanced diagnostics
Custom Models: User-defined variable combinations
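The relationship between a basic model and a biomarker model can be sketched as a pair of nested logistic regressions in base R. This is an illustration only: the data are simulated, the variable names and coefficients are hypothetical, and it does not show how the model builder fits its models internally.

```r
set.seed(42)

# Simulated cohort; all variable names and effect sizes are hypothetical
n <- 400
cohort <- data.frame(
  age = rnorm(n, mean = 60, sd = 10),
  diabetes = rbinom(n, 1, 0.3),
  troponin = rlnorm(n, meanlog = 0, sdlog = 0.5)
)
linear_predictor <- -6 + 0.07 * cohort$age + 0.8 * cohort$diabetes +
  0.6 * cohort$troponin
cohort$event <- rbinom(n, 1, plogis(linear_predictor))

# Basic model: demographics and primary risk factors
basic_model <- glm(event ~ age + diabetes, data = cohort, family = binomial)

# Biomarker model: basic predictors plus a laboratory value
biomarker_model <- update(basic_model, . ~ . + troponin)

# Likelihood ratio test and AIC for the nested comparison
lrt <- anova(basic_model, biomarker_model, test = "Chisq")
aic_values <- AIC(basic_model, biomarker_model)
```

Comparing nested models this way shows whether the added biomarker improves fit beyond the basic predictors, which is the question the different model tiers are meant to answer.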
Key features of the model builder include:
Automatic data splitting for unbiased validation
Advanced missing data handling with multiple imputation
Comprehensive performance metrics (AUC, calibration, NRI, IDI)
Cross-validation and bootstrap validation
Stepwise selection and penalized regression
Seamless integration with Decision Curve Analysis
Clinical risk score generation
Robust error handling and validation
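Two of these features, data splitting and AUC, can be illustrated in a few lines of base R. The sketch below is an assumption-laden example on simulated data with hypothetical variable names. It does not cover calibration, NRI, IDI, cross-validation, or penalized regression, and it is not the model builder's own code.

```r
set.seed(2024)

# Simulated data; variable names and coefficients are hypothetical
n <- 500
cohort <- data.frame(
  age = rnorm(n, mean = 60, sd = 10),
  diabetes = rbinom(n, 1, 0.3)
)
cohort$event <- rbinom(
  n, 1, plogis(-6 + 0.08 * cohort$age + 0.9 * cohort$diabetes)
)

# Hold out 30% of rows so performance is measured on unseen data
test_rows <- sample(n, size = round(0.3 * n))
train <- cohort[-test_rows, ]
test <- cohort[test_rows, ]

model <- glm(event ~ age + diabetes, data = train, family = binomial)
predicted_risk <- predict(model, newdata = test, type = "response")

# AUC via the rank-sum (Mann-Whitney) formulation
auc <- function(risk, outcome) {
  ranks <- rank(risk)
  n_pos <- sum(outcome == 1)
  n_neg <- sum(outcome == 0)
  (sum(ranks[outcome == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}
test_auc <- auc(predicted_risk, test$event)
```

Evaluating on the held-out rows rather than the training rows avoids the optimism that comes from scoring a model on the data used to fit it.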
Examples
if (FALSE) { # \dontrun{
  # Basic pattern analysis
  result <- missingdata(
    data = clinical_data,
    analysis_vars = c("age", "bmi", "biomarker"),
    analysis_type = "pattern"
  )

  # Multiple imputation
  result <- missingdata(
    data = clinical_data,
    analysis_vars = c("age", "bmi", "biomarker"),
    analysis_type = "imputation",
    n_imputations = 10,
    imputation_method = "pmm"
  )

  # Complete analysis
  result <- missingdata(
    data = clinical_data,
    analysis_vars = c("age", "bmi", "biomarker"),
    analysis_type = "complete",
    n_imputations = 5,
    max_iterations = 10
  )
} # }
if (FALSE) { # \dontrun{
  # Basic clinical model
  result <- modelbuilder(
    data = clinical_data,
    outcome = "cardiovascular_event",
    outcomePositive = "Yes",
    basicPredictors = c("age", "sex", "diabetes"),
    buildBasicModel = TRUE
  )

  # Biomarker model with cross-validation and data splitting
  result <- modelbuilder(
    data = clinical_data,
    outcome = "cardiovascular_event",
    outcomePositive = "Yes",
    biomarkerPredictors = c("age", "sex", "diabetes", "troponin"),
    buildBiomarkerModel = TRUE,
    crossValidation = TRUE,
    splitData = TRUE
  )
} # }