Usage

geemodel(
  data,
  outcome,
  predictors,
  cluster_id,
  time_var,
  family = "gaussian",
  corstr = "exchangeable",
  robust_se = TRUE,
  conf_level = 0.95,
  qic = TRUE,
  posthoc = FALSE,
  posthoc_adjust = "holm",
  diagnostics = TRUE,
  correlation_plot = FALSE
)

Arguments

data

the data as a data frame

outcome

outcome variable (dependent variable); can be continuous, binary, or count

predictors

predictor variables (independent variables) for the model

cluster_id

variable identifying clusters (e.g., patient ID, dog ID). Observations with the same ID are treated as correlated.

time_var

for longitudinal data, specifies time ordering within clusters. Used with AR(1) correlation structure.

family

distribution family and link function for the outcome variable. Gaussian for continuous, Binomial for binary, Poisson for counts.

corstr

working correlation structure within clusters:
  • Exchangeable: constant correlation between all pairs (recommended starting point)

  • AR(1): correlation decays with time lag (for longitudinal data)

  • Unstructured: estimates all pairwise correlations (requires many observations)

  • Independence: no correlation (equivalent to GLM)
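
To make the difference between these structures concrete, here is a minimal base R sketch that builds the working correlation matrix for one cluster. The helper `working_corr()` and its structure labels are hypothetical names used only for illustration; they are not part of `geemodel` and may not match the strings it accepts for `corstr`.

```r
# Hypothetical helper: working correlation matrix for a cluster of size n.
# alpha is the assumed within-cluster correlation parameter.
working_corr <- function(n, alpha,
                         structure = c("exchangeable", "ar1", "independence")) {
  structure <- match.arg(structure)
  # |j - k| for every pair of observations within the cluster
  lag <- abs(outer(seq_len(n), seq_len(n), "-"))
  switch(structure,
    exchangeable = ifelse(lag == 0, 1, alpha),  # same correlation for all pairs
    ar1          = alpha^lag,                   # correlation decays with lag
    independence = diag(n)                      # no within-cluster correlation
  )
}

R_exch <- working_corr(4, 0.5, "exchangeable")
R_ar1  <- working_corr(4, 0.5, "ar1")

print(R_exch)
print(R_ar1)
```

With four observations per cluster, the exchangeable matrix has the same off-diagonal value everywhere, while the AR(1) matrix shrinks toward zero as observations move further apart in time. An unstructured matrix would instead estimate each of the six off-diagonal pairs separately, which is why it needs more data.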

robust_se

sandwich estimator for standard errors (recommended). Provides valid inference even if correlation structure is misspecified.
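
As a rough illustration of what a sandwich estimator does, the base R sketch below computes cluster-robust standard errors for a linear model fitted under an independence working correlation. The simulated data and all variable names are assumptions made for this example; this is not the code `geemodel` runs internally, which may apply a different working correlation or small-sample corrections.

```r
set.seed(1)

# Simulated clustered data (illustration only): 30 clusters of 4 observations
n_clusters <- 30
m  <- 4
id <- rep(seq_len(n_clusters), each = m)
u  <- rep(rnorm(n_clusters), each = m)   # shared cluster effect induces correlation
x  <- rnorm(n_clusters * m)
y  <- 1 + 0.5 * x + u + rnorm(n_clusters * m)

# Point estimates under working independence (ordinary least squares)
X    <- cbind(1, x)
beta <- solve(crossprod(X), crossprod(X, y))
res  <- as.vector(y - X %*% beta)

# "Bread": inverse of X'X
bread <- solve(crossprod(X))

# "Meat": sum over clusters of (X_i' r_i)(X_i' r_i)'
scores <- rowsum(X * res, group = id)    # one row per cluster
meat   <- crossprod(scores)

V_robust  <- bread %*% meat %*% bread
se_robust <- sqrt(diag(V_robust))

# Model-based (naive) standard errors that ignore clustering, for comparison
V_naive  <- bread * sum(res^2) / (length(y) - ncol(X))
se_naive <- sqrt(diag(V_naive))

print(cbind(se_naive, se_robust))
```

The robust version groups the residual contributions by cluster before forming the variance, so it stays valid when observations within a cluster are correlated in a way the working structure does not capture.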

conf_level

confidence level for confidence intervals (default 95%)
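
For orientation, a confidence level is typically turned into an interval as sketched below, assuming a Wald-type interval based on the normal distribution. The estimate and standard error are made-up numbers, and the source does not state which interval type `geemodel` reports.

```r
estimate   <- 0.80   # hypothetical coefficient
std_error  <- 0.25   # hypothetical standard error
conf_level <- 0.95

# Two-sided critical value of the standard normal distribution
z  <- qnorm(1 - (1 - conf_level) / 2)

# Wald interval: estimate plus or minus z times the standard error
ci <- estimate + c(-1, 1) * z * std_error
print(ci)
```

Raising `conf_level` increases `z` and therefore widens the interval.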

qic

Quasi-likelihood under Independence Model Criterion. Lower QIC indicates better model fit. Use for comparing models.

posthoc

compare levels of categorical predictors using estimated marginal means

posthoc_adjust

multiple testing correction for post-hoc comparisons
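
The default `"holm"` adjustment can be previewed with base R's `p.adjust()`. The p-values below are invented for illustration, and the source does not say whether `geemodel` calls `p.adjust()` itself.

```r
# Made-up raw p-values from three pairwise comparisons
p_raw <- c(0.010, 0.020, 0.030)

p_holm <- p.adjust(p_raw, method = "holm")
p_bonf <- p.adjust(p_raw, method = "bonferroni")

print(p_holm)  # 0.03 0.04 0.04
print(p_bonf)  # 0.03 0.06 0.09
```

Holm multiplies the smallest p-value by the number of comparisons, the next smallest by one fewer, and so on, while keeping the adjusted values non-decreasing. It controls the same family-wise error rate as Bonferroni and is never more conservative.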

diagnostics

display model diagnostics, including number of clusters and observations per cluster

correlation_plot

visualize the estimated within-cluster correlation structure

Value

A results object containing:

results$instructions          an html
results$modelInfo             a table
results$coefficientsTable     a table
results$qicTable              a table
results$posthocTable          a table
results$diagnosticsTable      a table
results$correlationPlot       an image
results$methodologyNote       an html
results$interpretationNote    an html

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$modelInfo$asDF

as.data.frame(results$modelInfo)

Generalized Estimating Equations for analyzing correlated or clustered data. Handles repeated measures, longitudinal data, and multi-site studies using a marginal model approach. Essential for pathology studies with multiple samples per subject (e.g., bilateral organs, multiple biopsies per patient).

Key Features:
  • Handles binary, continuous, and count outcomes

  • Multiple correlation structures (exchangeable, AR-1, unstructured)

  • Robust standard errors (sandwich estimator)

  • Post-hoc pairwise comparisons

  • Model selection with QIC

Clinical Applications:
  • Multiple samples per patient

  • Bilateral organs (eyes, kidneys)

  • Repeated measures over time

  • Multi-site studies

  • Clustered randomized trials

# Example: Multiple liver samples per dog
geemodel(
  data = liver_data,
  outcome = 'diagnosis',
  predictors = c('sample_method', 'fibrosis_score'),
  cluster_id = 'dog_id',
  family = 'binomial',
  corstr = 'exchangeable',
  robust_se = TRUE
)