Coefficient Plots for Regression Models
Professional Forest Plots for Clinical Research
ClinicoPath Module
2025-07-13
Source: vignettes/clinicopath-descriptives-16-coefficient-plots.Rmd
Introduction
The coefplot function in ClinicoPath creates professional coefficient plots
(forest plots) for regression models. These visualizations are essential for
presenting regression results in clinical research, epidemiological studies,
and statistical reports.
Clinical Motivation
Coefficient plots are crucial in clinical research for:
- Effect Visualization: Clear presentation of treatment effects and confidence intervals
- Comparative Analysis: Comparing effect sizes across multiple predictors
- Publication Quality: Professional plots suitable for journals and presentations
- Model Interpretation: Understanding the relative importance of predictors
- Risk Communication: Presenting odds ratios, hazard ratios, and effect sizes to clinicians
- Meta-Analysis: Displaying pooled estimates and individual study effects
This function supports multiple regression types and provides extensive customization options for clinical publications.
Features Overview
Package Setup
Loading ClinicoPath and dplyr prints a long series of namespace messages, which do not stop the examples from running. In summary:
- ClinicoPath emits repeated "replacing previous import" warnings, where a function imported from one package replaces a same-named function from another (for example, 'pROC::roc' replaces 'cutpointr::roc', and 'caret::sensitivity' replaces 'mlr3measures::sensitivity').
- Registered S3 methods from ggfortify and ggplot2 are overwritten by 'useful' and 'ggpp'.
- dplyr masks filter and lag from 'package:stats', and intersect, setdiff, setequal, and union from 'package:base'.
Creating Test Data
Let’s create comprehensive datasets for different regression scenarios:
# Load the histopathology dataset
data("histopathology")
# Note: the simulated columns below use random draws, so their values change
# between runs unless set.seed() is called first
# Create additional variables for different model types
clinical_data <- histopathology %>%
  mutate(
    # Binary outcomes
    high_grade = factor(ifelse(Grade %in% c("3", "4"), "High", "Low")),
    lymph_node_positive = factor(ifelse(LymphNodeMetastasis == "Present", "Positive", "Negative")),
    # Count outcome (simulate node counts)
    positive_nodes = ifelse(LymphNodeMetastasis == "Present",
                            sample(1:5, nrow(histopathology), replace = TRUE), 0),
    total_nodes = positive_nodes + sample(5:15, nrow(histopathology), replace = TRUE),
    # Continuous biomarker
    ki67_score = rnorm(nrow(histopathology), 25, 15),
    # Treatment variable
    chemotherapy = factor(sample(c("Yes", "No"), nrow(histopathology),
                                 replace = TRUE, prob = c(0.4, 0.6))),
    # Numeric event indicator ("DOĞRU" is Turkish for TRUE, "YANLIŞ" for FALSE)
    death_numeric = ifelse(Death == "DOĞRU", 1, 0),
    # Continuous outcome (simulated survival time)
    survival_months = ifelse(death_numeric == 1,
                             rnorm(nrow(histopathology), 18, 8),
                             rnorm(nrow(histopathology), 48, 12))
  ) %>%
  # Ensure realistic ranges
  mutate(
    ki67_score = pmax(0, pmin(100, ki67_score)),
    survival_months = pmax(1, pmin(120, survival_months)),
    total_nodes = pmax(0, total_nodes)
  )
# Display data structure
cat("Clinical data structure:\n")
## Clinical data structure:
str(clinical_data)
## tibble [250 × 46] (S3: tbl_df/tbl/data.frame)
## $ ID : num [1:250] 1 2 3 4 5 6 7 8 9 10 ...
## $ Name : chr [1:250] "Tonisia" "Daniyah" "Naviana" "Daerion" ...
## $ Sex : chr [1:250] "Male" "Female" "Male" "Male" ...
## $ Age : num [1:250] 27 36 65 51 58 53 33 26 25 68 ...
## $ Race : chr [1:250] "White" "White" "White" "White" ...
## $ PreinvasiveComponent: chr [1:250] "Present" "Absent" "Absent" "Absent" ...
## $ LVI : chr [1:250] "Present" "Absent" "Absent" "Present" ...
## $ PNI : chr [1:250] "Absent" "Absent" "Absent" "Absent" ...
## $ LastFollowUpDate : chr [1:250] "2019.10.22 00:00:00" "2019.06.22 00:00:00" "2019.08.22 00:00:00" "2019.03.22 00:00:00" ...
## $ Death : chr [1:250] "YANLIŞ" "DOĞRU" "DOĞRU" "YANLIŞ" ...
## $ Group : chr [1:250] "Control" "Treatment" "Control" "Treatment" ...
## $ Grade : num [1:250] 2 2 1 3 2 2 1 2 3 3 ...
## $ TStage : num [1:250] 4 4 4 4 1 4 2 3 4 4 ...
## $ Anti-X-intensity : num [1:250] 3 2 2 3 3 3 2 2 1 2 ...
## $ Anti-Y-intensity : num [1:250] 1 1 2 3 3 2 2 2 1 3 ...
## $ LymphNodeMetastasis : chr [1:250] "Present" "Absent" "Absent" "Absent" ...
## $ Valid : chr [1:250] "YANLIŞ" "DOĞRU" "YANLIŞ" "DOĞRU" ...
## $ Smoker : chr [1:250] "YANLIŞ" "YANLIŞ" "DOĞRU" "YANLIŞ" ...
## $ Grade_Level : chr [1:250] "high" "low" "low" "high" ...
## $ SurgeryDate : chr [1:250] "2019.07.08 00:00:00" "2019.03.18 00:00:00" "2019.05.18 00:00:00" "2018.10.24 00:00:00" ...
## $ DeathTime : chr [1:250] "Within1Year" "Within1Year" "Within1Year" "Within1Year" ...
## $ int : chr [1:250] "2019-07-08 UTC--2019-10-22 UTC" "2019-03-18 UTC--2019-06-22 UTC" "2019-05-18 UTC--2019-08-22 UTC" "2018-10-24 UTC--2019-03-22 UTC" ...
## $ OverallTime : num [1:250] 3.5 3.1 3.1 4.9 3.3 9.3 6.3 9 5.8 9.9 ...
## $ Outcome : num [1:250] 0 1 1 0 0 0 1 1 1 0 ...
## $ Mortality5yr : chr [1:250] "Alive" "Dead" "Dead" "Alive" ...
## $ Rater 1 : num [1:250] 0 1 1 0 0 0 1 1 1 0 ...
## $ Rater 2 : num [1:250] 0 0 0 0 0 0 0 0 0 0 ...
## $ Rater 3 : num [1:250] 1 1 1 0 1 1 1 1 1 1 ...
## $ Rater A : num [1:250] 3 2 3 3 2 3 1 1 2 1 ...
## $ Rater B : num [1:250] 3 2 3 3 2 3 1 1 2 1 ...
## $ New Test : num [1:250] 0 0 0 0 0 0 1 0 0 0 ...
## $ Golden Standart : num [1:250] 0 0 0 0 0 0 0 0 0 0 ...
## $ MeasurementA : num [1:250] -1.63432 0.37071 0.01585 -1.23584 -0.00141 ...
## $ MeasurementB : num [1:250] 0.611 0.554 0.742 0.622 0.527 ...
## $ Disease Status : chr [1:250] "Ill" "Ill" "Healthy" "Ill" ...
## $ Measurement1 : num [1:250] 0.387 0.829 0.159 2.447 0.847 ...
## $ Measurement2 : num [1:250] 1.8654 0.5425 0.0701 2.4071 0.5564 ...
## $ Outcome2 : chr [1:250] "DOD" "DOOC" "AWD" "AWOD" ...
## $ high_grade : Factor w/ 2 levels "High","Low": 2 2 2 1 2 2 2 2 1 1 ...
## $ lymph_node_positive : Factor w/ 2 levels "Negative","Positive": 2 1 1 1 1 1 2 1 1 1 ...
## $ positive_nodes : num [1:250] 1 0 0 0 0 0 2 0 0 0 ...
## $ total_nodes : num [1:250] 13 6 11 11 9 7 16 13 13 9 ...
## $ ki67_score : num [1:250] 41.2 27.8 48.5 21.6 36.5 ...
## $ chemotherapy : Factor w/ 2 levels "No","Yes": 1 2 1 2 2 1 2 2 2 1 ...
## $ death_numeric : num [1:250] 0 1 1 0 0 0 1 1 1 0 ...
## $ survival_months : num [1:250] 55.5 10.9 16.9 40.5 39.1 ...
cat("\nFirst few rows:\n")
##
## First few rows:
ID | Name | Sex | Age | Race | PreinvasiveComponent | LVI | PNI | LastFollowUpDate | Death | Group | Grade | TStage | Anti-X-intensity | Anti-Y-intensity | LymphNodeMetastasis | Valid | Smoker | Grade_Level | SurgeryDate | DeathTime | int | OverallTime | Outcome | Mortality5yr | Rater 1 | Rater 2 | Rater 3 | Rater A | Rater B | New Test | Golden Standart | MeasurementA | MeasurementB | Disease Status | Measurement1 | Measurement2 | Outcome2 | high_grade | lymph_node_positive | positive_nodes | total_nodes | ki67_score | chemotherapy | death_numeric | survival_months |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Tonisia | Male | 27 | White | Present | Present | Absent | 2019.10.22 00:00:00 | YANLIŞ | Control | 2 | 4 | 3 | 1 | Present | YANLIŞ | YANLIŞ | high | 2019.07.08 00:00:00 | Within1Year | 2019-07-08 UTC–2019-10-22 UTC | 3.5 | 0 | Alive | 0 | 0 | 1 | 3 | 3 | 0 | 0 | -1.6343183 | 0.6114150 | Ill | 0.3866313 | 1.8653753 | DOD | Low | Positive | 1 | 13 | 41.15623 | No | 0 | 55.51065 |
2 | Daniyah | Female | 36 | White | Absent | Absent | Absent | 2019.06.22 00:00:00 | DOĞRU | Treatment | 2 | 4 | 2 | 1 | Absent | DOĞRU | YANLIŞ | low | 2019.03.18 00:00:00 | Within1Year | 2019-03-18 UTC–2019-06-22 UTC | 3.1 | 1 | Dead | 1 | 0 | 1 | 2 | 2 | 0 | 0 | 0.3707060 | 0.5543858 | Ill | 0.8293803 | 0.5424802 | DOOC | Low | Negative | 0 | 6 | 27.79725 | Yes | 1 | 10.93745 |
3 | Naviana | Male | 65 | White | Absent | Absent | Absent | 2019.08.22 00:00:00 | DOĞRU | Control | 1 | 4 | 2 | 2 | Absent | YANLIŞ | DOĞRU | low | 2019.05.18 00:00:00 | Within1Year | 2019-05-18 UTC–2019-08-22 UTC | 3.1 | 1 | Dead | 1 | 0 | 1 | 3 | 3 | 0 | 0 | 0.0158538 | 0.7423889 | Healthy | 0.1587530 | 0.0700830 | AWD | Low | Negative | 0 | 11 | 48.53737 | No | 1 | 16.93377 |
4 | Daerion | Male | 51 | White | Absent | Present | Absent | 2019.03.22 00:00:00 | YANLIŞ | Treatment | 3 | 4 | 3 | 3 | Absent | DOĞRU | YANLIŞ | high | 2018.10.24 00:00:00 | Within1Year | 2018-10-24 UTC–2019-03-22 UTC | 4.9 | 0 | Alive | 0 | 0 | 0 | 3 | 3 | 0 | 0 | -1.2358443 | 0.6218427 | Ill | 2.4473541 | 2.4071337 | AWOD | High | Negative | 0 | 11 | 21.59864 | Yes | 0 | 40.49713 |
5 | Tamyiah | Female | 58 | Black | Absent | Absent | Absent | 2019.04.22 00:00:00 | YANLIŞ | Treatment | 2 | 1 | 3 | 3 | Absent | DOĞRU | DOĞRU | low | 2019.01.13 00:00:00 | Within1Year | 2019-01-13 UTC–2019-04-22 UTC | 3.3 | 0 | Alive | 0 | 0 | 1 | 2 | 2 | 0 | 0 | -0.0014086 | 0.5267144 | Healthy | 0.8467223 | 0.5564131 | DOD | Low | Negative | 0 | 9 | 36.49542 | Yes | 0 | 39.09133 |
6 | Donnajo | Female | 53 | White | Absent | Present | Present | 2018.12.22 00:00:00 | YANLIŞ | Treatment | 2 | 4 | 3 | 2 | Absent | DOĞRU | YANLIŞ | moderate | 2018.03.14 00:00:00 | Within1Year | 2018-03-14 UTC–2018-12-22 UTC | 9.3 | 0 | Alive | 0 | 0 | 1 | 3 | 3 | 0 | 0 | -0.9754025 | 0.6602662 | Ill | 1.7021467 | 0.9286315 | DOOC | Low | Negative | 0 | 7 | 20.31830 | No | 0 | 67.95063 |
Linear Regression Coefficient Plots
Example 1: Basic Linear Regression
Analyze predictors of a continuous biomarker (Ki-67 score):
# Basic linear regression coefficient plot
coefplot(
data = clinical_data,
dep = "ki67_score",
covs = c("Age", "TStage", "Grade"),
model_type = "linear",
show_coefficient_plot = TRUE,
show_model_summary = TRUE,
show_coefficient_table = FALSE
)
Clinical Interpretation
Coefficient Interpretation:
- Positive coefficients: Higher predictor values are associated with higher Ki-67 scores
- Negative coefficients: Higher predictor values are associated with lower Ki-67 scores
- Confidence intervals: A 95% CI that crosses zero indicates a non-significant effect
- Effect magnitude: Larger absolute coefficients indicate stronger effects per unit of the predictor, so compare magnitudes only across predictors measured on the same scale
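The interpretation rules above can be checked numerically. The following is a minimal base R sketch on simulated data (the variables `age`, `noise`, and `ki67` are hypothetical and are not drawn from the histopathology dataset); it computes the estimates and 95% intervals that a linear coefficient plot would display and flags which intervals exclude zero.

```r
# Simulated data (hypothetical, not the histopathology dataset)
set.seed(1)
n <- 200
sim <- data.frame(
  age   = rnorm(n, mean = 55, sd = 10),
  noise = rnorm(n)                      # unrelated to the outcome
)
sim$ki67 <- 10 + 0.5 * sim$age + rnorm(n, sd = 5)

fit <- lm(ki67 ~ age + noise, data = sim)

# Estimates with 95% confidence intervals, intercept dropped
ci <- confint(fit, level = 0.95)
coef_table <- data.frame(
  term     = names(coef(fit)),
  estimate = unname(coef(fit)),
  lower    = ci[, 1],
  upper    = ci[, 2]
)[-1, ]

# An interval that excludes zero corresponds to p < 0.05 for that term
coef_table$excludes_zero <- coef_table$lower > 0 | coef_table$upper < 0
print(coef_table)
```

Each row of `coef_table` corresponds to one point and one horizontal line in the plot.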
Example 2: Comprehensive Linear Model
Analyze multiple predictors with customization:
# Comprehensive linear regression analysis
coefplot(
data = clinical_data,
dep = "survival_months",
covs = c("Age", "TStage", "Grade", "LVI", "PNI", "ki67_score"),
model_type = "linear",
ci_level = 0.95,
inner_ci_level = 0.8,
include_intercept = FALSE,
custom_title = "Predictors of Survival Time",
custom_x_label = "Effect on Survival (months)",
show_coefficient_plot = TRUE,
show_model_summary = TRUE,
show_coefficient_table = TRUE
)
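The `ci_level` and `inner_ci_level` arguments above ask for two interval widths for each coefficient. How coefplot draws them is not shown here, but the intervals themselves can be sketched in base R on simulated data (the `stage` and `months` variables are hypothetical): the 80% interval is always nested inside the 95% interval.

```r
# Simulated data (hypothetical)
set.seed(2)
n <- 150
sim <- data.frame(stage = sample(1:4, n, replace = TRUE))
sim$months <- 60 - 6 * sim$stage + rnorm(n, sd = 10)

fit <- lm(months ~ stage, data = sim)

# Outer (95%) and inner (80%) intervals for the same coefficient
outer_ci <- confint(fit, "stage", level = 0.95)
inner_ci <- confint(fit, "stage", level = 0.80)

intervals <- data.frame(
  level = c(0.95, 0.80),
  lower = c(outer_ci[1, 1], inner_ci[1, 1]),
  upper = c(outer_ci[1, 2], inner_ci[1, 2])
)
intervals$width <- intervals$upper - intervals$lower
print(intervals)
```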
Logistic Regression (Odds Ratios)
Example 3: Binary Outcome Analysis
Analyze predictors of high-grade tumors:
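As a rough illustration of what a logistic coefficient plot displays, the sketch below fits a binary-outcome model with base R's `glm()` on simulated data. The variable names and effect sizes are hypothetical, and this is not the ClinicoPath call itself. Note that `glm()` treats the first factor level as the reference outcome, so the levels are set explicitly here; how coefplot orders outcome levels is not shown in this vignette.

```r
# Simulated data (hypothetical)
set.seed(3)
n <- 300
sim <- data.frame(
  stage = sample(1:4, n, replace = TRUE),
  lvi   = factor(sample(c("Absent", "Present"), n, replace = TRUE))
)
# True log-odds rise with stage and with LVI present
lp <- -2 + 0.6 * sim$stage + 0.8 * (sim$lvi == "Present")
# Put the reference outcome ("Low") first so the model describes "High"
sim$high_grade <- factor(ifelse(runif(n) < plogis(lp), "High", "Low"),
                         levels = c("Low", "High"))

fit <- glm(high_grade ~ stage + lvi, data = sim, family = binomial)

# Wald intervals on the log-odds scale, exponentiated to odds ratios
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))[-1, ]
print(round(or_table, 2))
```

An odds ratio above 1 means the odds of a high-grade tumor increase with the predictor; an interval that includes 1 is compatible with no association.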
Example 4: Lymph Node Involvement
Predict lymph node positivity:
# Predictors of lymph node involvement
coefplot(
data = clinical_data,
dep = "lymph_node_positive",
covs = c("Age", "TStage", "Grade", "LVI", "ki67_score", "chemotherapy"),
model_type = "logistic",
custom_title = "Predictors of Lymph Node Involvement",
custom_x_label = "Odds Ratio",
point_size = 4,
line_thickness = 1.5,
show_coefficient_plot = TRUE,
show_coefficient_table = TRUE
)
Clinical Applications
This analysis helps identify:
- High-risk patients: Those likely to have node-positive disease
- Treatment planning: Inform decisions about lymph node dissection
- Prognostic factors: Understand disease biology and progression
- Risk stratification: Personalize treatment approaches
Cox Regression (Hazard Ratios)
Example 5: Survival Analysis
Analyze predictors of mortality risk:
# Cox proportional hazards model
coefplot(
data = clinical_data,
dep = "Death",
time_var = "OverallTime",
covs = c("Age", "TStage", "Grade", "LVI", "PNI"),
model_type = "cox",
ci_level = 0.95,
custom_title = "Mortality Risk Factors",
custom_x_label = "Hazard Ratio",
show_coefficient_plot = TRUE,
show_model_summary = TRUE,
show_coefficient_table = TRUE
)
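To make the hazard ratio scale concrete, here is a minimal sketch using the survival package directly on simulated data (the `lvi`, `time`, and `status` variables are hypothetical, and the event indicator is coded 1 for an observed death). It is not a description of how coefplot fits the model internally.

```r
library(survival)

# Simulated data (hypothetical)
set.seed(4)
n <- 300
sim <- data.frame(lvi = rbinom(n, 1, 0.5))
# Exponential event times; LVI multiplies the hazard by exp(0.7), about 2
event_time  <- rexp(n, rate = 0.05 * exp(0.7 * sim$lvi))
censor_time <- runif(n, 0, 40)
sim$time   <- pmin(event_time, censor_time)
sim$status <- as.integer(event_time <= censor_time)  # 1 = death observed

fit <- coxph(Surv(time, status) ~ lvi, data = sim)

# Exponentiate the log-hazard coefficient and its interval
hr_table <- exp(cbind(HR = coef(fit), confint(fit)))
print(round(hr_table, 2))
```

A hazard ratio above 1 indicates a higher instantaneous risk of the event for patients with the predictor present.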
Example 6: Treatment Effect Analysis
Evaluate treatment efficacy:
# Treatment effect on survival
coefplot(
data = clinical_data,
dep = "Death",
time_var = "OverallTime",
covs = c("chemotherapy", "TStage", "Grade", "Age"),
model_type = "cox",
coef_selection = "specific",
specific_coefs = "chemotherapy, TStage, Grade",
custom_title = "Treatment Effect on Survival",
show_coefficient_plot = TRUE,
show_model_summary = TRUE
)
Poisson Regression (Rate Ratios)
Example 7: Count Outcome Analysis
Analyze predictors of total lymph node count:
# Poisson regression for count data
coefplot(
data = clinical_data,
dep = "total_nodes",
covs = c("Age", "TStage", "Grade", "LVI"),
model_type = "poisson",
custom_title = "Predictors of Total Node Count",
custom_x_label = "Rate Ratio",
show_coefficient_plot = TRUE,
show_model_summary = TRUE,
show_coefficient_table = TRUE
)
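The rate ratios shown for a Poisson model are exponentiated coefficients. A minimal base R sketch on simulated counts (hypothetical `stage` and `nodes` variables, with a true rate ratio of exp(0.2) per stage) shows the calculation:

```r
# Simulated data (hypothetical)
set.seed(5)
n <- 250
sim <- data.frame(stage = sample(1:4, n, replace = TRUE))
sim$nodes <- rpois(n, lambda = exp(1.5 + 0.2 * sim$stage))

fit <- glm(nodes ~ stage, data = sim, family = poisson)

# Rate ratio per one-unit increase in stage, with a Wald 95% interval
rr_table <- exp(cbind(RR = coef(fit), confint.default(fit)))[-1, , drop = FALSE]
print(round(rr_table, 3))
```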
Advanced Customization
Example 8: Publication-Ready Plot
Create a professional plot for publication:
# High-quality publication plot
coefplot(
data = clinical_data,
dep = "high_grade",
covs = c("Age", "TStage", "Grade", "LVI", "PNI", "ki67_score"),
model_type = "logistic",
ci_level = 0.95,
inner_ci_level = 0.8,
include_intercept = FALSE,
point_size = 3,
line_thickness = 1.2,
custom_title = "Risk Factors for High-Grade Tumors",
custom_x_label = "Odds Ratio (95% CI)",
show_coefficient_plot = TRUE,
show_model_summary = FALSE,
show_coefficient_table = FALSE
)
Example 9: Coefficient Selection
Focus on specific predictors of interest:
# Select specific coefficients to display
coefplot(
data = clinical_data,
dep = "lymph_node_positive",
covs = c("Age", "TStage", "Grade", "LVI", "PNI", "ki67_score", "chemotherapy"),
model_type = "logistic",
coef_selection = "specific",
specific_coefs = "TStage, Grade, LVI, PNI",
custom_title = "Key Pathological Predictors",
show_coefficient_plot = TRUE,
show_coefficient_table = TRUE
)
Example 10: Excluding Coefficients
Remove confounding or adjustment variables from display:
# Exclude specific coefficients from plot
coefplot(
data = clinical_data,
dep = "high_grade",
covs = c("Age", "TStage", "Grade", "LVI", "PNI", "ki67_score"),
model_type = "logistic",
coef_selection = "exclude",
specific_coefs = "Age", # Remove age (adjustment variable)
custom_title = "Disease-Specific Risk Factors",
show_coefficient_plot = TRUE,
show_model_summary = TRUE
)
Model Comparison and Selection
Example 12: Effect Size Visualization
Emphasize effect magnitudes:
# Focus on effect sizes with larger points
coefplot(
data = clinical_data,
dep = "survival_months",
covs = c("TStage", "Grade", "LVI", "PNI", "ki67_score"),
model_type = "linear",
point_size = 5,
line_thickness = 2,
custom_title = "Effect Sizes on Survival Time",
custom_x_label = "Effect Size (months)",
show_coefficient_plot = TRUE,
show_coefficient_table = TRUE
)
Clinical Interpretation Guidelines
Understanding Coefficient Plots
cat("Coefficient Plot Interpretation Guide:\n\n")
## Coefficient Plot Interpretation Guide:
interpretation_guide <- data.frame(
  Model_Type = c("Linear", "Logistic", "Cox", "Poisson"),
  Effect_Measure = c("Coefficient", "Odds Ratio", "Hazard Ratio", "Rate Ratio"),
  Null_Value = c("0", "1", "1", "1"),
  Interpretation = c(
    "Change in outcome per one-unit increase in the predictor",
    "Multiplicative change in odds per one-unit increase",
    "Multiplicative change in hazard per one-unit increase",
    "Multiplicative change in rate per one-unit increase"
  ),
  Clinical_Significance = c(
    "Direct effect size",
    "Risk factor strength",
    "Survival impact",
    "Count/rate impact"
  )
)
kable(interpretation_guide, caption = "Clinical Interpretation of Coefficient Plots")
Model_Type | Effect_Measure | Null_Value | Interpretation | Clinical_Significance |
---|---|---|---|---|
Linear | Coefficient | 0 | Change in outcome per one-unit increase in the predictor | Direct effect size |
Logistic | Odds Ratio | 1 | Multiplicative change in odds per one-unit increase | Risk factor strength |
Cox | Hazard Ratio | 1 | Multiplicative change in hazard per one-unit increase | Survival impact |
Poisson | Rate Ratio | 1 | Multiplicative change in rate per one-unit increase | Count/rate impact |
Statistical Significance Assessment
cat("Statistical Significance Guidelines:\n\n")
## Statistical Significance Guidelines:
cat("🔴 SIGNIFICANT EFFECTS:\n")
## 🔴 SIGNIFICANT EFFECTS:
cat(" • Confidence intervals do not cross null value\n")
## • Confidence intervals do not cross null value
cat(" • Evidence of an association at the chosen confidence level\n")
## • Evidence of an association at the chosen confidence level
cat(" • Consider clinical significance alongside statistical significance\n\n")
## • Consider clinical significance alongside statistical significance
cat("🟡 BORDERLINE EFFECTS:\n")
## 🟡 BORDERLINE EFFECTS:
cat(" • Confidence intervals barely cross null value\n")
## • Confidence intervals barely cross null value
cat(" • May warrant further investigation\n")
## • May warrant further investigation
cat(" • Consider sample size and power\n\n")
## • Consider sample size and power
cat("🟢 NON-SIGNIFICANT EFFECTS:\n")
## 🟢 NON-SIGNIFICANT EFFECTS:
cat(" • Confidence intervals clearly include null value\n")
## • Confidence intervals clearly include null value
cat(" • Insufficient evidence for association\n")
## • Insufficient evidence for association
cat(" • May still have clinical relevance if effect size is meaningful\n\n")
## • May still have clinical relevance if effect size is meaningful
cat("📊 EFFECT SIZE CONSIDERATIONS:\n")
## 📊 EFFECT SIZE CONSIDERATIONS:
cat(" • Large effect with wide CI: Potentially important but uncertain\n")
## • Large effect with wide CI: Potentially important but uncertain
cat(" • Small effect with narrow CI: Precise but may not be clinically meaningful\n")
## • Small effect with narrow CI: Precise but may not be clinically meaningful
cat(" • Consider clinical context and domain expertise\n")
## • Consider clinical context and domain expertise
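The significance rule above depends on the null value for each model type (0 for linear coefficients, 1 for ratio measures). It could be written as a small helper; the function name `ci_excludes_null` and the example intervals are hypothetical and only illustrate the rule.

```r
# Null value on the plotted scale for each model type
null_value <- c(linear = 0, logistic = 1, cox = 1, poisson = 1)

# TRUE when the interval lies entirely on one side of the null value
ci_excludes_null <- function(lower, upper, model_type) {
  null <- null_value[[model_type]]
  lower > null | upper < null
}

# Hypothetical intervals, for illustration only
sig_logistic <- ci_excludes_null(lower = 1.10, upper = 2.40, model_type = "logistic")
sig_cox      <- ci_excludes_null(lower = 0.85, upper = 1.60, model_type = "cox")
sig_linear   <- ci_excludes_null(lower = -0.3, upper = 0.9,  model_type = "linear")
print(c(logistic = sig_logistic, cox = sig_cox, linear = sig_linear))
```

The helper only formalizes the visual check against the reference line; it says nothing about whether an effect is clinically meaningful.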
Best Practices and Guidelines
Model Selection Guidelines
cat("Model Type Selection Guidelines:\n\n")
## Model Type Selection Guidelines:
model_selection_guide <- data.frame(
Outcome_Type = c("Continuous", "Binary", "Time-to-Event", "Count/Rate"),
Model_Choice = c("Linear", "Logistic", "Cox", "Poisson"),
Example_Outcomes = c(
"Biomarker levels, scores, measurements",
"Disease presence, treatment response",
"Survival time, time to recurrence",
"Number of events, lesion counts"
),
Key_Assumptions = c(
"Linearity, normality, homoscedasticity",
"Logit linearity, independence",
"Proportional hazards, independent censoring",
"Mean equals variance, independence"
)
)
kable(model_selection_guide, caption = "Choosing the Right Regression Model")
Outcome_Type | Model_Choice | Example_Outcomes | Key_Assumptions |
---|---|---|---|
Continuous | Linear | Biomarker levels, scores, measurements | Linearity, normality, homoscedasticity |
Binary | Logistic | Disease presence, treatment response | Logit linearity, independence |
Time-to-Event | Cox | Survival time, time to recurrence | Proportional hazards, independent censoring |
Count/Rate | Poisson | Number of events, lesion counts | Mean equals variance, independence |
Publication Standards
cat("Publication-Ready Coefficient Plots:\n\n")
## Publication-Ready Coefficient Plots:
cat("✓ ESSENTIAL ELEMENTS:\n")
## ✓ ESSENTIAL ELEMENTS:
cat(" • Clear, informative title\n")
## • Clear, informative title
cat(" • Appropriate axis labels with units\n")
## • Appropriate axis labels with units
cat(" • Confidence intervals displayed\n")
## • Confidence intervals displayed
cat(" • Reference line at null value\n")
## • Reference line at null value
cat(" • Legend explaining symbols and intervals\n\n")
## • Legend explaining symbols and intervals
cat("✓ REPORTING REQUIREMENTS:\n")
## ✓ REPORTING REQUIREMENTS:
cat(" • Sample size and missing data handling\n")
## • Sample size and missing data handling
cat(" • Model assumptions and diagnostics\n")
## • Model assumptions and diagnostics
cat(" • Confidence interval levels\n")
## • Confidence interval levels
cat(" • Adjustment variables included\n")
## • Adjustment variables included
cat(" • Software and package versions\n\n")
## • Software and package versions
cat("✓ VISUAL QUALITY:\n")
## ✓ VISUAL QUALITY:
cat(" • High resolution for print (≥300 DPI)\n")
## • High resolution for print (≥300 DPI)
cat(" • Readable font sizes\n")
## • Readable font sizes
cat(" • Clear contrast and colors\n")
## • Clear contrast and colors
cat(" • Consistent formatting across figures\n")
## • Consistent formatting across figures
Clinical Case Studies
Case Study 1: Biomarker Validation
Validate a new prognostic biomarker:
# Biomarker validation study
coefplot(
data = clinical_data,
dep = "Death",
time_var = "OverallTime",
covs = c("ki67_score", "TStage", "Grade", "Age"),
model_type = "cox",
custom_title = "Ki-67 as Prognostic Biomarker",
custom_x_label = "Hazard Ratio (95% CI)",
show_coefficient_plot = TRUE,
show_model_summary = TRUE,
show_coefficient_table = TRUE
)
Case Study 2: Treatment Decision Making
Inform treatment selection:
# Treatment decision support
coefplot(
data = clinical_data,
dep = "lymph_node_positive",
covs = c("TStage", "Grade", "LVI", "PNI", "ki67_score"),
model_type = "logistic",
custom_title = "Pre-operative Prediction of Node Involvement",
custom_x_label = "Odds Ratio",
show_coefficient_plot = TRUE,
show_coefficient_table = TRUE
)
Case Study 3: Risk Prediction Model
Develop clinical prediction tool:
# Comprehensive risk prediction model
coefplot(
data = clinical_data,
dep = "high_grade",
covs = c("Age", "TStage", "LVI", "PNI", "ki67_score"),
model_type = "logistic",
include_intercept = TRUE, # Needed for prediction
custom_title = "High-Grade Tumor Prediction Model",
show_coefficient_plot = TRUE,
show_model_summary = TRUE,
show_coefficient_table = TRUE
)
Model Development
This analysis supports:
- Risk calculator development: Convert coefficients to prediction tool
- Validation planning: Design external validation studies
- Clinical implementation: Integration into electronic health records
- Performance assessment: Discrimination and calibration evaluation
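Converting coefficients into a risk calculator requires the intercept, which is why the model above keeps it. A minimal base R sketch on simulated data follows; the `predict_risk` helper, the variable names, and the patient values are all hypothetical. It applies the inverse logit to the linear predictor and compares the result with `predict()`.

```r
# Simulated data (hypothetical)
set.seed(6)
n <- 300
sim <- data.frame(
  age   = rnorm(n, mean = 60, sd = 10),
  stage = sample(1:4, n, replace = TRUE)
)
sim$event <- rbinom(n, 1, plogis(-6 + 0.05 * sim$age + 0.7 * sim$stage))

fit <- glm(event ~ age + stage, data = sim, family = binomial)

# Hypothetical risk calculator: linear predictor -> probability
predict_risk <- function(coefs, age, stage) {
  lp <- coefs[["(Intercept)"]] + coefs[["age"]] * age + coefs[["stage"]] * stage
  plogis(lp)
}

manual  <- predict_risk(coef(fit), age = 65, stage = 3)
builtin <- unname(predict(fit, newdata = data.frame(age = 65, stage = 3),
                          type = "response"))
print(c(manual = manual, builtin = builtin))
```

The two values agree, which confirms that the calculator reproduces the fitted model; external validation is still needed before any clinical use.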
Advanced Statistical Considerations
Model Diagnostics and Assumptions
cat("Model Diagnostic Considerations:\n\n")
## Model Diagnostic Considerations:
cat("🔍 LINEAR REGRESSION:\n")
## 🔍 LINEAR REGRESSION:
cat(" • Linearity: Scatter plots of predictors vs. outcome\n")
## • Linearity: Scatter plots of predictors vs. outcome
cat(" • Residual plots: Check for patterns and outliers\n")
## • Residual plots: Check for patterns and outliers
cat(" • Normality: Q-Q plots of residuals\n")
## • Normality: Q-Q plots of residuals
cat(" • Multicollinearity: Variance inflation factors\n\n")
## • Multicollinearity: Variance inflation factors
cat("🔍 LOGISTIC REGRESSION:\n")
## 🔍 LOGISTIC REGRESSION:
cat(" • Linearity in logit: Smooth terms or polynomial checks\n")
## • Linearity in logit: Smooth terms or polynomial checks
cat(" • Outliers: Leverage and influence diagnostics\n")
## • Outliers: Leverage and influence diagnostics
cat(" • Goodness of fit: Hosmer-Lemeshow test\n")
## • Goodness of fit: Hosmer-Lemeshow test
cat(" • Calibration: Calibration plots\n\n")
## • Calibration: Calibration plots
cat("🔍 COX REGRESSION:\n")
## 🔍 COX REGRESSION:
cat(" • Proportional hazards: Schoenfeld residuals\n")
## • Proportional hazards: Schoenfeld residuals
cat(" • Linearity: Martingale residuals\n")
## • Linearity: Martingale residuals
cat(" • Outliers: Deviance residuals\n")
## • Outliers: Deviance residuals
cat(" • Time-varying effects: Test for interactions with time\n\n")
## • Time-varying effects: Test for interactions with time
cat("🔍 POISSON REGRESSION:\n")
## 🔍 POISSON REGRESSION:
cat(" • Overdispersion: Compare variance to mean\n")
## • Overdispersion: Compare variance to mean
cat(" • Zero inflation: Assess excess zeros\n")
## • Zero inflation: Assess excess zeros
cat(" • Linearity: Residual plots\n")
## • Linearity: Residual plots
cat(" • Independence: Autocorrelation checks\n")
## • Independence: Autocorrelation checks
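One common way to compare variance to mean in a fitted Poisson model is the Pearson dispersion statistic, which should be close to 1 when the Poisson assumption holds. The sketch below uses simulated, deliberately overdispersed counts (all names are hypothetical):

```r
# Simulated data (hypothetical): negative binomial counts, so the variance
# exceeds the mean and the Poisson assumption is violated
set.seed(7)
n <- 300
x <- rnorm(n)
y <- rnbinom(n, mu = exp(1 + 0.3 * x), size = 1)

fit <- glm(y ~ x, family = poisson)

# Pearson chi-square divided by residual degrees of freedom
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
print(dispersion)
```

Values well above 1 suggest the Poisson intervals in the coefficient plot are too narrow, and a quasi-Poisson or negative binomial model may be more appropriate.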
Effect Size and Clinical Significance
cat("Effect Size Interpretation:\n\n")
## Effect Size Interpretation:
effect_size_guide <- data.frame(
Effect_Measure = c("Cohen's d", "Odds Ratio", "Hazard Ratio", "Rate Ratio"),
Small_Effect = c("0.2", "1.2-1.5", "1.2-1.5", "1.2-1.5"),
Medium_Effect = c("0.5", "2.0-3.0", "2.0-3.0", "2.0-3.0"),
Large_Effect = c("0.8", ">4.0", ">4.0", ">4.0"),
Clinical_Relevance = c(
"Standardized mean difference",
"Multiplicative risk increase",
"Proportional hazard increase",
"Rate multiplication factor"
)
)
kable(effect_size_guide, caption = "Effect Size Interpretation Guidelines")
Effect_Measure | Small_Effect | Medium_Effect | Large_Effect | Clinical_Relevance |
---|---|---|---|---|
Cohen’s d | 0.2 | 0.5 | 0.8 | Standardized mean difference |
Odds Ratio | 1.2-1.5 | 2.0-3.0 | >4.0 | Multiplicative risk increase |
Hazard Ratio | 1.2-1.5 | 2.0-3.0 | >4.0 | Proportional hazard increase |
Rate Ratio | 1.2-1.5 | 2.0-3.0 | >4.0 | Rate multiplication factor |
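For reference, Cohen's d in the table above is the difference in group means divided by the pooled standard deviation. A minimal sketch follows, with a hypothetical helper name and made-up measurements:

```r
# Cohen's d: standardized mean difference using the pooled standard deviation
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Hypothetical biomarker values in two groups
treated <- c(52, 48, 55, 60, 47, 58)
control <- c(45, 50, 43, 49, 44, 51)
d <- cohens_d(treated, control)
print(d)
```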
Sample Size and Power Considerations
cat("Sample Size Considerations for Coefficient Plots:\n\n")
## Sample Size Considerations for Coefficient Plots:
cat("📊 GENERAL GUIDELINES:\n")
## 📊 GENERAL GUIDELINES:
cat(" • Minimum 10-15 events per predictor (EPV rule)\n")
## • Minimum 10-15 events per predictor (EPV rule)
cat(" • Larger samples for stable coefficient estimates\n")
## • Larger samples for stable coefficient estimates
cat(" • Consider effect size and desired precision\n")
## • Consider effect size and desired precision
cat(" • Account for missing data and exclusions\n\n")
## • Account for missing data and exclusions
cat("📊 MODEL-SPECIFIC REQUIREMENTS:\n")
## 📊 MODEL-SPECIFIC REQUIREMENTS:
cat(" • Linear: n ≥ 100 + 10×predictors for stable estimates\n")
## • Linear: n ≥ 100 + 10×predictors for stable estimates
cat(" • Logistic: ≥10 events per predictor minimum\n")
## • Logistic: ≥10 events per predictor minimum
cat(" • Cox: ≥10 events per predictor minimum\n")
## • Cox: ≥10 events per predictor minimum
cat(" • Poisson: Consider event rate and zero inflation\n\n")
## • Poisson: Consider event rate and zero inflation
cat("📊 PRECISION CONSIDERATIONS:\n")
## 📊 PRECISION CONSIDERATIONS:
cat(" • Narrow CIs require larger samples\n")
## • Narrow CIs require larger samples
cat(" • Rare outcomes need larger samples\n")
## • Rare outcomes need larger samples
cat(" • Multiple comparisons reduce effective power\n")
## • Multiple comparisons reduce effective power
cat(" • Subgroup analyses require additional power\n")
## • Subgroup analyses require additional power
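The events-per-predictor guideline above translates into a quick upper bound on model size. The sketch below assumes a binary outcome, where the limiting count is the rarer class; the helper name and the outcome counts are hypothetical.

```r
# Largest number of predictors supported by an events-per-variable (EPV) rule
max_predictors <- function(outcome, epv = 10) {
  counts <- table(outcome)
  # For a binary outcome the limiting count is the rarer class
  events <- min(counts)
  floor(events / epv)
}

# Hypothetical outcome: 62 deaths among 250 patients
outcome <- c(rep("Dead", 62), rep("Alive", 188))
limit_10 <- max_predictors(outcome)            # 62 / 10 -> 6
limit_15 <- max_predictors(outcome, epv = 15)  # 62 / 15 -> 4
print(c(epv10 = limit_10, epv15 = limit_15))
```

Factor predictors with several levels contribute one coefficient per non-reference level, so they count more than once against this limit.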
Output Integration and Reporting
Comprehensive Analysis Workflow
# Complete analysis workflow example
clinical_analysis <- function(data, outcome_var, predictors, model_type) {
  # Step 1: Descriptive analysis
  cat("=== DESCRIPTIVE ANALYSIS ===\n")
  # (Add descriptive statistics here)
  # Step 2: Coefficient plot
  cat("\n=== COEFFICIENT PLOT ===\n")
  # Results are not auto-printed inside a function, so store and print them
  results <- coefplot(
    data = data,
    dep = outcome_var,
    covs = predictors,
    model_type = model_type,
    show_coefficient_plot = TRUE,
    show_model_summary = TRUE,
    show_coefficient_table = TRUE
  )
  print(results)
  # Step 3: Model diagnostics
  cat("\n=== MODEL DIAGNOSTICS ===\n")
  # (Add diagnostic plots and tests here)
  # Step 4: Clinical interpretation
  cat("\n=== CLINICAL INTERPRETATION ===\n")
  # (Add interpretation guidelines here)
  # Return the results invisibly so the caller can reuse them
  invisible(results)
}
# Example usage
# clinical_analysis(clinical_data, "high_grade",
# c("Age", "TStage", "Grade", "LVI"), "logistic")
Integration with Other ClinicoPath Functions
cat("Integration with ClinicoPath Workflow:\n\n")
## Integration with ClinicoPath Workflow:
cat("🔄 TYPICAL ANALYSIS SEQUENCE:\n")
## 🔄 TYPICAL ANALYSIS SEQUENCE:
cat(" 1. tableone() - Descriptive statistics by groups\n")
## 1. tableone() - Descriptive statistics by groups
cat(" 2. summarydata() - Overall data summary\n")
## 2. summarydata() - Overall data summary
cat(" 3. coefplot() - Regression coefficient visualization\n")
## 3. coefplot() - Regression coefficient visualization
cat(" 4. survival() - Survival analysis (if applicable)\n")
## 4. survival() - Survival analysis (if applicable)
cat(" 5. roc() - Diagnostic performance (if applicable)\n\n")
## 5. roc() - Diagnostic performance (if applicable)
cat("🔄 COMPLEMENTARY FUNCTIONS:\n")
## 🔄 COMPLEMENTARY FUNCTIONS:
cat(" • crosstable() - Univariate associations\n")
## • crosstable() - Univariate associations
cat(" • correlation() - Predictor relationships\n")
## • correlation() - Predictor relationships
cat(" • nomogram() - Prediction model visualization\n")
## • nomogram() - Prediction model visualization
cat(" • forest plots from meta-analysis functions\n\n")
## • forest plots from meta-analysis functions
cat("🔄 QUALITY ASSURANCE:\n")
## 🔄 QUALITY ASSURANCE:
cat(" • checkdata() - Data quality assessment\n")
## • checkdata() - Data quality assessment
cat(" • outlierdetection() - Identify unusual observations\n")
## • outlierdetection() - Identify unusual observations
cat(" • missingdata() - Handle missing data patterns\n")
## • missingdata() - Handle missing data patterns
Troubleshooting and Common Issues
Error Prevention and Solutions
cat("Common Issues and Solutions:\n\n")
## Common Issues and Solutions:
cat("❌ ERROR: 'Package not found'\n")
## ❌ ERROR: 'Package not found'
cat(" SOLUTION: Install required packages\n")
## SOLUTION: Install required packages
cat(" install.packages(c('coefplot', 'jtools', 'survival'))\n\n")
## install.packages(c('coefplot', 'jtools', 'survival'))
cat("❌ ERROR: 'Binary variable must have exactly 2 levels'\n")
## ❌ ERROR: 'Binary variable must have exactly 2 levels'
cat(" SOLUTION: Check outcome variable formatting\n")
## SOLUTION: Check outcome variable formatting
cat(" • Remove rows with missing outcome: data[!is.na(data$outcome), ]\n")
## • Remove rows with missing outcome: data[!is.na(data$outcome), ]
cat(" • Convert to factor: factor(data$outcome)\n")
## • Convert to factor: factor(data$outcome)
cat(" • Check unique values: unique(data$outcome)\n\n")
## • Check unique values: unique(data$outcome)
cat("❌ ERROR: 'Cox regression requires time variable'\n")
## ❌ ERROR: 'Cox regression requires time variable'
cat(" SOLUTION: Specify time_var parameter\n")
## SOLUTION: Specify time_var parameter
cat(" • Ensure time variable is numeric and positive\n")
## • Ensure time variable is numeric and positive
cat(" • Check for missing values in time variable\n\n")
## • Check for missing values in time variable
cat("❌ WARNING: 'Convergence issues'\n")
## ❌ WARNING: 'Convergence issues'
cat(" SOLUTION: Check data quality and model specification\n")
## SOLUTION: Check data quality and model specification
cat(" • Remove highly correlated predictors\n")
## • Remove highly correlated predictors
cat(" • Check for complete separation in logistic models\n")
## • Check for complete separation in logistic models
cat(" • Consider variable transformations\n")
## • Consider variable transformations
cat(" • Ensure adequate sample size\n\n")
## • Ensure adequate sample size
cat("❌ ERROR: 'Plot not displaying'\n")
## ❌ ERROR: 'Plot not displaying'
cat(" SOLUTION: Check output options\n")
## SOLUTION: Check output options
cat(" • Ensure show_coefficient_plot = TRUE\n")
## • Ensure show_coefficient_plot = TRUE
cat(" • Check that variables are properly selected\n")
## • Check that variables are properly selected
cat(" • Verify data contains the specified variables\n")
## • Verify data contains the specified variables
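Several of the errors above can be caught before the model is fitted. The helper below is a hypothetical pre-flight check written in base R; `check_model_inputs()` is not part of ClinicoPath, and the rules it applies (exactly two non-missing outcome values, a numeric and strictly positive time variable) are assumptions taken from the messages listed above.

```r
# Hypothetical helper (not a ClinicoPath function): returns a character
# vector describing problems found in the outcome and time variables.
check_model_inputs <- function(data, outcome, time_var = NULL) {
  problems <- character(0)

  if (!outcome %in% names(data)) {
    return(sprintf("Outcome '%s' not found in data", outcome))
  }
  y <- data[[outcome]]
  if (anyNA(y)) {
    problems <- c(problems,
                  sprintf("Outcome has %d missing value(s)", sum(is.na(y))))
  }
  n_levels <- length(unique(y[!is.na(y)]))
  if (n_levels != 2) {
    problems <- c(problems,
                  sprintf("Outcome has %d distinct values, expected 2", n_levels))
  }

  if (!is.null(time_var)) {
    if (!time_var %in% names(data)) {
      problems <- c(problems,
                    sprintf("Time variable '%s' not found in data", time_var))
    } else {
      tt <- data[[time_var]]
      if (!is.numeric(tt)) {
        problems <- c(problems, "Time variable is not numeric")
      } else {
        if (anyNA(tt)) {
          problems <- c(problems, "Time variable has missing values")
        }
        if (any(tt <= 0, na.rm = TRUE)) {
          problems <- c(problems, "Time variable has non-positive values")
        }
      }
    }
  }
  problems
}

# Toy data with deliberate problems (illustrative only)
toy <- data.frame(
  outcome = c("yes", "no", "maybe", NA, "yes"),
  time    = c(12, 0, 30, NA, 8)
)
issues <- check_model_inputs(toy, "outcome", time_var = "time")
print(issues)

# A clean data set yields an empty vector
clean <- data.frame(outcome = c("yes", "no", "yes", "no"),
                    time    = c(5, 10, 3, 8))
clean_issues <- check_model_inputs(clean, "outcome", time_var = "time")
```

An empty result means none of these particular checks failed. It does not guarantee that the model will converge.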
Data Preparation Guidelines
cat("Data Preparation Checklist:\n\n")
## Data Preparation Checklist:
cat("✓ VARIABLE FORMATTING:\n")
## ✓ VARIABLE FORMATTING:
cat(" • Outcomes: Proper type (numeric, factor, Surv object)\n")
## • Outcomes: Proper type (numeric, factor, Surv object)
cat(" • Predictors: Appropriate scale and distribution\n")
## • Predictors: Appropriate scale and distribution
cat(" • Factors: Meaningful reference levels\n")
## • Factors: Meaningful reference levels
cat(" • Missing data: Handle systematically\n\n")
## • Missing data: Handle systematically
cat("✓ MODEL ASSUMPTIONS:\n")
## ✓ MODEL ASSUMPTIONS:
cat(" • Check linearity assumptions\n")
## • Check linearity assumptions
cat(" • Assess multicollinearity\n")
## • Assess multicollinearity
cat(" • Verify independence\n")
## • Verify independence
cat(" • Test proportional hazards (Cox models)\n\n")
## • Test proportional hazards (Cox models)
cat("✓ SAMPLE SIZE:\n")
## ✓ SAMPLE SIZE:
cat(" • Adequate events per variable\n")
## • Adequate events per variable
cat(" • Consider power for detecting effects\n")
## • Consider power for detecting effects
cat(" • Account for missing data\n")
## • Account for missing data
cat(" • Plan for model validation\n")
## • Plan for model validation
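Two of the checklist items, events per variable and multicollinearity, are easy to quantify with base R. The sketch below uses simulated data. The helper names `events_per_variable()` and `vif_numeric()` are hypothetical, the binary-outcome definition of "events" as the rarer class is an assumption, and dedicated packages offer more complete diagnostics.

```r
# Events per variable for a binary outcome: size of the rarer outcome class
# divided by the number of estimated parameters (excluding the intercept).
events_per_variable <- function(outcome, n_params) {
  counts <- table(outcome)
  min(counts) / n_params
}

# Variance inflation factor for each numeric predictor: 1 / (1 - R^2), where
# R^2 comes from regressing that predictor on all the other predictors.
vif_numeric <- function(predictors) {
  sapply(names(predictors), function(v) {
    others <- setdiff(names(predictors), v)
    fit <- lm(reformulate(others, response = v), data = predictors)
    1 / (1 - summary(fit)$r.squared)
  })
}

# Simulated example data (illustrative only)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)  # nearly collinear with x1
x3 <- rnorm(n)                 # independent of x1 and x2
y  <- rbinom(n, 1, 0.2)

epv  <- events_per_variable(y, n_params = 3)
vifs <- vif_numeric(data.frame(x1, x2, x3))

print(epv)
print(round(vifs, 1))
```

In this simulation `x1` and `x2` show strongly inflated variance while `x3` stays close to 1, which is the pattern that suggests dropping or combining correlated predictors. A commonly quoted rule of thumb is roughly 10 events per variable, though this threshold is debated and should be treated as a rough guide.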
Conclusion
The coefplot function provides comprehensive coefficient visualization capabilities essential for clinical research and statistical reporting. Key benefits include:
- Professional Visualization: Publication-ready forest plots for all major regression types
- Clinical Interpretation: Clear presentation of effect sizes and confidence intervals
- Flexible Customization: Extensive options for specific research needs
- Statistical Rigor: Proper handling of different model types and assumptions
- Integration: Seamless workflow with other ClinicoPath functions
This tool enables researchers to effectively communicate regression results, support clinical decision-making, and advance evidence-based medicine through clear statistical visualization.
Best Practice Summary
- Choose appropriate model type based on outcome variable characteristics
- Include relevant covariates but avoid overfitting with too many predictors
- Check model assumptions and perform diagnostic analyses
- Use confidence intervals to assess statistical and clinical significance
- Customize plots for target audience and publication requirements
- Provide clinical interpretation alongside statistical results
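To make the point about confidence intervals concrete, the sketch below computes the quantities a logistic coefficient plot typically displays: odds ratios with 95% Wald intervals. It uses base R's `glm()` on simulated data. The variable names and the true coefficients are illustrative assumptions, and this is not necessarily how `coefplot` computes its intervals internally.

```r
# Simulated data; variable names and effect sizes are illustrative.
set.seed(42)
n   <- 500
age <- rnorm(n, mean = 60, sd = 10)
lvi <- rbinom(n, 1, 0.3)
p   <- plogis(-3 + 0.03 * age + 0.9 * lvi)
high_grade <- rbinom(n, 1, p)

fit <- glm(high_grade ~ age + lvi, family = binomial)

# Wald intervals are computed on the log-odds scale, then exponentiated
# to give odds ratios. The intercept row is dropped.
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))[-1, , drop = FALSE]

print(round(or_table, 2))
```

An interval that excludes 1 indicates a statistically significant association at the 5% level. Whether the effect is clinically meaningful depends on the size of the odds ratio and on the scale of the predictor (here, per year of age versus presence of LVI).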
This vignette was created for the ClinicoPath jamovi module. For more information and updates, visit https://github.com/sbalci/ClinicoPathJamoviModule.