Comprehensive Bar Chart Analysis with jjbarstats

Introduction to jjbarstats

The jjbarstats function is a powerful wrapper around the ggstatsplot package that creates publication-ready bar charts with automatic statistical testing. This function is designed specifically for analyzing categorical data relationships in clinical and research settings.

Key Features

Automatic Statistical Testing: Performs chi-squared tests, Fisher’s exact tests, or other appropriate tests based on your data
Multiple Variable Support: Handle single or multiple dependent variables simultaneously
Grouped Analysis: Split analysis by additional grouping variables
Flexible Statistical Methods: Choose from parametric, non-parametric, robust, or Bayesian approaches
Pairwise Comparisons: Automatic post-hoc testing with multiple comparison correction
Professional Visualization: Publication-ready plots with statistical annotations

When to Use jjbarstats

Use jjbarstats when you need to:

Compare proportions across different groups
Analyze treatment effectiveness in clinical trials
Examine relationships between categorical variables
Create publication-ready visualizations with statistical tests
Perform quality improvement analysis
Analyze survey responses and patient feedback

Basic Usage

Single Dependent Variable Analysis

Let’s start with a basic example using medical study data to compare treatment responses across different treatment groups:

# Basic bar chart comparing treatment response across groups
jjbarstats(
  data = medical_study_data,
  dep = response,
  group = treatment_group,
  grvar = NULL
)

This creates a bar chart showing the distribution of treatment responses (Complete Response, Partial Response, No Response) across different treatment groups (Control, Treatment A, Treatment B), along with chi-squared test results.

Multiple Dependent Variables

You can analyze multiple dependent variables simultaneously:

# Analyze multiple outcomes in clinical trial
# jjbarstats(
#   data = clinical_trial_data,
#   dep = c(primary_outcome, side_effects),
#   group = drug_dosage,
#   grvar = NULL
# )

Advanced Statistical Options

Different Statistical Methods

The function supports four different statistical approaches:

Parametric Analysis (Default)

jjbarstats(
  data = patient_satisfaction_data,
  dep = satisfaction_level,
  group = service_type,
  typestatistics = "parametric"
)

Non-parametric Analysis

jjbarstats(
  data = diagnostic_test_data,
  dep = test_result,
  group = test_method,
  typestatistics = "nonparametric"
)

Robust Analysis

jjbarstats(
  data = quality_improvement_data,
  dep = implementation_status,
  group = improvement_category,
  typestatistics = "robust"
)

Bayesian Analysis

jjbarstats(
  data = medical_study_data,
  dep = severity,
  group = treatment_group,
  typestatistics = "bayes"
)

Grouped Analysis with Splitting Variables

Using the grvar Parameter

The grvar parameter allows you to split your analysis by an additional grouping variable, creating separate plots for each level:

# Analyze treatment response by treatment group, split by gender
jjbarstats(
  data = medical_study_data,
  dep = response,
  group = treatment_group,
  grvar = gender
)

This creates separate bar charts for male and female patients, allowing you to examine whether treatment effects differ by gender.

Complex Grouped Analysis

# Patient satisfaction by service type, split by department
jjbarstats(
  data = patient_satisfaction_data,
  dep = satisfaction_level,
  group = service_type,
  grvar = department
)

Pairwise Comparisons and Multiple Testing

Enabling Pairwise Comparisons

When you have more than two groups, pairwise comparisons help identify which specific groups differ:

# jjbarstats(
#   data = clinical_trial_data,
#   dep = primary_outcome,
#   group = drug_dosage,
#   pairwisecomparisons = TRUE,
#   padjustmethod = "holm"
# )

Multiple Comparison Correction Methods

Different correction methods are available to control for multiple testing:

# Bonferroni correction (most conservative)
jjbarstats(
  data = diagnostic_test_data,
  dep = test_result,
  group = laboratory,
  pairwisecomparisons = TRUE,
  padjustmethod = "bonferroni"
)

# Benjamini-Hochberg correction (controls false discovery rate)
jjbarstats(
  data = quality_improvement_data,
  dep = priority_level,
  group = department_involved,
  pairwisecomparisons = TRUE,
  padjustmethod = "BH"
)

Controlling Pairwise Display

You can control which pairwise comparisons are displayed:

# Show only significant comparisons
jjbarstats(
  data = medical_study_data,
  dep = response,
  group = treatment_group,
  pairwisecomparisons = TRUE,
  pairwisedisplay = "significant"
)

# Show all comparisons
jjbarstats(
  data = patient_satisfaction_data,
  dep = staff_rating,
  group = service_type,
  pairwisecomparisons = TRUE,
  pairwisedisplay = "everything"
)

Real-World Clinical Applications

Treatment Efficacy Analysis

Analyzing treatment effectiveness across different patient subgroups:

# Comprehensive treatment analysis
jjbarstats(
  data = medical_study_data,
  dep = response,
  group = treatment_group,
  grvar = severity,
  typestatistics = "nonparametric",
  pairwisecomparisons = TRUE,
  padjustmethod = "BH"
)

Quality Improvement Analysis

Tracking implementation status across different improvement categories:

jjbarstats(
  data = quality_improvement_data,
  dep = c(implementation_status, priority_level),
  group = improvement_category,
  typestatistics = "parametric",
  pairwisecomparisons = TRUE
)

Diagnostic Test Evaluation

Comparing test performance across different methods and laboratories:

jjbarstats(
  data = diagnostic_test_data,
  dep = test_result,
  group = test_method,
  grvar = laboratory,
  typestatistics = "robust",
  pairwisecomparisons = TRUE,
  pairwisedisplay = "significant"
)

Patient Satisfaction Survey Analysis

Analyzing satisfaction levels across different service types and departments:

jjbarstats(
  data = patient_satisfaction_data,
  dep = satisfaction_level,
  group = service_type,
  grvar = department,
  typestatistics = "nonparametric",
  pairwisecomparisons = TRUE,
  padjustmethod = "holm"
)

Working with Real Histopathology Data

Using the histopathology dataset that comes with ClinicoPath:

# Analyze lymphovascular invasion by treatment group
jjbarstats(
  data = histopathology,
  dep = LVI,
  group = Group,
  typestatistics = "nonparametric",
  pairwisecomparisons = TRUE
)

# Multiple outcome analysis
jjbarstats(
  data = histopathology,
  dep = c(LVI, PNI),
  group = Grade_Level,
  typestatistics = "parametric"
)

# Grouped analysis by sex
jjbarstats(
  data = histopathology,
  dep = LymphNodeMetastasis,
  group = Grade_Level,
  grvar = Sex,
  typestatistics = "robust",
  pairwisecomparisons = TRUE
)

Customization and Theming

Using Original ggstatsplot Theme

jjbarstats(
  data = medical_study_data,
  dep = response,
  group = treatment_group,
  originaltheme = TRUE
)

Custom Theming

The function respects jamovi’s theming system when originaltheme = FALSE (default), allowing for consistent styling across analyses.

Best Practices and Recommendations

Statistical Method Selection

Parametric: Use when data meets assumptions (large sample sizes, expected frequencies ≥ 5)
Non-parametric: Default choice for categorical data, fewer assumptions
Robust: Good middle ground, less sensitive to outliers
Bayesian: When you want to incorporate prior knowledge or report Bayes factors

Multiple Comparison Correction

Holm: Good balance between power and Type I error control
Bonferroni: Most conservative, use when Type I error is critical
BH (Benjamini-Hochberg): Controls false discovery rate, good for exploratory analysis
None: Only when you have specific a priori hypotheses

Sample Size Considerations

Chi-squared tests require expected frequencies ≥ 5 in each cell
Fisher’s exact test is automatically used for small samples
Consider effect sizes, not just p-values

Data Preparation Tips

# Ensure categorical variables are properly formatted
medical_study_clean <- medical_study_data %>%
  mutate(
    treatment_group = factor(treatment_group, 
                           levels = c("Control", "Treatment A", "Treatment B")),
    response = factor(response,
                     levels = c("No Response", "Partial Response", "Complete Response"))
  )

# Verify factor levels
str(medical_study_clean[c("treatment_group", "response")])

Troubleshooting Common Issues

Issue 1: Empty Cells or Small Counts

When you have empty cells or very small counts, the function automatically switches to appropriate tests:

# Create data with small counts
small_sample <- medical_study_data[1:20, ]

jjbarstats(
  data = small_sample,
  dep = response,
  group = treatment_group,
  typestatistics = "nonparametric"
)

Issue 2: Too Many Categories

When you have many categories, consider grouping or using different visualization:

# Example with multiple categories
jjbarstats(
  data = patient_satisfaction_data,
  dep = satisfaction_level,
  group = department,
  pairwisecomparisons = FALSE  # Disable pairwise for clarity
)

Issue 3: Missing Data

The function automatically handles missing data when excl = TRUE (default):

# Demonstrate missing data handling
data_with_na <- medical_study_data
data_with_na$response[1:5] <- NA

jjbarstats(
  data = data_with_na,
  dep = response,
  group = treatment_group,
  excl = TRUE  # Exclude missing values
)

Interpretation Guidelines

Understanding the Statistical Output

Chi-squared test: Tests independence between categorical variables
Effect size (Cramér’s V): Measures strength of association (0 = no association, 1 = perfect association)
Confidence intervals: Provide range of plausible values for the effect
Pairwise comparisons: Show which specific groups differ

Clinical Significance vs Statistical Significance

Always consider clinical relevance alongside statistical significance
Effect sizes help interpret practical importance
Confidence intervals provide information about precision

Reporting Results

When reporting results from jjbarstats:

Describe the statistical test used (chi-squared, Fisher’s exact, etc.)
Report effect size (Cramér’s V) and confidence intervals
Mention multiple comparison correction if applicable
Provide sample sizes for each group
Include the actual plot in your publication

Advanced Examples

Multi-stage Analysis Workflow

# Step 1: Overall analysis
# overall_result <- jjbarstats(
#   data = clinical_trial_data,
#   dep = primary_outcome,
#   group = drug_dosage,
#   typestatistics = "nonparametric",
#   pairwisecomparisons = TRUE,
#   padjustmethod = "BH"
# )

# Step 2: Subgroup analysis by study phase
# subgroup_result <- jjbarstats(
#   data = clinical_trial_data,
#   dep = primary_outcome,
#   group = drug_dosage,
#   grvar = study_phase,
#   typestatistics = "nonparametric",
#   pairwisecomparisons = TRUE
# )

Comparative Analysis Across Different Outcomes

# Compare primary and secondary outcomes
# jjbarstats(
#   data = clinical_trial_data,
#   dep = c(primary_outcome, side_effects, baseline_condition),
#   group = drug_dosage,
#   typestatistics = "robust",
#   pairwisecomparisons = FALSE
# )

Conclusion

The jjbarstats function provides a comprehensive solution for categorical data analysis in clinical and research settings. Its integration with the ggstatsplot ecosystem ensures both statistical rigor and visual appeal, making it an excellent choice for:

Clinical trial analysis
Quality improvement studies
Survey research
Diagnostic test evaluation
Healthcare outcomes research

The function’s flexibility in statistical methods, multiple comparison corrections, and visualization options makes it suitable for both exploratory and confirmatory analysis phases of research projects.

Further Resources

ggstatsplot documentation
ClinicoPath package documentation
Statistical methods references for categorical data analysis

Session Information

sessionInfo()

ClinicoPath Development Team

2025-10-09