Visualizing Categorical Data with Bar Charts

This guide demonstrates how to visualize the relationship between two categorical variables using a bar chart in jamovi.

The Clinical Scenario

A researcher is investigating the effectiveness of mammography as a screening tool. They want to answer the following question:

Is there a statistically significant association between the results of a mammogram and the actual cancer diagnosis?

We will use the breast_cancer_data dataset to explore this question.

Step 1: The Analysis

- Load the breast_cancer_data.omv dataset into jamovi.
- From the main analysis ribbon, click JJStatsPlot -> Categorical vs Categorical -> Bar Charts.

[Screenshot of the jamovi analysis ribbon showing the path to Bar Charts.]

In the analysis window:
- Move the mammography variable to the Dependent Variable box.
- Move the cancer_status variable to the Grouping Variable box.

[Screenshot of the analysis window showing the variables being assigned.]

Step 2: The Output Plot

jamovi generates a bar chart showing the distribution of mammography results for patients with and without a cancer diagnosis. The same plot can be produced in R with the jjbarstats function from the ClinicoPath package:

# Load the ClinicoPath package, which provides jjbarstats
library(ClinicoPath)

# Load the data
data("breast_cancer_data", package = "ClinicoPath")

# Create the plot: mammography result (dep) split by cancer status (group)
jjbarstats(
  data = breast_cancer_data,
  dep = "mammography",
  group = "cancer_status",
  title = "Mammography Results by Cancer Status",
  subtitle = "Chi-squared test for independence",
  xlab = "Cancer Status",
  ylab = "Count"
)

Step 3: Interpreting the Plot and Statistics

The Plot: The bar chart shows the counts of patients for each combination of mammography result and cancer status.
- For patients with a Negative cancer status, the vast majority had a Negative mammogram.
- For patients with a Positive cancer status, a large proportion had a Positive mammogram.
- This visual pattern suggests that there is indeed an association between the two variables.
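
The visual comparison above amounts to comparing column proportions in a 2 × 2 contingency table. The following base R sketch shows that calculation on made-up counts. The numbers are hypothetical and are not taken from breast_cancer_data, and the level names Negative and Positive are assumed from the description above.

```r
# Hypothetical 2x2 counts for illustration only; NOT the values in breast_cancer_data.
# matrix() fills column by column, so each pair below is one cancer_status group.
counts <- matrix(
  c(80, 20,   # cancer_status = Negative: Negative / Positive mammogram
    15, 35),  # cancer_status = Positive: Negative / Positive mammogram
  nrow = 2,
  dimnames = list(
    mammography   = c("Negative", "Positive"),
    cancer_status = c("Negative", "Positive")
  )
)
tab <- as.table(counts)

# Proportion of each mammography result within each cancer status
# (margin = 2 normalizes by column, so every column sums to 1)
col_props <- prop.table(tab, margin = 2)
print(round(col_props, 2))
```

If the two variables were unrelated, the two columns of proportions would look roughly the same; a clear difference between them is the pattern the bar chart makes visible.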

The Statistics: The jjbarstats function performs a chi-squared (χ²) test for independence to see if the association is statistically significant. The results are displayed on the plot.
- Chi-squared test: The plot reports χ²(1) = 45.8, p < 0.001, where 1 is the degrees of freedom for a 2 × 2 table.
- p-value: The p-value is below 0.001, well under the conventional 0.05 threshold, so we reject the null hypothesis that mammography result and cancer status are independent. A small p-value indicates that the association is unlikely to be due to chance alone; it does not by itself indicate how strong the association is.
- Effect Size: The plot also shows Cramér's V, which measures the strength of the association on a scale from 0 (no association) to 1 (perfect association). Here, V = 0.48, which is considered a moderate to large effect size.
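
To make the two statistics above concrete, here is a minimal base R sketch that computes a Pearson chi-squared test and Cramér's V by hand. It uses hypothetical counts, so it will not reproduce the values reported on the plot, and it assumes no continuity correction. The value shown by jjbarstats may also include a bias correction, so it can differ slightly from this textbook formula.

```r
# Hypothetical 2x2 counts (rows: mammography, columns: cancer_status).
# Illustration only; these are NOT the values in breast_cancer_data.
counts <- matrix(
  c(80, 20, 15, 35),
  nrow = 2,
  dimnames = list(
    mammography   = c("Negative", "Positive"),
    cancer_status = c("Negative", "Positive")
  )
)

# Pearson chi-squared test of independence, without Yates' continuity correction
test <- chisq.test(counts, correct = FALSE)
chi2 <- unname(test$statistic)
dof  <- as.integer(unname(test$parameter))

# Cramér's V = sqrt(chi2 / (n * (k - 1))), where k is the smaller table dimension
n <- sum(counts)
k <- min(dim(counts))
cramers_v <- sqrt(chi2 / (n * (k - 1)))

cat(sprintf("chi2(%d) = %.1f, p = %.2g, V = %.2f\n",
            dof, chi2, test$p.value, cramers_v))
```

For a 2 × 2 table, k - 1 equals 1, so V reduces to the square root of χ² divided by the sample size.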

Step 4: Reporting the Results

Here is an example of how to report these findings:

A chi-squared test for independence was performed to examine the association between mammography results and cancer status. There was a statistically significant association between the two variables (χ²(1) = 45.8, p < 0.001). The strength of the association was moderate to large (Cramér's V = 0.48). Patients with a positive cancer diagnosis were significantly more likely to have a positive mammogram result than patients without a cancer diagnosis.
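
When the analysis is run in R, the statistics in a sentence like the one above can be assembled from the test object rather than typed by hand. The sketch below is one possible approach; format_chisq_report is a hypothetical helper written for this example (it is not part of ClinicoPath), and the counts are again made up.

```r
# Hypothetical helper (not part of any package): builds the statistics
# portion of a report sentence from a chi-squared test result.
format_chisq_report <- function(chi2, dof, p, v) {
  # Report very small p-values as an inequality, as is conventional
  p_text <- if (p < 0.001) "p < 0.001" else sprintf("p = %.3f", p)
  # \u03c7\u00b2 is the chi-squared symbol written with Unicode escapes
  sprintf("\u03c7\u00b2(%d) = %.1f, %s, Cram\u00e9r's V = %.2f",
          as.integer(dof), chi2, p_text, v)
}

# Hypothetical 2x2 counts; NOT the values in breast_cancer_data
counts <- matrix(c(80, 20, 15, 35), nrow = 2)

test <- chisq.test(counts, correct = FALSE)
chi2 <- unname(test$statistic)
v    <- sqrt(chi2 / (sum(counts) * (min(dim(counts)) - 1)))

report <- format_chisq_report(chi2, test$parameter, test$p.value, v)
print(report)
```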