Visualizing Categorical Data with Bar Charts
Source: vignettes/02-categorical-plots.Rmd
This guide demonstrates how to visualize the relationship between two categorical variables using a bar chart in jamovi.
The Clinical Scenario
A researcher is investigating the effectiveness of mammography as a screening tool. They want to answer the following question:
Is there a statistically significant association between the results of a mammogram and the actual cancer diagnosis?
We will use the breast_cancer_data dataset to explore this question.
Step 1: The Analysis
- Load the breast_cancer_data.omv dataset into jamovi.
- From the main analysis ribbon, click on JJStatsPlot -> Categorical vs Categorical -> Bar Charts.
[Screenshot of the jamovi analysis ribbon showing the path to the Bar Charts.]
- In the analysis window:
  - Move the mammography variable to the Dependent Variable box.
  - Move the cancer_status variable to the Grouping Variable box.
[Screenshot of the analysis window showing the variables being assigned.]
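Before plotting, it can help to confirm that both variables are categorical and to look at the raw cross-tabulation. The sketch below uses a small invented data frame as a stand-in for breast_cancer_data. Only the column names come from this guide; the values are hypothetical.

```r
# Hypothetical toy data standing in for breast_cancer_data. The column names
# match this guide, but the values are invented for illustration only.
toy <- data.frame(
  mammography   = c("Negative", "Negative", "Positive",
                    "Negative", "Positive", "Positive"),
  cancer_status = c("Negative", "Negative", "Negative",
                    "Positive", "Positive", "Positive"),
  stringsAsFactors = TRUE
)

# Both variables should be factors with the expected levels.
str(toy)

# Cross-tabulate: rows are the grouping variable, columns the dependent variable.
counts <- table(
  cancer_status = toy$cancer_status,
  mammography   = toy$mammography
)
print(counts)

# Row-wise proportions: mammography results within each cancer status group.
row_props <- prop.table(counts, margin = 1)
print(round(row_props, 2))
```

The row-wise proportions correspond to what the bar chart displays: the split of mammography results within each cancer status group.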
Step 2: The Output Plot
jamovi generates a bar chart showing the distribution of mammography results for patients with and without a cancer diagnosis. The corresponding R code is:
# Load the package that provides jjbarstats() and the example data
library(ClinicoPath)
data("breast_cancer_data", package = "ClinicoPath")

# Create the bar chart with the test results annotated on the plot
jjbarstats(
  data = breast_cancer_data,
  dep = "mammography",
  group = "cancer_status",
  title = "Mammography Results by Cancer Status",
  subtitle = "Chi-square test for independence",
  xlab = "Cancer Status",
  ylab = "Count"
)
Step 3: Interpreting the Plot and Statistics
- The Plot: The bar chart shows the counts of patients for each combination of mammography result and cancer status.
  - For patients with a Negative cancer status, the vast majority had a Negative mammogram.
  - For patients with a Positive cancer status, a large proportion had a Positive mammogram.
  - This visual pattern suggests an association between the two variables.
- The Statistics: The jjbarstats function performs a chi-squared (χ²) test for independence to check whether the association is statistically significant. The results are displayed on the plot.
  - Chi-squared test: The plot reports χ²(1) = 45.8, p < 0.001.
  - p-value: The p-value is less than 0.001, far below the conventional 0.05 cutoff. We therefore reject the null hypothesis of independence and conclude that there is a statistically significant association between mammography results and cancer status.
  - Effect Size: The plot also shows Cramér’s V, a measure of the strength of the association that ranges from 0 (no association) to 1 (perfect association). Here, V = 0.48, which is considered a moderate to large effect size.
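To make the link between the chi-squared statistic and Cramér's V concrete, the sketch below computes both in base R from a 2 x 2 table. The counts are hypothetical, chosen only for illustration, so the output will not match the values reported above. The sketch uses the unadjusted formula V = sqrt(χ² / (n (k - 1))), where k is the smaller of the number of rows and columns. This is an assumption about the calculation: jjbarstats may apply a bias correction, in which case its V can differ slightly.

```r
# Hypothetical 2 x 2 counts, invented for illustration. They are NOT the
# breast_cancer_data values and will not reproduce the statistics in this guide.
tab <- matrix(
  c(90, 10,
    40, 60),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    cancer_status = c("Negative", "Positive"),
    mammography   = c("Negative", "Positive")
  )
)

# Pearson chi-squared test of independence, without Yates' continuity correction.
test <- chisq.test(tab, correct = FALSE)

chi_sq <- unname(test$statistic)  # test statistic
df     <- unname(test$parameter)  # degrees of freedom: (rows - 1) * (cols - 1)
p_val  <- test$p.value

# Unadjusted Cramér's V: sqrt(chi^2 / (n * (k - 1))), with k = min(rows, cols).
n <- sum(tab)
k <- min(dim(tab))
cramers_v <- sqrt(chi_sq / (n * (k - 1)))

cat(sprintf(
  "chi-squared(%d) = %.2f, p = %.3g, V = %.2f\n",
  as.integer(df), chi_sq, p_val, cramers_v
))
```

For a 2 x 2 table, k - 1 = 1, so V reduces to sqrt(χ² / n). A larger sample with the same proportions therefore yields a larger χ² but the same V.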
Step 4: Reporting the Results
Here is an example of how to report these findings:
A chi-squared test for independence was performed to examine the association between mammography results and cancer status. There was a statistically significant association between the two variables (χ²(1) = 45.8, p < 0.001). The strength of the association was moderate to large (Cramér's V = 0.48). Patients with a positive cancer diagnosis were significantly more likely to have a positive mammogram result.
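When many such results need reporting, the statistics string can be assembled programmatically. The helper below, format_chisq_report, is a hypothetical function written for this sketch and is not part of ClinicoPath or jamovi. It follows the convention used above of reporting very small p-values as "p < 0.001".

```r
# Hypothetical helper (not part of ClinicoPath or jamovi) that formats
# chi-squared results in the style of the example report above.
format_chisq_report <- function(chi_sq, df, p, v) {
  # Report very small p-values as an inequality instead of a long decimal.
  p_text <- if (p < 0.001) "p < 0.001" else sprintf("p = %.3f", p)
  sprintf(
    "χ²(%d) = %.1f, %s, Cramér's V = %.2f",
    as.integer(df), chi_sq, p_text, v
  )
}

# Statistics as reported in this guide. The exact p-value is not given,
# so 0.0001 is a stand-in for "some value below 0.001".
report <- format_chisq_report(chi_sq = 45.8, df = 1, p = 0.0001, v = 0.48)
cat(report, "\n")
```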