Advanced Distribution Visualization with Raincloud Plots
Source: vignettes/08-advancedraincloud.Rmd
A raincloud plot is a modern and informative way to visualize the distribution of data. It combines a half-violin (density) plot, a box plot, and the raw data points in a single figure, so the shape of the distribution, its summary statistics, and the individual observations are all visible at a glance.
The Clinical Scenario
A researcher is comparing the effectiveness of two different therapies. They have collected a “score” from patients in two groups: a control group and a treatment group. They want to answer the question:
Is there a difference in the distribution of scores between the control and treatment groups?
A raincloud plot is an excellent way to visualize the answer to this question.
Step 1: The Analysis in jamovi
- Load the advancedraincloud_data.omv dataset into jamovi.
- From the main analysis ribbon, click on JJStatsPlot -> Advanced -> Raincloud Plot.
[Screenshot of the jamovi analysis ribbon showing the path to the Raincloud Plot.]
- In the analysis window:
  - Move the score variable to the Dependent Variable box.
  - Move the group variable to the Grouping Variable box.
[Screenshot of the analysis window showing the variables being assigned.]
Step 2: The Output Plot
jamovi will generate a raincloud plot. The same plot can be produced in R:
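# Load the package (assumed to be ClinicoPath, the package named in the data() call below)
library(ClinicoPath)
# Load raincloud data
data("advancedraincloud_data", package = "ClinicoPath")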
# Ensure the grouping variable is a factor
advancedraincloud_data$group <- factor(advancedraincloud_data$group)
advancedraincloud(
  data = advancedraincloud_data,
  dep = "score",
  group = "group",
  title = "Score Distribution by Group with Raincloud Plot"
)
Step 3: Interpreting the Plot
The raincloud plot has three components:
- The “Cloud” (Violin Plot): The shaded area on the left is a density estimate of the data. The wider the cloud at a given score, the more patients have scores near that value.
- The “Rain” (Data Points): The individual dots on the right are the actual data points, one per patient. They show the spread of the data and make potential outliers easy to spot.
- The “Box” (Box Plot): The box plot below the violin summarizes the data. The line in the middle of the box is the median and the box spans the interquartile range (IQR). By the usual convention, the whiskers extend to the most extreme values within 1.5 × IQR of the box, so they match the full range of the data only when there are no outliers.
From this plot, we can see that the treatment group tends to have higher scores than the control group: its whole distribution is shifted toward higher values. This is a visual impression only; a statistical test is needed to confirm the difference.
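To make the three components concrete, the numbers behind each of them can be computed directly in base R. This is a minimal sketch on simulated scores, not the values in advancedraincloud_data; the data frame `scores` and the objects `clouds`, `boxes`, and `rain` are hypothetical names used only for illustration, and the plotting function may compute its summaries somewhat differently.

```r
# Hypothetical simulated scores; NOT the advancedraincloud_data values.
set.seed(42)
scores <- data.frame(
  group = factor(rep(c("control", "treatment"), each = 50)),
  score = c(rnorm(50, mean = 50, sd = 10), rnorm(50, mean = 60, sd = 10))
)
by_group <- split(scores$score, scores$group)

# "Cloud": a kernel density estimate per group (the outline of the half-violin).
clouds <- lapply(by_group, density)

# "Box": Tukey five-number summary per group. boxplot.stats() places the
# whisker ends at the most extreme values within 1.5 * IQR of the hinges.
boxes <- sapply(by_group, function(x) boxplot.stats(x)$stats)
rownames(boxes) <- c("lower_whisker", "lower_hinge", "median",
                     "upper_hinge", "upper_whisker")

# "Rain": the raw observations themselves, one dot per patient.
rain <- by_group

print(round(boxes, 1))
```

Comparing the `median` row across the two columns is the numeric counterpart of reading the shift between the two boxes off the plot.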
Advanced Feature: Longitudinal Connections
The advanced raincloud plot has a powerful feature for visualizing paired or repeated-measures data. If you have data from the same patients at two different time points, you can connect the dots to see the individual change for each patient.
To do this, you need your data in long format with a patient ID variable. Assign the ID variable to the ID box in the analysis window and turn on the Show Longitudinal option.
[Image of a raincloud plot with longitudinal connections, showing lines connecting the dots between two time points.]
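If your data are stored with one row per patient and one column per time point (wide format), they must first be reshaped to long format. The sketch below uses base R's `reshape()` on a small made-up data frame; the column names `id`, `baseline`, `followup`, `time`, and `score` are assumptions chosen for illustration, not names required by the module.

```r
# Hypothetical wide-format data: one row per patient, one column per time point.
wide <- data.frame(
  id       = 1:4,
  baseline = c(48, 52, 55, 60),
  followup = c(53, 51, 62, 66)
)

# Reshape to long format: one row per patient per time point.
long <- reshape(
  wide,
  direction = "long",
  varying   = list(c("baseline", "followup")),  # columns holding the scores
  v.names   = "score",                          # name of the stacked score column
  timevar   = "time",                           # name of the new time column
  times     = c("baseline", "followup"),
  idvar     = "id"
)
long$time <- factor(long$time, levels = c("baseline", "followup"))
rownames(long) <- NULL

# Per-patient change: the quantity each connecting line in the plot depicts.
change <- wide$followup - wide$baseline

print(long)
print(change)
```

In the long data frame, `id` would go in the ID box, `score` in the Dependent Variable box, and `time` in the Grouping Variable box.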
Step 4: Reporting the Results
When reporting the results of a raincloud plot, describe the visual findings and supplement them with an appropriate statistical test (e.g., a t-test for two groups or ANOVA for more than two). For example:
A raincloud plot was used to visualize the distribution of scores in the control and treatment groups. The plot revealed that the treatment group had a higher median score and a distribution shifted towards higher values compared to the control group. A two-sample t-test confirmed that this difference was statistically significant (t(98) = -5.4, p < 0.001).
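A test statistic of the form t(98) corresponds to a pooled-variance (Student's) two-sample t-test on 100 observations. In R this requires `var.equal = TRUE`, because `t.test()` runs Welch's test by default and then reports fractional degrees of freedom. The following sketch runs on simulated data, so its statistic will not reproduce the values quoted above; the `scores` data frame and the `fit` object are hypothetical.

```r
# Hypothetical simulated scores; NOT the advancedraincloud_data values.
set.seed(1)
scores <- data.frame(
  group = factor(rep(c("control", "treatment"), each = 50)),
  score = c(rnorm(50, mean = 50, sd = 10), rnorm(50, mean = 60, sd = 10))
)

# Student's two-sample t-test with pooled variance:
# degrees of freedom = n1 + n2 - 2 = 50 + 50 - 2 = 98.
fit <- t.test(score ~ group, data = scores, var.equal = TRUE)

# The statistic is computed as mean(first level) - mean(second level),
# so it is negative when the second level (treatment) has the higher mean.
print(fit$parameter)  # degrees of freedom
print(fit$statistic)  # t value
print(fit$p.value)
```

If the equal-variance assumption is doubtful, dropping `var.equal = TRUE` gives Welch's test, and the report should then state the fractional degrees of freedom it produces.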