jjstatsplot: Statistical Visualization with Oncology Data
ClinicoPath Team
2025-10-09
Source: vignettes/jjstatsplot-01-oncodatasets-examples.Rmd
Introduction
This vignette demonstrates the jjstatsplot module's visualization
capabilities using oncology datasets from the OncoDataSets
package. We'll create publication-ready statistical plots that pair each
visualization with an appropriate statistical test. Data preparation is
shown as R code; the plotting steps are given as jamovi menu instructions
written as R comments.
Loading Required Packages
library(jjstatsplot)
library(OncoDataSets)
library(dplyr)
library(ggplot2)
Example 1: Melanoma - Comparing Thickness by Ulceration
data("Melanoma_df")
# Prepare data
Melanoma_df$ulcer_status <- factor(Melanoma_df$ulcer,
                                   levels = c(0, 1),
                                   labels = c("No Ulceration", "Ulceration"))
# Log-transform thickness for better visualization; the 0.1 offset avoids log(0)
Melanoma_df$log_thickness <- log(Melanoma_df$thickness + 0.1)
Box-Violin Plot with Statistics
# In jamovi:
# 1. Select Analyses → jjstatsplot → Box-Violin Plot
# 2. Set 'log_thickness' as dependent variable
# 3. Set 'ulcer_status' as grouping variable
# 4. Options:
#    - Type: "parametric" (t-test) or "nonparametric" (Mann-Whitney)
#    - Add mean point
#    - Show sample sizes
#    - Pairwise comparisons: Games-Howell (only relevant with three or more
#      groups; with two ulceration groups the overall test is the comparison)
#    - Effect size: Cohen's d
# 5. Customization:
#    - Title: "Tumor Thickness by Ulceration Status"
#    - X-axis: "Ulceration Status"
#    - Y-axis: "Log Tumor Thickness (log mm)"
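For readers working in a script rather than the jamovi menus, the tests behind this plot can be sketched in base R. This is a minimal sketch under stated assumptions: `sim` is a simulated stand-in for `Melanoma_df` with invented values, and `cohens_d()` is a hypothetical helper written for this example, not a jjstatsplot function.

```r
# Simulated stand-in for Melanoma_df; values are invented, not real data.
set.seed(42)
sim <- data.frame(
  ulcer_status = factor(rep(c("No Ulceration", "Ulceration"), times = c(60, 45))),
  thickness = c(rlnorm(60, meanlog = 0.3, sdlog = 0.7),
                rlnorm(45, meanlog = 1.2, sdlog = 0.7))
)
sim$log_thickness <- log(sim$thickness + 0.1)

# Parametric option: t.test() runs Welch's unequal-variance test by default.
welch <- t.test(log_thickness ~ ulcer_status, data = sim)

# Nonparametric option: Mann-Whitney (Wilcoxon rank-sum) test.
mann_whitney <- wilcox.test(log_thickness ~ ulcer_status, data = sim)

# Hypothetical helper: Cohen's d with a pooled standard deviation
# (mean of the second factor level minus mean of the first).
cohens_d <- function(x, g) {
  groups <- split(x, g)
  n <- lengths(groups)
  m <- vapply(groups, mean, numeric(1))
  v <- vapply(groups, var, numeric(1))
  pooled_sd <- sqrt(sum((n - 1) * v) / (sum(n) - 2))
  unname((m[2] - m[1]) / pooled_sd)
}
d <- cohens_d(sim$log_thickness, sim$ulcer_status)

c(welch_p = welch$p.value, mann_whitney_p = mann_whitney$p.value, cohens_d = d)
```

To run this on the real data, replace `sim` with `Melanoma_df` after the preparation step above.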
Example 2: Breast Cancer Wisconsin - Multiple Group Comparisons
data("BreastCancerWI_df")
# Label the diagnosis for visualization
# (assumes diagnosis is coded "0"/"1"; check with unique(BreastCancerWI_df$diagnosis))
BreastCancerWI_df$diagnosis_label <- factor(BreastCancerWI_df$diagnosis,
                                            levels = c("0", "1"),
                                            labels = c("Benign", "Malignant"))
# Select key measurements (select() is namespaced to avoid clashes between packages)
bc_subset <- BreastCancerWI_df %>%
  dplyr::select(diagnosis_label, radius_mean, texture_mean, perimeter_mean, area_mean)
Correlation Matrix
# In jamovi:
# 1. Select Analyses → jjstatsplot → Correlation Matrix
# 2. Add continuous variables:
#    - radius_mean
#    - texture_mean
#    - perimeter_mean
#    - area_mean
# 3. Options:
#    - Correlation type: Pearson or Spearman
#    - Show significance stars
#    - Adjust p-values: Holm
# 4. Split by diagnosis for separate matrices
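The numbers behind a correlation matrix with Holm-adjusted p-values can be reproduced in base R. The sketch below assumes a simulated data frame, `sim_bc`, that only mimics the four column names used above; its values are invented.

```r
# Simulated stand-in for the four measurements; values are invented.
set.seed(1)
n <- 100
radius <- rnorm(n, mean = 14, sd = 3.5)
sim_bc <- data.frame(
  radius_mean    = radius,
  texture_mean   = rnorm(n, mean = 19, sd = 4),
  perimeter_mean = 2 * pi * radius + rnorm(n, sd = 2),
  area_mean      = pi * radius^2 + rnorm(n, sd = 30)
)

# Pearson correlation matrix (use method = "spearman" for rank correlations).
r_mat <- cor(sim_bc, method = "pearson")

# One test per unique variable pair, then a Holm correction across all pairs.
var_pairs <- combn(names(sim_bc), 2, simplify = FALSE)
p_raw <- vapply(var_pairs, function(v) {
  cor.test(sim_bc[[v[1]]], sim_bc[[v[2]]], method = "pearson")$p.value
}, numeric(1))

cor_table <- data.frame(
  var1   = vapply(var_pairs, `[`, character(1), 1),
  var2   = vapply(var_pairs, `[`, character(1), 2),
  r      = vapply(var_pairs, function(v) r_mat[v[1], v[2]], numeric(1)),
  p_raw  = p_raw,
  p_holm = p.adjust(p_raw, method = "holm")
)
cor_table
```

For separate matrices per diagnosis, the same steps can be applied to each piece of `split(data, data$diagnosis_label)`.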
Scatter Plot with Marginal Distributions
# In jamovi:
# 1. Select Analyses → jjstatsplot → Scatter Plot
# 2. X-axis: radius_mean
# 3. Y-axis: texture_mean
# 4. Group by: diagnosis_label
# 5. Options:
#    - Add regression lines per group
#    - Show confidence intervals
#    - Marginal plots: density
#    - Correlation coefficient per group
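A grouped scatter plot with per-group regression lines and confidence bands can be approximated directly in ggplot2. This is a sketch on simulated data (`sim_sc`, invented values); it omits the marginal density plots, which need an additional package.

```r
library(ggplot2)

# Simulated stand-in for the breast cancer measurements; values are invented.
set.seed(2)
sim_sc <- data.frame(
  diagnosis_label = factor(rep(c("Benign", "Malignant"), each = 50)),
  radius_mean = c(rnorm(50, mean = 12, sd = 1.8), rnorm(50, mean = 17, sd = 3))
)
sim_sc$texture_mean <- 12 + 0.4 * sim_sc$radius_mean + rnorm(100, sd = 3)

# One linear fit per group; se = TRUE draws the 95% confidence band.
p <- ggplot(sim_sc, aes(x = radius_mean, y = texture_mean,
                        colour = diagnosis_label)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(x = "Mean radius", y = "Mean texture", colour = "Diagnosis")

# Pearson correlation within each diagnosis group.
group_r <- vapply(split(sim_sc, sim_sc$diagnosis_label),
                  function(d) cor(d$radius_mean, d$texture_mean),
                  numeric(1))
group_r
```

Printing `p` renders the plot; `group_r` holds the coefficients that would annotate each group.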
Example 3: Smoking and Lung Cancer - Categorical Analysis
data("SmokingLungCancer_df")
# Examine structure
str(SmokingLungCancer_df)
#> 'data.frame': 63 obs. of 4 variables:
#> $ yrs_smk : Factor w/ 9 levels "15-19","20-24",..: 1 2 3 4 5 6 7 8 9 1 ...
#> $ pys : num 10366 8162 5969 4496 3512 ...
#> $ num_cigs: Factor w/ 7 levels "0","1-9","10-14",..: 1 1 1 1 1 1 1 1 1 2 ...
#> $ deaths : num 1 0 0 0 0 0 0 0 2 0 ...
Pie/Donut Chart with Chi-square Test
# In jamovi:
# 1. Select Analyses → jjstatsplot → Pie Chart
# 2. Main variable: cancer_status
# 3. Grouping: smoking_status
#    (neither variable is a column of SmokingLungCancer_df, as str() shows above;
#    both must be created before running this analysis)
# 4. Options:
#    - Display: "donut" (modern) or "pie" (classic)
#    - Show percentages
#    - Chi-square test results
#    - Paired comparisons with adjusted p-values
Contingency Table Visualization
# In jamovi:
# 1. Select Analyses → jjstatsplot → Contingency Table Plot
# 2. Rows: smoking_status (derived variable, not a column of SmokingLungCancer_df)
# 3. Columns: cancer_status (derived variable, not a column of SmokingLungCancer_df)
# 4. Statistics:
#    - Chi-square test
#    - Cramer's V effect size
#    - Standardized residuals
# 5. Visual options:
#    - Mosaic plot
#    - Association plot
#    - Pearson residuals shading
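The statistics listed for the contingency table can be computed in base R. The sketch below uses a hypothetical 2 x 2 table of counts invented for illustration; it does not come from `SmokingLungCancer_df`, and the level names are assumptions.

```r
# Hypothetical counts, invented for illustration only.
tab <- as.table(matrix(
  c(90, 10,
    60, 40),
  nrow = 2, byrow = TRUE,
  dimnames = list(smoking_status = c("Non-smoker", "Smoker"),
                  cancer_status  = c("No cancer", "Cancer"))
))

# Pearson chi-square test without continuity correction.
chi <- chisq.test(tab, correct = FALSE)

# Cramer's V: sqrt(chi-square / (N * (min(rows, cols) - 1))).
cramers_v <- sqrt(unname(chi$statistic) / (sum(tab) * (min(dim(tab)) - 1)))

# Standardized residuals show which cells drive the association.
chi$stdres

# Mosaic plot with cells shaded by Pearson residuals.
mosaicplot(tab, shade = TRUE, main = "Smoking status by cancer status")

c(chi_square = unname(chi$statistic), p = chi$p.value, cramers_v = cramers_v)
```

For these counts every expected cell is 75 or 25, so the chi-square statistic is 24 and Cramer's V is sqrt(24 / 200), about 0.35.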
Example 4: Survival Data Visualization
Example 5: Paired Data - Before/After Treatment
# Simulate paired data (before/after treatment)
set.seed(123)
paired_data <- data.frame(
  patient_id = 1:30,
  before = rnorm(30, mean = 10, sd = 2),
  after = rnorm(30, mean = 8, sd = 2)
)
Paired Box-Violin Plot
# In jamovi:
# 1. Select Analyses → jjstatsplot → Paired Box-Violin
# 2. Set data in long format (paired_data above is in wide format and
#    must be reshaped first) with:
#    - Measurement variable
#    - Time variable (before/after)
#    - ID variable (patient_id)
# 3. Statistics:
#    - Paired t-test or Wilcoxon
#    - Effect size: Cohen's d
#    - Individual change lines
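Because `paired_data` is simulated in wide format while the steps above expect long format, a reshaping step is needed. The sketch below does this in base R and runs the paired tests. The names `long_data`, `time`, and `measurement` are assumptions chosen for this example, and `d_z` is one of several conventions for a paired-samples Cohen's d.

```r
# Same simulated data as above (wide format: one row per patient).
set.seed(123)
paired_data <- data.frame(
  patient_id = 1:30,
  before = rnorm(30, mean = 10, sd = 2),
  after = rnorm(30, mean = 8, sd = 2)
)

# Long format: one row per patient per time point.
long_data <- data.frame(
  patient_id  = rep(paired_data$patient_id, times = 2),
  time        = factor(rep(c("before", "after"), each = nrow(paired_data)),
                       levels = c("before", "after")),
  measurement = c(paired_data$before, paired_data$after)
)

# Paired t-test and Wilcoxon signed-rank test on the within-patient differences.
paired_t <- t.test(paired_data$before, paired_data$after, paired = TRUE)
paired_w <- wilcox.test(paired_data$before, paired_data$after, paired = TRUE)

# Cohen's d_z: mean difference divided by the SD of the differences.
diffs <- paired_data$before - paired_data$after
d_z <- mean(diffs) / sd(diffs)

c(t_p = paired_t$p.value, wilcoxon_p = paired_w$p.value, d_z = d_z)
```

`long_data` is the table to load into jamovi. Note that `before` and `after` are drawn independently here, so the simulated pairs carry no within-patient correlation.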
Example 6: Multi-Panel Visualizations
Histogram Matrix by Groups
# In jamovi, using the Melanoma data:
# 1. Select Analyses → jjstatsplot → Histogram
# 2. Variable: age
# 3. Grouping: status (survival outcome)
# 4. Options:
#    - Bin width: automatic or manual
#    - Overlay: density curve
#    - Test normality per group
#    - Show group statistics
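A per-group histogram with a density overlay and a normality test can be sketched with ggplot2 and base R. The data frame `sim_h` is a simulated stand-in with invented values, and the status labels "Alive" and "Died" are assumptions rather than the coding used in `Melanoma_df`.

```r
library(ggplot2)

# Simulated stand-in for age by survival status; values are invented.
set.seed(7)
sim_h <- data.frame(
  status = factor(rep(c("Alive", "Died"), times = c(80, 50))),
  age = c(rnorm(80, mean = 50, sd = 15), rnorm(50, mean = 58, sd = 14))
)

# Shapiro-Wilk normality test within each group.
normality_p <- vapply(split(sim_h$age, sim_h$status),
                      function(x) shapiro.test(x)$p.value,
                      numeric(1))

# Group statistics: sample size, mean, and standard deviation.
group_stats <- do.call(rbind, lapply(split(sim_h$age, sim_h$status), function(x) {
  data.frame(n = length(x), mean = mean(x), sd = sd(x))
}))

# Histogram on the density scale so the density curve overlays correctly.
p <- ggplot(sim_h, aes(x = age)) +
  geom_histogram(aes(y = after_stat(density)), bins = 15,
                 fill = "grey80", colour = "white") +
  geom_density() +
  facet_wrap(~ status) +
  labs(x = "Age (years)", y = "Density")

normality_p
group_stats
```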
Advanced Customization
# All jjstatsplot visualizations support:
# 1. Themes:
#    - ggplot2 themes (minimal, classic, dark)
#    - Custom color palettes
#    - Font adjustments
# 2. Annotations:
#    - Custom titles and subtitles
#    - Caption with statistical results
#    - Sample size annotations
# 3. Export options:
#    - High-resolution PNG/PDF
#    - Customizable dimensions
#    - Publication-ready formatting
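In a script, the theme, annotation, and export options above correspond roughly to standard ggplot2 calls. The sketch below assumes a toy data frame (`df`, invented values) and writes a PDF to a temporary file; the colours, sizes, and caption text are placeholders.

```r
library(ggplot2)

# Toy data, invented for illustration.
set.seed(3)
df <- data.frame(group = rep(c("A", "B"), each = 20), value = rnorm(40))

p <- ggplot(df, aes(x = group, y = value, fill = group)) +
  geom_boxplot() +
  scale_fill_manual(values = c(A = "#1b9e77", B = "#d95f02")) +  # custom palette
  theme_minimal(base_size = 12) +                                # theme and font size
  labs(title = "Example title",
       subtitle = "Example subtitle",
       caption = "n = 20 per group")                             # sample size note

# Export with explicit dimensions; for PNG output, also set dpi (e.g. dpi = 300).
out_file <- tempfile(fileext = ".pdf")
ggsave(out_file, plot = p, width = 6, height = 4, units = "in")
```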
Example 7: Complex Relationships
data("BrainCancerCases_df")
str(BrainCancerCases_df)
#> 'data.frame': 1175 obs. of 5 variables:
#> $ county : Factor w/ 31 levels "Bernalillo","Catron",..: 9 23 1 7 30 22 31 7 31 1 ...
#> $ cases : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ year : int 1977 1974 1977 1977 1977 1977 1977 1977 1977 1975 ...
#> $ agegroup: int 2 8 13 14 16 11 17 11 7 13 ...
#> $ sex : int 2 2 1 2 2 1 1 2 2 1 ...
Matrix Scatter Plot
# In jamovi:
# 1. Select Analyses → jjstatsplot → Scatter Matrix
# 2. Add multiple continuous variables
# 3. Group by categorical variable
# 4. Options:
#    - Lower triangle: scatter plots
#    - Diagonal: density plots
#    - Upper triangle: correlation coefficients
#    - Smooth lines with confidence bands
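The layout described above (scatter plots below the diagonal, densities on it, correlations above it) can be approximated with base R's `pairs()`. This is a sketch on simulated data (`sim_m`, invented values). `panel_cor()` and `panel_density()` are hypothetical helpers written for this example, and `panel.smooth` draws a lowess line without a confidence band.

```r
# Simulated continuous variables and a grouping factor; values are invented.
set.seed(11)
x1 <- rnorm(80)
sim_m <- data.frame(
  x1 = x1,
  x2 = 0.6 * x1 + rnorm(80, sd = 0.8),
  x3 = rnorm(80)
)
group <- factor(rep(c("A", "B"), each = 40))

# Upper triangle: print the Pearson correlation in the middle of the panel.
panel_cor <- function(x, y, ...) {
  old <- par(usr = c(0, 1, 0, 1)); on.exit(par(old))
  text(0.5, 0.5, sprintf("r = %.2f", cor(x, y)))
}

# Diagonal: kernel density rescaled to a maximum height of 1.
panel_density <- function(x, ...) {
  usr <- par("usr")
  old <- par(usr = c(usr[1:2], 0, 1.5)); on.exit(par(old))
  d <- density(x)
  lines(d$x, d$y / max(d$y))
}

# Lower triangle: scatter plot with a lowess smooth; points filled by group.
pairs(sim_m,
      lower.panel = panel.smooth,
      upper.panel = panel_cor,
      diag.panel  = panel_density,
      pch = 21, bg = c("steelblue", "tomato")[group])
```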
Best Practices for Statistical Visualization
1. Choose Appropriate Plot Types
- Continuous vs. Categorical: Box plots, violin plots
- Two Continuous: Scatter plots with regression
- Multiple Groups: Faceted plots
- Correlations: Heatmaps with significance
2. Statistical Rigor
- Always show confidence intervals
- Report exact p-values
- Include effect sizes
- Adjust for multiple comparisons
Integration with Research Workflow
Publication Pipeline
# 1. Exploratory Analysis
#    - Use jjstatsplot for initial visualization
#    - Identify patterns and outliers
#
# 2. Statistical Testing
#    - Integrated tests in all plots
#    - Effect sizes automatically calculated
#
# 3. Publication Preparation
#    - Export high-quality figures
#    - Extract statistical results
#    - Create supplementary materials
#
# 4. Presentation
#    - Use consistent themes
#    - Create plot series
#    - Generate summary slides
Advanced Features
Conclusion
The jjstatsplot module transforms statistical analysis into publication-ready visualizations by:
- Combining beautiful plots with rigorous statistics
- Providing appropriate tests for each data type
- Including effect sizes and confidence intervals
- Supporting complex multi-group comparisons
- Offering extensive customization options
Used with OncoDataSets, researchers can quickly create
compelling visualizations that effectively communicate their findings
while maintaining statistical integrity.
References
- OncoDataSets package: https://cran.r-project.org/package=OncoDataSets
- jjstatsplot documentation: https://www.serdarbalci.com/jjstatsplot/
- ggstatsplot (underlying package): https://indrajeetpatil.github.io/ggstatsplot/