Clinical Heatmap: Advanced Visualization for Biomedical Data
ClinicoPath Development Team
2025-10-09
Source: vignettes/clinicalheatmap_comprehensive.Rmd
Introduction
The Clinical Heatmap module uses the tidyheatmaps package to visualize
multivariate clinical and biomedical data. This guide demonstrates how
to create publication-ready heatmaps for several clinical research
applications.
Key Features:
- Tidy Data Integration: Works directly with long-format clinical datasets
- Clinical Annotations: Row and column annotations for patient/biomarker characteristics
- Flexible Scaling: Multiple normalization methods for different data types
- Advanced Clustering: Hierarchical clustering to reveal data patterns
- Publication Ready: High-quality outputs with customizable aesthetics
When to Use Clinical Heatmaps
Clinical heatmaps are particularly valuable for:
- Biomarker Expression Profiling: Visualizing multi-marker panels across patient cohorts
- Genomic Data Analysis: Gene expression matrices and mutation landscapes
- Quality Control Assessment: Batch effects and instrument performance monitoring
- Treatment Response Patterns: Longitudinal measurements and therapeutic outcomes
- Precision Medicine Applications: Molecular subtyping and therapeutic target identification
Data Format Requirements
The Clinical Heatmap function expects data in tidy (long) format with three essential columns:
- Row Variable: Defines heatmap rows (e.g., patient IDs, gene names, samples)
- Column Variable: Defines heatmap columns (e.g., biomarkers, time points, treatments)
- Value Variable: Numeric values to visualize (e.g., expression levels, scores, measurements)
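Clinical data often arrive in wide format instead, with one row per patient and one column per biomarker. A minimal sketch of reshaping such a table into the three-column layout described above, assuming the tidyr package is available; the table `wide_data` and its column names are hypothetical:

```r
library(tidyr)

# Hypothetical wide table: one row per patient, one column per biomarker
wide_data <- data.frame(
  patient_id = paste0("Patient_", 1:3),
  ER = c(80, 65, 10),
  PR = c(70, 40, 5),
  HER2 = c(15, 20, 90)
)

# Stack the biomarker columns into a name column and a value column
long_data <- pivot_longer(
  wide_data,
  cols = c(ER, PR, HER2),
  names_to = "biomarker",
  values_to = "expression_score"
)
long_data
```

Here `patient_id` would serve as the row variable, `biomarker` as the column variable, and `expression_score` as the value variable.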
# Example of proper tidy format for clinical heatmaps
example_data <- data.frame(
  patient_id = rep(paste0("Patient_", 1:20), each = 5),
  biomarker = rep(c("ER", "PR", "HER2", "Ki67", "p53"), 20),
  expression_score = rnorm(100, mean = 50, sd = 15),
  # Patient-level annotations: one value per patient, repeated across that patient's 5 rows
  tumor_stage = rep(rep(c("I", "II", "III", "IV"), length.out = 20), each = 5),
  treatment = rep(rep(c("ChemoA", "ChemoB", "Targeted"), length.out = 20), each = 5)
)
head(example_data)
Application 1: Biomarker Expression Profiling
Basic Biomarker Heatmap
Let’s start with a simple biomarker expression heatmap using clinical data:
# clinicalheatmap() is assumed to be provided by the ClinicoPath package
library(ClinicoPath)
library(dplyr) # provides %>%, mutate(), case_when(), and n()

# Create sample biomarker expression data
set.seed(123)
biomarker_data <- expand.grid(
  patient_id = paste0("P", sprintf("%03d", 1:50)),
  biomarker = c("ER", "PR", "HER2", "Ki67", "p53", "EGFR", "VEGF", "CD31")
) %>%
  mutate(
    expression_level = case_when(
      biomarker %in% c("ER", "PR") ~ rnorm(n(), mean = 75, sd = 20),
      biomarker == "HER2" ~ rnorm(n(), mean = 25, sd = 15),
      biomarker == "Ki67" ~ rnorm(n(), mean = 40, sd = 25),
      TRUE ~ rnorm(n(), mean = 50, sd = 20)
    ),
    # Add clinical annotations: one value per patient, repeated across biomarkers
    tumor_type = rep(c("Luminal A", "Luminal B", "HER2+", "Triple Negative", "Other"),
                     length.out = nlevels(patient_id))[as.integer(patient_id)],
    grade = rep(c("Grade 1", "Grade 2", "Grade 3"),
                length.out = nlevels(patient_id))[as.integer(patient_id)]
  ) %>%
  # Ensure realistic expression ranges
  mutate(expression_level = pmax(0, pmin(100, expression_level)))
# Basic heatmap without scaling
clinicalheatmap(
  data = biomarker_data,
  rowVar = "patient_id",
  colVar = "biomarker",
  valueVar = "expression_level",
  colorPalette = "RdBu",
  showDataSummary = TRUE
)
Enhanced Biomarker Heatmap with Annotations
# Enhanced heatmap with clinical annotations and scaling
clinicalheatmap(
  data = biomarker_data,
  rowVar = "patient_id",
  colVar = "biomarker",
  valueVar = "expression_level",
  annotationCols = c("tumor_type", "grade"),
  scaleMethod = "row", # Z-score scaling within each patient
  clusterRows = TRUE,
  clusterCols = TRUE,
  colorPalette = "viridis",
  showRownames = FALSE, # Hide patient IDs for cleaner visualization
  showColnames = TRUE,
  showDataSummary = TRUE,
  showInterpretation = TRUE
)
Application 2: Genomic Data Visualization
Gene Expression Heatmap
# Create sample gene expression data
set.seed(456)
gene_data <- expand.grid(
  sample_id = paste0("Sample_", sprintf("%02d", 1:30)),
  gene = paste0("Gene_", LETTERS[1:15])
) %>%
  mutate(
    # Simulate different expression patterns
    log2_expression = case_when(
      gene %in% paste0("Gene_", c("A", "B", "C")) ~ rnorm(n(), mean = 8, sd = 1.5),
      gene %in% paste0("Gene_", c("D", "E", "F")) ~ rnorm(n(), mean = 6, sd = 1),
      gene %in% paste0("Gene_", c("G", "H", "I")) ~ rnorm(n(), mean = 4, sd = 2),
      TRUE ~ rnorm(n(), mean = 5, sd = 1.5)
    ),
    # Add sample annotations
    cancer_type = rep(c("Type A", "Type B", "Type C"), length.out = n()),
    mutation_status = rep(c("Wild-type", "Mutated"), length.out = n()),
    treatment_response = rep(c("Responder", "Non-responder"), length.out = n())
  )
# Gene expression heatmap with column scaling
clinicalheatmap(
  data = gene_data,
  rowVar = "sample_id",
  colVar = "gene",
  valueVar = "log2_expression",
  annotationCols = c("cancer_type", "mutation_status", "treatment_response"),
  scaleMethod = "column", # Z-score scaling within each gene
  clusterRows = TRUE,
  clusterCols = TRUE,
  colorPalette = "plasma",
  showDataSummary = TRUE
)
Application 3: Quality Control Monitoring
Batch Effect Visualization
# Create sample quality control data showing batch effects
set.seed(789)
qc_data <- expand.grid(
  sample_id = paste0("QC_", sprintf("%03d", 1:40)),
  assay = c("Assay_1", "Assay_2", "Assay_3", "Assay_4", "Assay_5", "Assay_6")
) %>%
  mutate(
    # Sample-level annotations: one value per sample, repeated across assays
    batch = rep(paste0("Batch_", 1:4),
                length.out = nlevels(sample_id))[as.integer(sample_id)],
    # Simulate batch effects
    measurement = case_when(
      batch == "Batch_1" ~ rnorm(n(), mean = 100, sd = 10),
      batch == "Batch_2" ~ rnorm(n(), mean = 105, sd = 12),
      batch == "Batch_3" ~ rnorm(n(), mean = 95, sd = 8),
      batch == "Batch_4" ~ rnorm(n(), mean = 102, sd = 15)
    ),
    instrument = rep(c("Instrument_A", "Instrument_B"),
                     length.out = nlevels(sample_id))[as.integer(sample_id)],
    technician = rep(c("Tech_1", "Tech_2", "Tech_3"),
                     length.out = nlevels(sample_id))[as.integer(sample_id)]
  )
# QC heatmap to identify batch effects
clinicalheatmap(
  data = qc_data,
  rowVar = "sample_id",
  colVar = "assay",
  valueVar = "measurement",
  annotationCols = c("batch", "instrument", "technician"),
  scaleMethod = "column", # Standardize each assay
  clusterRows = TRUE,
  clusterCols = FALSE, # Don't cluster assays to maintain order
  colorPalette = "RdYlBu",
  showDataSummary = TRUE,
  showInterpretation = TRUE
)
Application 4: Treatment Response Analysis
Longitudinal Treatment Response
# Create longitudinal treatment response data
set.seed(101112)
response_data <- expand.grid(
  patient_id = paste0("PT_", sprintf("%02d", 1:25)),
  timepoint = c("Baseline", "Week_4", "Week_8", "Week_12", "Week_24")
) %>%
  mutate(
    # Simulate different response patterns
    response_score = case_when(
      timepoint == "Baseline" ~ rnorm(n(), mean = 100, sd = 15),
      timepoint == "Week_4" ~ rnorm(n(), mean = 85, sd = 20),
      timepoint == "Week_8" ~ rnorm(n(), mean = 70, sd = 25),
      timepoint == "Week_12" ~ rnorm(n(), mean = 60, sd = 30),
      timepoint == "Week_24" ~ rnorm(n(), mean = 50, sd = 35)
    ),
    # Add patient characteristics: one value per patient, repeated across time points
    treatment_arm = rep(c("Treatment_A", "Treatment_B", "Placebo"),
                        length.out = nlevels(patient_id))[as.integer(patient_id)],
    baseline_severity = rep(c("Mild", "Moderate", "Severe"),
                            length.out = nlevels(patient_id))[as.integer(patient_id)],
    age_group = rep(c("Young", "Middle", "Elderly"),
                    length.out = nlevels(patient_id))[as.integer(patient_id)]
  ) %>%
  # Ensure realistic score ranges
  mutate(response_score = pmax(0, pmin(150, response_score)))
# Treatment response heatmap
clinicalheatmap(
  data = response_data,
  rowVar = "patient_id",
  colVar = "timepoint",
  valueVar = "response_score",
  annotationCols = c("treatment_arm", "baseline_severity", "age_group"),
  scaleMethod = "row", # Z-score within each patient to show that patient's trajectory over time
  clusterRows = TRUE,
  clusterCols = FALSE, # Maintain temporal order
  colorPalette = "inferno",
  showDataSummary = TRUE,
  showInterpretation = TRUE
)
Advanced Features
Missing Data Handling
# Create data with missing values
set.seed(321) # make the missingness pattern reproducible
missing_data <- biomarker_data %>%
  # Introduce random missing values (about 15% of cells)
  mutate(
    expression_level = ifelse(runif(n()) < 0.15, NA, expression_level)
  )
# Heatmap with median imputation of missing values
clinicalheatmap(
  data = missing_data,
  rowVar = "patient_id",
  colVar = "biomarker",
  valueVar = "expression_level",
  naHandling = "median", # Replace with median values
  scaleMethod = "column",
  colorPalette = "Blues",
  showDataSummary = TRUE
)
Custom Export Settings
# Heatmap optimized for publication
clinicalheatmap(
  data = biomarker_data,
  rowVar = "patient_id",
  colVar = "biomarker",
  valueVar = "expression_level",
  annotationCols = "tumor_type",
  scaleMethod = "row",
  clusterRows = TRUE,
  clusterCols = TRUE,
  colorPalette = "RdBu",
  showRownames = FALSE,
  showColnames = TRUE,
  exportWidth = 12, # Wider for publication
  exportHeight = 8, # Taller for better readability
  showDataSummary = FALSE, # Clean output for publication
  showInterpretation = FALSE
)
Interpretation Guidelines
Understanding Heatmap Patterns
When interpreting clinical heatmaps, consider:
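1. Color Intensity
- High intensity: Strong signal or high expression (with diverging palettes such as RdBu, both ends of the scale are intense and the midpoint is neutral)
- Low intensity: Weak signal or low expression
- Scale-dependent: After row or column scaling, colors show z-scores relative to that row's or column's mean rather than raw values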
2. Clustering Patterns
- Row clusters: Groups of patients/samples with similar profiles
- Column clusters: Related biomarkers or measurements
- Block patterns: Coordinated regulation or shared biology
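To make the idea of row clusters concrete, the following base R sketch clusters a small hypothetical matrix. It assumes Euclidean distance and complete linkage, which are the defaults of `dist()` and `hclust()`; the distance and linkage used internally by the Clinical Heatmap module are not stated in this guide and may differ.

```r
# Small hypothetical matrix: 6 patients (rows) x 3 biomarkers (columns)
set.seed(1)
m <- rbind(
  matrix(rnorm(9, mean = 20, sd = 2), nrow = 3), # low-expression patients
  matrix(rnorm(9, mean = 80, sd = 2), nrow = 3)  # high-expression patients
)
rownames(m) <- paste0("P", 1:6)
colnames(m) <- c("ER", "PR", "HER2")

# Hierarchical clustering of rows (Euclidean distance, complete linkage)
row_tree <- hclust(dist(m))

# Cut the dendrogram into two groups of patients
row_groups <- cutree(row_tree, k = 2)
row_groups
```

Patients that land in the same group are the ones a heatmap would draw next to each other under one branch of the row dendrogram.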
Best Practices
Data Preparation
- Quality Control: Remove low-quality samples and unreliable measurements
- Normalization: Apply appropriate scaling based on data type and research question
- Annotation: Include relevant clinical and technical metadata
- Documentation: Record data processing steps for reproducibility
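Two checks are often worth running on long-format data before plotting: whether any row/column combination appears more than once, and how much of each column is missing. The sketch below uses dplyr on a hypothetical data frame; the object and column names are illustrative and not part of the module.

```r
library(dplyr)

# Hypothetical long-format data with one duplicated cell and one missing value
long_data <- data.frame(
  patient_id = c("P1", "P1", "P1", "P2", "P2"),
  biomarker  = c("ER", "PR", "PR", "ER", "PR"),
  value      = c(80, 60, 62, NA, 35)
)

# Row/column combinations that occur more than once
duplicated_cells <- long_data %>%
  count(patient_id, biomarker, name = "n_rows") %>%
  filter(n_rows > 1)

# Proportion of missing values per heatmap column
missing_by_column <- long_data %>%
  group_by(biomarker) %>%
  summarise(prop_missing = mean(is.na(value)), .groups = "drop")

duplicated_cells
missing_by_column
```

Duplicated cells usually need to be aggregated or removed before plotting, since a heatmap cell can show only one value.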
Visualization Design
- Color Choice: Use colorblind-friendly palettes for accessibility
- Scale Selection: Choose scaling method appropriate for your research question
- Clustering: Consider whether hierarchical clustering adds meaningful information
- Annotation: Balance information content with visual clarity
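The effect of the scaling choice can be seen on a tiny matrix. This sketch assumes that row and column scaling mean z-scores, as the code comments in this guide indicate, and reproduces them with base R's `scale()`; it illustrates the concept and is not the module's internal code.

```r
# Hypothetical matrix: 2 patients (rows) x 3 biomarkers (columns)
m <- matrix(
  c(10, 20, 30,
    100, 200, 300),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("P1", "P2"), c("ER", "PR", "HER2"))
)

# Row scaling: each patient gets mean 0 and SD 1 across biomarkers
row_scaled <- t(scale(t(m)))

# Column scaling: each biomarker gets mean 0 and SD 1 across patients
col_scaled <- scale(m)

row_scaled
col_scaled
```

After row scaling, the two patients look identical because they share the same relative profile, even though P2's raw values are ten times larger. Column scaling keeps the difference between patients but removes differences in overall level between biomarkers.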
Statistical Considerations
- Multiple Testing: Consider correction for multiple comparisons if testing hypotheses
- Effect Size: Focus on clinically meaningful differences, not just statistical significance
- Validation: Confirm patterns in independent datasets when possible
- Interpretation: Remember that heatmaps show associations, not causation
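If patterns seen in a heatmap are followed up with one test per biomarker, the resulting p-values can be adjusted with base R's `p.adjust()`. The sketch below simulates hypothetical data with no true group difference and applies the Benjamini-Hochberg method; the marker names, group labels, and choice of t-test are assumptions made for illustration.

```r
set.seed(42)

# Hypothetical design: 8 biomarkers, 20 responders and 20 non-responders
biomarkers <- paste0("Marker_", 1:8)
group <- rep(c("Responder", "Non-responder"), each = 20)

# One two-sample t-test per biomarker on simulated values
p_values <- sapply(biomarkers, function(b) {
  values <- rnorm(40, mean = 50, sd = 10)
  t.test(values ~ group)$p.value
})

# Benjamini-Hochberg adjustment controls the false discovery rate
p_adjusted <- p.adjust(p_values, method = "BH")
round(cbind(raw = p_values, adjusted = p_adjusted), 3)
```

Adjusted p-values are never smaller than the raw ones, so findings that survive adjustment are less likely to be artifacts of testing many biomarkers at once.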
Clinical Applications Summary
The Clinical Heatmap module is particularly powerful for:
- Precision Medicine: Identifying molecular subtypes and therapeutic targets
- Clinical Trials: Visualizing treatment response patterns and biomarker changes
- Diagnostic Development: Profiling biomarker panels for disease classification
- Quality Assurance: Monitoring laboratory performance and identifying batch effects
- Research Publication: Creating publication-ready visualizations of complex datasets
Citation
When using the Clinical Heatmap module in publications, please cite:
ClinicoPath Clinical Heatmap module, powered by tidyheatmaps package for advanced biomedical data visualization. Available at: https://github.com/sbalci/ClinicoPathJamoviModule
For the underlying tidyheatmaps package, please also cite the original package documentation.