jjtreemap: Comprehensive Treemap Visualization
ClinicoPath
2025-07-13
Source:vignettes/jjstatsplot-33-jjtreemap-comprehensive.Rmd
jjstatsplot-33-jjtreemap-comprehensive.Rmd
Introduction to jjtreemap
The jjtreemap
function is a powerful wrapper around the
treemap
and ggplot2
R packages that creates
hierarchical treemap visualizations for categorical data. Treemaps
display hierarchical data as nested rectangles, where the area of each
rectangle is proportional to a quantitative value, making them ideal for
visualizing part-to-whole relationships, portfolio compositions, market
share analysis, and budget allocations.
Key Features
- Hierarchical Visualization: Display nested categorical data structures
- Size Mapping: Rectangle areas represent quantitative values
- Color Coding: Additional categorical or quantitative dimensions through color
- Flexible Labeling: Customizable text display with size, color, and alignment options
- Performance Optimized: Enhanced caching and data preparation for faster rendering
- Publication-Ready: High-quality outputs suitable for presentations and reports
Basic Treemap Creation
Simple Category Visualization
Let’s start with a basic treemap showing market share by company:
# Create sample market share data
market_data <- data.frame(
company = factor(c("TechCorp", "DataSoft", "CloudNet", "AIWorks", "SecureIT", "WebDev")),
market_share = c(28.5, 22.3, 18.7, 12.5, 10.2, 7.8),
sector = factor(c("Software", "Software", "Cloud", "AI/ML", "Security", "Web"))
)
# Create basic treemap
result_basic <- jjtreemap(
data = market_data,
group = "company",
size = "market_share",
showLabels = TRUE,
labelSize = 6
)
print(result_basic)
Understanding Treemap Components
Group Variable (Categories)
- Defines the rectangles in the treemap
- Can be hierarchical (multiple levels)
- Each unique value becomes a separate rectangle
Styling and Customization
Border Customization
Control the appearance of rectangle borders:
# Create data with hierarchical structure
dept_budget <- data.frame(
department = factor(c("R&D", "Marketing", "Sales", "Operations", "HR", "IT")),
budget = c(45.2, 32.5, 28.7, 38.9, 12.3, 22.5),
division = factor(c("Innovation", "Growth", "Growth", "Core", "Support", "Infrastructure"))
)
# Treemap with custom borders
result_borders <- jjtreemap(
data = dept_budget,
group = "department",
size = "budget",
color = "division",
borderWidth = 1.5,
borderLevel1Width = 2,
borderLevel2Width = 0.5,
borderLevel1Color = "darkblue",
borderLevel2Color = "lightgray",
showLabels = TRUE
)
print(result_borders)
Color Palettes and Themes
# Product portfolio with color coding
product_data <- data.frame(
product = factor(c("Smartphones", "Laptops", "Tablets", "Headphones",
"Smartwatches", "Cameras", "Speakers", "Monitors")),
revenue = c(450, 380, 220, 180, 150, 120, 98, 85),
category = factor(c("Mobile", "Computing", "Mobile", "Audio",
"Wearables", "Imaging", "Audio", "Computing"))
)
# Treemap with category colors
result_colored <- jjtreemap(
data = product_data,
group = "product",
size = "revenue",
color = "category",
showLabels = TRUE,
labelSize = 6,
title = "Product Revenue by Category",
subtitle = "2023 Annual Report",
caption = "Values in millions USD"
)
print(result_colored)
Real-World Applications
Market Share Analysis
# Create realistic market share data
tech_market <- data.frame(
company = factor(c("Apple", "Samsung", "Google", "Microsoft", "Amazon",
"Meta", "Tesla", "NVIDIA", "Intel", "Oracle")),
market_cap = c(2850, 1450, 1680, 2450, 1580,
890, 780, 1100, 190, 280),
sector = factor(c("Consumer Tech", "Consumer Tech", "Internet", "Software", "E-commerce",
"Social Media", "Automotive", "Semiconductors", "Semiconductors", "Enterprise"))
)
# Market capitalization treemap
result_market <- jjtreemap(
data = tech_market,
group = "company",
size = "market_cap",
color = "sector",
showLabels = TRUE,
labelSize = 8,
title = "Tech Giants Market Capitalization",
subtitle = "By Sector Classification",
caption = "Market cap in billions USD",
aspectRatio = 1.4
)
print(result_market)
Budget Allocation Visualization
# Government budget example
gov_budget <- data.frame(
category = factor(c("Healthcare", "Education", "Defense", "Social Security",
"Infrastructure", "Science & Tech", "Environment",
"Agriculture", "Justice", "Other")),
allocation = c(28.5, 22.3, 18.7, 25.2, 12.5, 8.3, 6.7, 5.8, 4.2, 3.8),
type = factor(c("Social", "Social", "Security", "Social",
"Infrastructure", "Research", "Environment",
"Economic", "Administration", "Various"))
)
# Budget treemap with custom styling
result_budget <- jjtreemap(
data = gov_budget,
group = "category",
size = "allocation",
color = "type",
showLabels = TRUE,
labelSize = 7,
labelFontFace = "bold",
title = "Federal Budget Allocation",
subtitle = "Fiscal Year 2024",
caption = "Percentages of total budget"
)
print(result_budget)
Portfolio Composition
# Investment portfolio breakdown
portfolio <- data.frame(
asset = factor(c("US Stocks", "International Stocks", "Bonds", "Real Estate",
"Commodities", "Cash", "Crypto", "Private Equity")),
value = c(45000, 28000, 35000, 22000, 12000, 8000, 5000, 15000),
risk_level = factor(c("High", "High", "Low", "Medium",
"High", "Low", "Very High", "High"))
)
# Portfolio treemap with risk coloring
result_portfolio <- jjtreemap(
data = portfolio,
group = "asset",
size = "value",
color = "risk_level",
showLabels = TRUE,
labelSize = 8,
title = "Investment Portfolio Composition",
subtitle = "Total Value: $170,000",
caption = "Color indicates risk level"
)
print(result_portfolio)
Advanced Customization
Aspect Ratio Control
# Test different aspect ratios
sales_data <- data.frame(
region = factor(c("North", "South", "East", "West", "Central")),
sales = c(120, 95, 110, 88, 102)
)
# Wide aspect ratio
result_wide <- jjtreemap(
data = sales_data,
group = "region",
size = "sales",
aspectRatio = 2.5,
showLabels = TRUE,
title = "Wide Aspect Ratio (2.5)"
)
print(result_wide)
# Square aspect ratio
result_square <- jjtreemap(
data = sales_data,
group = "region",
size = "sales",
aspectRatio = 1,
showLabels = TRUE,
title = "Square Aspect Ratio (1.0)"
)
print(result_square)
Handling Small Values
# Data with very different scales
diverse_data <- data.frame(
category = factor(c("Giant", "Large", "Medium", "Small", "Tiny", "Microscopic")),
value = c(1000, 200, 50, 10, 2, 0.5)
)
# Treemap handles extreme differences
result_diverse <- jjtreemap(
data = diverse_data,
group = "category",
size = "value",
showLabels = TRUE,
labelSize = 4,
labelOverlap = 0.8, # Allow more overlap for small rectangles
title = "Handling Extreme Value Differences"
)
print(result_diverse)
Label Visibility Control
# Many categories - label management
many_categories <- data.frame(
item = factor(paste0("Item_", LETTERS[1:20])),
value = sort(runif(20, 10, 100), decreasing = TRUE)
)
# Control label display
result_many <- jjtreemap(
data = many_categories,
group = "item",
size = "value",
showLabels = TRUE,
labelSize = 4, # Minimum size for readability
labelOverlap = 0.3, # Less overlap tolerance
title = "Many Categories with Smart Labeling"
)
print(result_many)
Clinical and Research Applications
Clinical Trial Enrollment
# Clinical trial sites and enrollment
trial_sites <- data.frame(
site = factor(paste0("Site_", sprintf("%02d", 1:12))),
enrolled = c(125, 98, 87, 76, 72, 68, 65, 58, 52, 48, 45, 42),
region = factor(c(rep("North America", 3), rep("Europe", 3),
rep("Asia Pacific", 3), rep("Latin America", 3))),
site_type = factor(c("Academic", "Community", "Private", "Academic",
"Community", "Academic", "Private", "Community",
"Academic", "Community", "Private", "Academic"))
)
# Enrollment treemap
result_clinical <- jjtreemap(
data = trial_sites,
group = "site",
size = "enrolled",
color = "region",
showLabels = TRUE,
labelSize = 6,
title = "Clinical Trial Enrollment by Site",
subtitle = "Phase III Multi-Center Study",
caption = "Total enrolled: 866 patients"
)
print(result_clinical)
Research Funding Distribution
# Research grant distribution
research_grants <- data.frame(
department = factor(c("Oncology", "Cardiology", "Neurology", "Immunology",
"Genetics", "Infectious Disease", "Pediatrics", "Surgery")),
funding = c(12.5, 10.2, 9.8, 8.5, 7.2, 6.5, 5.8, 4.5),
grant_type = factor(c("Federal", "Federal", "Mixed", "Private",
"Federal", "Mixed", "State", "Private"))
)
# Funding treemap
result_research <- jjtreemap(
data = research_grants,
group = "department",
size = "funding",
color = "grant_type",
showLabels = TRUE,
labelSize = 7,
labelFontFace = "bold",
title = "Research Funding Distribution",
subtitle = "Academic Medical Center FY2024",
caption = "Values in millions USD"
)
print(result_research)
Data Preparation Best Practices
Aggregating Data
# Raw transaction data
raw_sales <- data.frame(
product = sample(c("A", "B", "C", "D"), 100, replace = TRUE),
region = sample(c("North", "South", "East", "West"), 100, replace = TRUE),
sales = runif(100, 10, 100)
)
# Aggregate before treemap
agg_sales <- raw_sales %>%
group_by(product, region) %>%
summarise(total_sales = sum(sales), .groups = 'drop') %>%
arrange(desc(total_sales))
# Display top aggregated data
head(agg_sales, 10)
# Create treemap from aggregated data
result_agg <- jjtreemap(
data = agg_sales,
group = "product",
size = "total_sales",
color = "region",
showLabels = TRUE,
title = "Aggregated Sales by Product"
)
print(result_agg)
Handling Negative Values
# Data with negative values (profits/losses)
profit_data <- data.frame(
division = factor(c("Electronics", "Software", "Services", "Hardware",
"Consulting", "Support")),
profit = c(25.5, 18.3, -5.2, 12.7, -2.1, 8.5)
)
# Function automatically converts negatives to small positive values
result_profit <- jjtreemap(
data = profit_data,
group = "division",
size = "profit",
showLabels = TRUE,
labelSize = 8,
title = "Division Performance",
subtitle = "Note: Negative values shown as minimal size",
caption = "Original negative values: Services (-5.2), Consulting (-2.1)"
)
print(result_profit)
Hierarchical Data Preparation
# Prepare hierarchical data structure
hierarchy_data <- data.frame(
main_category = factor(rep(c("Electronics", "Clothing", "Food"), each = 3)),
sub_category = factor(c("Phones", "Laptops", "Tablets",
"Shirts", "Pants", "Shoes",
"Fruits", "Vegetables", "Dairy")),
sales = c(150, 120, 80, 60, 70, 90, 45, 38, 52)
)
# Create treemap with hierarchy indication through colors
result_hierarchy <- jjtreemap(
data = hierarchy_data,
group = "sub_category",
size = "sales",
color = "main_category",
showLabels = TRUE,
labelSize = 6,
title = "Sales by Category and Subcategory",
subtitle = "Color indicates main category"
)
print(result_hierarchy)
Performance Optimization
Large Dataset Handling
The function includes several performance optimizations:
# Performance test with larger dataset
large_data <- data.frame(
category = factor(paste0("Category_", 1:50)),
value = runif(50, 100, 10000),
group = factor(rep(paste0("Group_", LETTERS[1:5]), each = 10))
)
# This should render efficiently due to optimizations
start_time <- Sys.time()
performance_result <- jjtreemap(
data = large_data,
group = "category",
size = "value",
color = "group",
showLabels = TRUE
)
end_time <- Sys.time()
cat("Rendering time:", difftime(end_time, start_time, units = "secs"), "seconds\n")
print(performance_result)
Optimization Features
The function implements several performance enhancements:
- Data Preparation Caching: Processed data is cached to avoid recomputation
- Option Preprocessing: Common option processing is done once and cached
- Treemap Data Caching: The treemap calculation is cached and reused
- Hash-based Change Detection: Only reprocesses when inputs change
- Efficient Memory Usage: Minimizes data copying and transformation overhead
Troubleshooting Common Issues
Label Visibility
# Small rectangles with labels
small_rect_data <- data.frame(
item = factor(c("Large", "Medium", "Small", "Tiny", "Micro")),
value = c(100, 30, 10, 3, 1)
)
# Solution 1: Adjust minimum label size
result_min_label <- jjtreemap(
data = small_rect_data,
group = "item",
size = "value",
showLabels = TRUE,
labelSize = 3, # Smaller minimum size
title = "Solution: Smaller Minimum Label Size"
)
print(result_min_label)
# Solution 2: Hide labels selectively
result_no_labels <- jjtreemap(
data = small_rect_data,
group = "item",
size = "value",
showLabels = FALSE, # Hide labels for cleaner look
title = "Solution: Hide Labels for Small Items"
)
print(result_no_labels)
Color Contrast
# Ensure good contrast between labels and backgrounds
contrast_data <- data.frame(
category = factor(c("A", "B", "C", "D")),
value = c(40, 30, 20, 10),
type = factor(c("Dark", "Dark", "Light", "Light"))
)
# Adjust label colors for contrast
result_contrast <- jjtreemap(
data = contrast_data,
group = "category",
size = "value",
color = "type",
showLabels = TRUE,
labelLevel1Color = "black", # Dark labels
labelBackground = "rgba(255,255,255,0.7)", # Semi-transparent white background
title = "Improved Label Contrast"
)
print(result_contrast)
Data Validation
# Function to validate treemap data
validate_treemap_data <- function(data, group_var, size_var) {
errors <- c()
# Check if variables exist
if (!group_var %in% names(data)) {
errors <- c(errors, "Group variable not found in data")
}
if (!size_var %in% names(data)) {
errors <- c(errors, "Size variable not found in data")
}
if (length(errors) > 0) return(errors)
# Check data types
if (!is.numeric(data[[size_var]])) {
errors <- c(errors, "Size variable must be numeric")
}
# Check for negative values
if (any(data[[size_var]] < 0, na.rm = TRUE)) {
errors <- c(errors, "Warning: Negative values will be converted to 0.01")
}
# Check for missing values
complete_rows <- sum(complete.cases(data[c(group_var, size_var)]))
if (complete_rows == 0) {
errors <- c(errors, "No complete data rows")
}
if (length(errors) == 0) {
return("Data validation passed!")
} else {
return(errors)
}
}
# Test validation
test_data <- data.frame(
category = c("A", "B", "C"),
value = c(10, 20, 30)
)
validate_treemap_data(test_data, "category", "value")
Best Practices and Recommendations
Design Guidelines
- Hierarchy Levels: Limit to 2-3 levels for clarity
- Color Usage: Use color to represent meaningful categories
- Label Density: Show labels only for significant rectangles
- Aspect Ratio: Choose based on display medium (wide for presentations, square for reports)
- Border Width: Use thicker borders for main categories
Data Preparation Tips
- Aggregate First: Always aggregate data to appropriate level
- Handle Negatives: Convert negative values or use alternative visualization
- Sort by Size: Larger values create more visually prominent rectangles
- Limit Categories: 5-15 categories work best for readability
- Use Meaningful Names: Short, descriptive category names
Visual Hierarchy
# Example of good visual hierarchy
hierarchy_example <- data.frame(
category = factor(c("Primary A", "Primary B", "Secondary C",
"Secondary D", "Minor E", "Minor F")),
value = c(350, 280, 120, 95, 45, 30),
importance = factor(c("High", "High", "Medium", "Medium", "Low", "Low"))
)
result_hierarchy <- jjtreemap(
data = hierarchy_example,
group = "category",
size = "value",
color = "importance",
showLabels = TRUE,
labelSize = 6,
borderWidth = 1.5,
title = "Visual Hierarchy in Treemap Design",
subtitle = "Size and color reinforce importance"
)
print(result_hierarchy)
Integration with Reporting Workflows
Export-Ready Plots
# High-quality treemap for reports
report_data <- data.frame(
metric = factor(c("Revenue", "Costs", "R&D", "Marketing", "Operations")),
value = c(150, 85, 25, 18, 42),
category = factor(c("Income", "Expense", "Investment", "Investment", "Expense"))
)
# Publication-quality treemap
result_publication <- jjtreemap(
data = report_data,
group = "metric",
size = "value",
color = "category",
showLabels = TRUE,
labelSize = 10,
labelFontFace = "bold",
borderWidth = 2,
title = "Financial Overview FY2024",
subtitle = "All values in millions USD",
caption = "Source: Annual Financial Report"
)
print(result_publication)
Summary
The jjtreemap
function provides a comprehensive solution
for hierarchical data visualization with:
- Flexible customization for borders, labels, and colors
- Performance optimizations for efficient rendering
- Robust error handling and data validation
- Publication-ready output quality
- Wide applicability across business, research, and clinical domains
Treemaps are particularly effective for: - Part-to-whole relationships - Portfolio composition - Budget allocation - Market share analysis - Resource distribution - Hierarchical categorical data
Function Reference
For complete parameter documentation, see the treemap and ggplot2 package documentation: - CRAN treemap documentation - ggplot2 documentation
# Session information
sessionInfo()