Hull Plot Visualization for Group Analysis
ClinicoPath
2025-07-13
Source:vignettes/clinicopath-descriptives-23-hullplot-comprehensive.Rmd
clinicopath-descriptives-23-hullplot-comprehensive.Rmd
Introduction
The hullplot
function in ClinicoPath provides powerful
cluster visualization capabilities using hull polygons to show group
boundaries in scatter plots. This function is built on the ggforce
package and is perfect for identifying customer segments, patient
subgroups, experimental conditions, and quality control clusters.
Key Features
- Hull Polygons: Create convex or concave boundaries around grouped data points
- Multiple Variables: Support for X/Y axes, grouping, coloring, and sizing variables
- Customizable Appearance: Multiple color palettes, themes, and styling options
- Statistical Analysis: Group statistics, outlier detection, and confidence ellipses
- Professional Output: Publication-ready plots for research and presentations
Loading Test Datasets
ClinicoPath provides comprehensive test datasets specifically designed for hull plot visualization:
# Load all hullplot test datasets
data(hullplot_customer_data, package = "ClinicoPath")
data(hullplot_clinical_data, package = "ClinicoPath")
data(hullplot_experimental_data, package = "ClinicoPath")
data(hullplot_survey_data, package = "ClinicoPath")
data(hullplot_quality_data, package = "ClinicoPath")
# Display dataset dimensions
cat("Dataset Dimensions:\n")
## Dataset Dimensions:
## Customer Data: 500 × 10
## Clinical Data: 400 × 11
cat("Experimental Data:", nrow(hullplot_experimental_data), "×", ncol(hullplot_experimental_data), "\n")
## Experimental Data: 350 × 11
## Survey Data: 450 × 11
## Quality Data: 300 × 12
Basic Hull Plots
Simple Hull Plot
Let’s start with a basic hull plot showing customer segmentation:
# Create a basic hull plot
result1 <- ClinicoPath::hullplot(
data = hullplot_customer_data,
x_var = "annual_spending",
y_var = "purchase_frequency",
group_var = "segment"
)
# Display basic information about the plot
cat("Basic Hull Plot Created Successfully\n")
## Basic Hull Plot Created Successfully
## Customer segments: 4
## Data points: 500
Enhanced Hull Plot with Color Mapping
Add color mapping for better visualization:
# Create an enhanced hull plot with color mapping
result2 <- ClinicoPath::hullplot(
data = hullplot_customer_data,
x_var = "annual_spending",
y_var = "purchase_frequency",
group_var = "segment",
color_var = "preferred_category",
size_var = "satisfaction",
hull_alpha = 0.3,
show_labels = TRUE,
plot_title = "Customer Segmentation Analysis",
x_label = "Annual Spending ($)",
y_label = "Purchase Frequency (per year)"
)
cat("Enhanced Hull Plot Created\n")
## Enhanced Hull Plot Created
cat("Color mapping by preferred category\n")
## Color mapping by preferred category
cat("Point size represents customer satisfaction\n")
## Point size represents customer satisfaction
Clinical Data Visualization
Biomarker Clustering
Hull plots are excellent for visualizing patient subgroups based on biomarker data:
# Create clinical biomarker hull plot
result3 <- ClinicoPath::hullplot(
data = hullplot_clinical_data,
x_var = "biomarker_x",
y_var = "biomarker_y",
group_var = "disease_subtype",
color_var = "treatment_response",
show_statistics = TRUE,
outlier_detection = TRUE,
confidence_ellipses = TRUE,
color_palette = "clinical",
plot_title = "Patient Biomarker Clustering",
x_label = "Biomarker X (ng/mL)",
y_label = "Biomarker Y (units/L)"
)
cat("Clinical Hull Plot Created\n")
## Clinical Hull Plot Created
## Disease subtypes: 4
cat("Includes statistics and outlier detection\n")
## Includes statistics and outlier detection
Advanced Clinical Analysis
# Advanced clinical analysis with multiple features
result4 <- ClinicoPath::hullplot(
data = hullplot_clinical_data,
x_var = "biomarker_x",
y_var = "biomarker_y",
group_var = "disease_subtype",
color_var = "hospital",
size_var = "severity_score",
hull_concavity = 1.5,
hull_alpha = 0.4,
show_labels = TRUE,
show_statistics = TRUE,
color_palette = "viridis",
plot_theme = "clinical",
plot_title = "Multi-Hospital Biomarker Analysis",
x_label = "Primary Biomarker",
y_label = "Secondary Biomarker"
)
cat("Advanced Clinical Analysis Created\n")
## Advanced Clinical Analysis Created
cat("Multi-dimensional visualization with hospital comparison\n")
## Multi-dimensional visualization with hospital comparison
Experimental Data Analysis
Treatment Response Visualization
Visualize experimental treatment responses with hull plots:
# Create experimental response hull plot
result5 <- ClinicoPath::hullplot(
data = hullplot_experimental_data,
x_var = "response_x",
y_var = "response_y",
group_var = "condition",
color_var = "timepoint",
size_var = "dose_level",
confidence_ellipses = TRUE,
hull_alpha = 0.25,
color_palette = "set1",
plot_title = "Treatment Response Analysis",
x_label = "Primary Response Measure",
y_label = "Secondary Response Measure"
)
cat("Experimental Hull Plot Created\n")
## Experimental Hull Plot Created
## Treatment conditions: 4
## Time points analyzed: 4
Quality Control Visualization
# Quality control analysis
result6 <- ClinicoPath::hullplot(
data = hullplot_experimental_data,
x_var = "response_x",
y_var = "response_y",
group_var = "batch",
color_var = "condition",
outlier_detection = TRUE,
show_statistics = TRUE,
color_palette = "dark2",
plot_title = "Experimental Batch Quality Control",
x_label = "Response Variable 1",
y_label = "Response Variable 2"
)
cat("Quality Control Analysis Created\n")
## Quality Control Analysis Created
cat("Batch variation assessment with outlier detection\n")
## Batch variation assessment with outlier detection
Survey Data Analysis
Political Orientation Mapping
Hull plots can effectively visualize survey data and political orientations:
# Create political orientation hull plot
result7 <- ClinicoPath::hullplot(
data = hullplot_survey_data,
x_var = "economic_conservatism",
y_var = "social_conservatism",
group_var = "political_orientation",
color_var = "region",
size_var = "institutional_trust",
hull_concavity = 1.8,
show_labels = TRUE,
color_palette = "viridis",
plot_title = "Political Orientation by Geographic Region",
x_label = "Economic Conservatism (0-100)",
y_label = "Social Conservatism (0-100)"
)
cat("Political Survey Hull Plot Created\n")
## Political Survey Hull Plot Created
## Political orientations: 4
## Geographic regions: 3
Demographic Analysis
# Demographic clustering analysis
result8 <- ClinicoPath::hullplot(
data = hullplot_survey_data,
x_var = "economic_conservatism",
y_var = "social_conservatism",
group_var = "education",
color_var = "income_level",
show_statistics = TRUE,
hull_alpha = 0.35,
color_palette = "set2",
plot_theme = "minimal",
plot_title = "Political Views by Education and Income",
x_label = "Economic Conservatism",
y_label = "Social Conservatism"
)
cat("Demographic Analysis Created\n")
## Demographic Analysis Created
cat("Education levels analyzed with income stratification\n")
## Education levels analyzed with income stratification
Manufacturing Quality Control
Quality Group Visualization
Hull plots are valuable for quality control and manufacturing analysis:
# Create quality control hull plot
result9 <- ClinicoPath::hullplot(
data = hullplot_quality_data,
x_var = "dimensional_accuracy",
y_var = "surface_finish",
group_var = "quality_group",
color_var = "material_grade",
show_statistics = TRUE,
outlier_detection = TRUE,
color_palette = "clinical",
plot_title = "Manufacturing Quality Control Analysis",
x_label = "Dimensional Accuracy (%)",
y_label = "Surface Finish Quality (%)"
)
cat("Quality Control Hull Plot Created\n")
## Quality Control Hull Plot Created
## Quality groups: 4
## Material grades: 3
Production Line Analysis
# Production line comparison
result10 <- ClinicoPath::hullplot(
data = hullplot_quality_data,
x_var = "dimensional_accuracy",
y_var = "surface_finish",
group_var = "production_line",
color_var = "shift",
size_var = "defect_count",
hull_alpha = 0.3,
confidence_ellipses = TRUE,
color_palette = "dark2",
plot_title = "Production Line Performance by Shift",
x_label = "Dimensional Accuracy",
y_label = "Surface Finish"
)
cat("Production Line Analysis Created\n")
## Production Line Analysis Created
cat("Production shift comparison with defect tracking\n")
## Production shift comparison with defect tracking
Customization Options
Hull Appearance Customization
# Demonstrate hull appearance options
result11 <- ClinicoPath::hullplot(
data = hullplot_customer_data,
x_var = "annual_spending",
y_var = "purchase_frequency",
group_var = "segment",
hull_concavity = 0.5, # More concave hulls
hull_alpha = 0.6, # More opaque
hull_expand = 0.15, # Larger hulls
point_size = 3, # Larger points
point_alpha = 0.8, # More opaque points
show_labels = TRUE,
plot_title = "Customized Hull Appearance"
)
cat("Hull Customization Demonstrated\n")
## Hull Customization Demonstrated
cat("Features: concave hulls, custom transparency, expanded boundaries\n")
## Features: concave hulls, custom transparency, expanded boundaries
Color Palette Comparison
# Demonstrate different color palettes
palettes_to_test <- c("viridis", "clinical", "set1", "set2", "dark2")
for (palette in palettes_to_test[1:3]) { # Test first 3 for brevity
result <- ClinicoPath::hullplot(
data = hullplot_clinical_data,
x_var = "biomarker_x",
y_var = "biomarker_y",
group_var = "disease_subtype",
color_palette = palette,
plot_title = paste("Color Palette:", palette)
)
cat("Tested palette:", palette, "\n")
}
## Tested palette: viridis
## Tested palette: clinical
## Tested palette: set1
cat("Color palette comparison completed\n")
## Color palette comparison completed
Theme Variations
# Demonstrate different plot themes
themes_to_test <- c("minimal", "classic", "clinical")
for (theme in themes_to_test) {
result <- ClinicoPath::hullplot(
data = hullplot_survey_data,
x_var = "economic_conservatism",
y_var = "social_conservatism",
group_var = "political_orientation",
plot_theme = theme,
plot_title = paste("Theme:", theme)
)
cat("Applied theme:", theme, "\n")
}
## Applied theme: minimal
## Applied theme: classic
## Applied theme: clinical
cat("Theme variations demonstrated\n")
## Theme variations demonstrated
Statistical Features
Group Statistics Analysis
# Comprehensive group statistics
result12 <- ClinicoPath::hullplot(
data = hullplot_clinical_data,
x_var = "biomarker_x",
y_var = "biomarker_y",
group_var = "disease_subtype",
show_statistics = TRUE,
plot_title = "Biomarker Analysis with Group Statistics"
)
cat("Group Statistics Analysis Created\n")
## Group Statistics Analysis Created
cat("Includes mean, SD, and sample sizes for each group\n")
## Includes mean, SD, and sample sizes for each group
Outlier Detection
# Outlier detection analysis
result13 <- ClinicoPath::hullplot(
data = hullplot_quality_data,
x_var = "dimensional_accuracy",
y_var = "surface_finish",
group_var = "quality_group",
outlier_detection = TRUE,
plot_title = "Quality Control with Outlier Detection"
)
cat("Outlier Detection Analysis Created\n")
## Outlier Detection Analysis Created
cat("Uses IQR method to identify potential outliers within groups\n")
## Uses IQR method to identify potential outliers within groups
Confidence Ellipses
# Confidence ellipses analysis
result14 <- ClinicoPath::hullplot(
data = hullplot_experimental_data,
x_var = "response_x",
y_var = "response_y",
group_var = "condition",
confidence_ellipses = TRUE,
hull_alpha = 0.2, # Lower alpha to see ellipses better
plot_title = "Treatment Effects with Confidence Ellipses"
)
cat("Confidence Ellipses Analysis Created\n")
## Confidence Ellipses Analysis Created
cat("95% confidence ellipses show statistical boundaries\n")
## 95% confidence ellipses show statistical boundaries
Advanced Applications
Multi-Variable Analysis
# Complex multi-variable analysis
result15 <- ClinicoPath::hullplot(
data = hullplot_clinical_data,
x_var = "biomarker_x",
y_var = "biomarker_y",
group_var = "disease_subtype",
color_var = "treatment_response",
size_var = "severity_score",
show_statistics = TRUE,
outlier_detection = TRUE,
confidence_ellipses = TRUE,
hull_concavity = 1.8,
hull_alpha = 0.3,
color_palette = "viridis",
plot_theme = "clinical",
plot_title = "Comprehensive Biomarker Analysis",
x_label = "Primary Biomarker (ng/mL)",
y_label = "Secondary Biomarker (units/L)"
)
cat("Multi-Variable Analysis Created\n")
## Multi-Variable Analysis Created
cat("Integrates grouping, coloring, sizing, statistics, and outliers\n")
## Integrates grouping, coloring, sizing, statistics, and outliers
Time Series Clustering
# Time-based experimental analysis
result16 <- ClinicoPath::hullplot(
data = hullplot_experimental_data,
x_var = "response_x",
y_var = "response_y",
group_var = "timepoint",
color_var = "condition",
size_var = "dose_level",
hull_alpha = 0.25,
show_labels = TRUE,
color_palette = "set1",
plot_title = "Treatment Response Over Time",
x_label = "Primary Response",
y_label = "Secondary Response"
)
cat("Time Series Clustering Created\n")
## Time Series Clustering Created
cat("Shows treatment evolution across timepoints\n")
## Shows treatment evolution across timepoints
Real-World Applications
Customer Segmentation for Marketing
# Professional customer segmentation
result17 <- ClinicoPath::hullplot(
data = hullplot_customer_data,
x_var = "annual_spending",
y_var = "purchase_frequency",
group_var = "segment",
color_var = "loyalty_member",
size_var = "satisfaction",
show_statistics = TRUE,
hull_alpha = 0.3,
color_palette = "clinical",
plot_theme = "minimal",
plot_title = "Customer Segmentation Strategy",
x_label = "Annual Spending ($USD)",
y_label = "Purchase Frequency (per year)"
)
cat("Marketing Application Created\n")
## Marketing Application Created
cat("Professional customer segmentation for business strategy\n")
## Professional customer segmentation for business strategy
Clinical Research Publication
# Publication-ready clinical plot
result18 <- ClinicoPath::hullplot(
data = hullplot_clinical_data,
x_var = "biomarker_x",
y_var = "biomarker_y",
group_var = "disease_subtype",
color_var = "treatment_response",
show_statistics = TRUE,
confidence_ellipses = TRUE,
hull_alpha = 0.25,
color_palette = "clinical",
plot_theme = "clinical",
plot_title = "Biomarker Expression in Disease Subtypes",
x_label = "Biomarker A Expression (ng/mL)",
y_label = "Biomarker B Expression (units/L)"
)
cat("Clinical Publication Plot Created\n")
## Clinical Publication Plot Created
cat("Publication-ready format for medical journals\n")
## Publication-ready format for medical journals
Quality Assurance Dashboard
# Quality assurance visualization
result19 <- ClinicoPath::hullplot(
data = hullplot_quality_data,
x_var = "dimensional_accuracy",
y_var = "surface_finish",
group_var = "quality_group",
color_var = "production_line",
outlier_detection = TRUE,
show_statistics = TRUE,
hull_alpha = 0.4,
color_palette = "dark2",
plot_title = "Production Quality Dashboard",
x_label = "Dimensional Accuracy (%)",
y_label = "Surface Finish Quality (%)"
)
cat("Quality Assurance Dashboard Created\n")
## Quality Assurance Dashboard Created
cat("Real-time quality monitoring visualization\n")
## Real-time quality monitoring visualization
Best Practices and Tips
Variable Selection Guidelines
- X and Y Variables: Choose continuous variables that represent meaningful dimensions
- Grouping Variable: Use categorical variables with 2-8 groups for best visualization
- Color Variable: Optional, but useful for additional categorical information
- Size Variable: Best with continuous variables representing importance or magnitude
Hull Customization
- Concavity: Lower values (0-1) create more detailed, concave hulls; higher values (1-2) create simpler shapes
- Alpha: 0.2-0.4 typically works well for overlapping groups
- Expansion: 0.05-0.1 provides good boundary padding without excessive whitespace
Color and Theme Selection
- Clinical Data: Use “clinical” or “viridis” palettes for professional appearance
- Business Data: “set1” or “set2” palettes work well for presentations
- Publications: “clinical” theme with “viridis” palette recommended
# Demonstrate best practices
result20 <- ClinicoPath::hullplot(
data = hullplot_clinical_data,
x_var = "biomarker_x",
y_var = "biomarker_y",
group_var = "disease_subtype",
color_var = "treatment_response",
size_var = "severity_score",
hull_concavity = 1.5, # Balanced shape
hull_alpha = 0.3, # Good transparency
hull_expand = 0.08, # Appropriate padding
show_labels = TRUE,
show_statistics = TRUE,
confidence_ellipses = TRUE,
color_palette = "viridis", # Professional colors
plot_theme = "clinical", # Clean theme
point_alpha = 0.7, # Visible but not overwhelming
plot_title = "Best Practices Example: Clinical Biomarker Analysis",
x_label = "Primary Biomarker Expression",
y_label = "Secondary Biomarker Expression"
)
cat("Best Practices Example Created\n")
## Best Practices Example Created
cat("Professional formatting with comprehensive features\n")
## Professional formatting with comprehensive features
Summary
The hullplot
function in ClinicoPath provides powerful
visualization capabilities for group analysis and cluster
identification. Key advantages include:
- Visual Clarity: Hull polygons clearly show group boundaries and overlaps
- Multi-Dimensional: Support for grouping, coloring, and sizing variables
- Statistical Features: Group statistics, outlier detection, and confidence intervals
- Customization: Extensive options for colors, themes, and appearance
- Professional Output: Publication-ready plots for research and business
- Versatile Applications: Customer segmentation, clinical research, quality control, survey analysis
This function is particularly valuable for: - Customer segmentation and market research - Clinical patient subgroup identification - Experimental condition comparison - Quality control and manufacturing analysis - Survey data visualization - Scientific research publications
The hull plot visualization method is excellent for presentations and publications as it clearly communicates group structures and relationships in complex datasets.
This vignette demonstrates the comprehensive capabilities of the hullplot function in ClinicoPath. Hull plots provide an intuitive and powerful way to visualize group structures in multidimensional data, making them essential tools for data analysis and presentation.