Selects and generates a statistical visualization suited to the data types of the supplied variables. Error messages include contextual guidance, input data are validated, and fallback options are available. Supports both independent and repeated measures designs, with plot types including violin plots, scatter plots, bar charts, and alluvial diagrams.
Usage
statsplot2(
  data,
  dep,
  group,
  grvar = NULL,
  direction = "independent",
  distribution = "p",
  alluvsty = "t1",
  excl = FALSE,
  sampleLarge = TRUE
)
Arguments
- data
The data as a data frame.
- dep
The dependent variable (y-axis, 1st measurement). Can be continuous or categorical.
- group
The grouping variable (x-axis, 2nd measurement). Can be continuous or categorical.
- grvar
Optional grouping variable for creating grouped plots across multiple panels.
- direction
Measurement design type. "independent" for between-subjects comparisons, "repeated" for within-subjects/repeated measures comparisons.
- distribution
Statistical approach: "p" = parametric, "np" = nonparametric, "r" = robust, "bf" = Bayes factor.
- alluvsty
Style for alluvial diagrams: "t1" = ggalluvial with stratum labels, "t2" = easyalluvial with automatic variable selection.
- excl
If TRUE, excludes rows with missing values before analysis.
- sampleLarge
If TRUE, automatically samples large datasets (>10,000 rows) to 5,000 rows for improved performance.
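The effect of excl and sampleLarge can be pictured with a small base R sketch. This is an illustration under stated assumptions, not the package's implementation: prepare_data() and its vars, threshold, and target arguments are hypothetical names, and the sketch assumes that missing values are checked only in the analysed variables and that sampling is simple random sampling without replacement.

```r
# Hypothetical helper, not part of the package: mimics the documented
# behaviour of `excl` and `sampleLarge` using base R only.
prepare_data <- function(data, vars, excl = FALSE, sampleLarge = TRUE,
                         threshold = 10000, target = 5000) {
  if (excl) {
    # Assumption: drop rows with a missing value in any analysed variable
    keep <- stats::complete.cases(data[, vars, drop = FALSE])
    data <- data[keep, , drop = FALSE]
  }
  if (sampleLarge && nrow(data) > threshold) {
    # Assumption: simple random sample of rows, without replacement
    data <- data[sample.int(nrow(data), target), , drop = FALSE]
  }
  data
}

set.seed(1)
big <- data.frame(
  x = rnorm(20000),
  g = rep(c("a", "b", NA, "a"), 5000)  # one in four rows has a missing group
)
small <- prepare_data(big, vars = c("x", "g"), excl = TRUE)
nrow(small)  # 15000 complete rows exceed the threshold, so 5000 are kept
```

Setting a seed before the call, as above, makes the sampled subset reproducible.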
Value
A results object containing:
results$todo | an HTML element
results$ExplanationMessage | a preformatted text block
results$plot | an image
Examples
# Automatic plot selection: cyl is stored as numeric in mtcars,
# so both variables are treated as continuous
statsplot2(
  data = mtcars,
  dep = "mpg",
  group = "cyl",
  direction = "independent",
  distribution = "p"
)
#>
#> AUTOMATIC PLOT SELECTION BASED ON VARIABLE TYPES
#> You have selected to use a scatter plot to examine the relationship between cyl (continuous) and mpg (continuous).
#>
#> Clinical Interpretation: This scatter plot examines the linear relationship between cyl and mpg. The trend line shows the association with confidence bands (gray area). Positive slopes indicate that higher cyl values are associated with higher mpg values.
#>
#> Parametric Approach: Assumes normally distributed data. Best for continuous variables with bell-shaped distributions.
#>
#> ⚠️ STATISTICAL WARNINGS:
#> With smaller samples, check data distribution visually. Consider nonparametric approach if data appears skewed.
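Because the plot is chosen from the variable types, the scatter plot in the output above follows from cyl being numeric. A minimal sketch of recoding it as a factor first is shown below. mtcars_f is a name introduced here for illustration, the statsplot2() call is guarded so the snippet runs even when the function is not loaded, and which plot the function then selects is not shown because it is not documented in this section.

```r
# cyl is numeric in mtcars, which is why it is reported as continuous
class(mtcars$cyl)

# Work on a copy and recode cyl as a factor to mark it as categorical
mtcars_f <- mtcars
mtcars_f$cyl <- factor(mtcars_f$cyl)
class(mtcars_f$cyl)
levels(mtcars_f$cyl)

# Only call statsplot2() if it is available in the current session
if (exists("statsplot2", mode = "function")) {
  statsplot2(
    data = mtcars_f,
    dep = "mpg",
    group = "cyl",
    direction = "independent",
    distribution = "p"
  )
}
```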
# Repeated measures with alluvial diagram
# (survey_data is a placeholder: substitute a data frame that
# contains both columns)
statsplot2(
  data = survey_data,
  dep = "condition_baseline",
  group = "condition_followup",
  direction = "repeated",
  alluvsty = "t1"
)
#> Error: object 'survey_data' not found
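The error above occurs because no object named survey_data exists in the session. A small stand-in data frame can be built as follows. The values are invented purely for illustration. The cross-tabulation shows the baseline-to-follow-up transitions that an alluvial diagram would draw as flows.

```r
# Hypothetical stand-in for survey_data: one row per subject, with the
# same categorical measure recorded at baseline and at follow-up
survey_data <- data.frame(
  condition_baseline = factor(
    c("mild", "mild", "severe", "moderate", "severe", "mild"),
    levels = c("mild", "moderate", "severe")
  ),
  condition_followup = factor(
    c("mild", "moderate", "moderate", "moderate", "severe", "mild"),
    levels = c("mild", "moderate", "severe")
  )
)

# Counts of subjects moving from each baseline state to each follow-up state
transitions <- table(
  baseline = survey_data$condition_baseline,
  followup = survey_data$condition_followup
)
transitions
```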
# Enhanced error messages provide contextual guidance:
# - Variable names and types in error messages
# - Specific data requirement feedback
# - Package installation instructions when needed
# - Actionable suggestions for unsupported combinations
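The kind of message described above can be sketched with a small validator. check_vars() is a hypothetical function written for this illustration and is not the package's own code. It only shows how an error can name the offending variables and list what is available.

```r
# Hypothetical validator illustrating a contextual error message
check_vars <- function(data, dep, group) {
  missing_vars <- setdiff(c(dep, group), names(data))
  if (length(missing_vars) > 0) {
    stop(
      "Variable(s) not found in data: ",
      paste(missing_vars, collapse = ", "),
      ". Available variables: ",
      paste(names(data), collapse = ", "),
      call. = FALSE
    )
  }
  # Return the first class of each variable, named by variable
  vapply(c(dep, group), function(v) class(data[[v]])[1], character(1))
}

# Valid input: returns the type of each variable
types <- check_vars(mtcars, "mpg", "cyl")

# Invalid input: capture the message instead of stopping the script
msg <- tryCatch(
  check_vars(mtcars, "mpg", "cylinders"),
  error = function(e) conditionMessage(e)
)
msg
```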