Automatically selects and generates a statistical visualization suited to the data types of the supplied variables. Error messages include contextual guidance, input data are validated before plotting, and fallback options are available. Both independent and repeated measures designs are supported. Plot types include violin plots, scatter plots, bar charts, and alluvial diagrams.
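To make the idea of type-based selection concrete, here is a minimal sketch in base R. It is not the package's implementation: `classify_var()` and `guess_plot_type()` are hypothetical helpers, and the mapping from variable types to plot types is an assumption for illustration. Only the continuous vs continuous case (scatter plot) is confirmed by the example output on this page.

```r
# Hypothetical helper: label a column as continuous or categorical.
classify_var <- function(x) {
  if (is.numeric(x)) {
    "continuous"
  } else if (is.factor(x) || is.character(x) || is.logical(x)) {
    "categorical"
  } else {
    "unknown"
  }
}

# Hypothetical helper: pick a plot type from the design and the two
# variable types. The mapping below is an assumption, not the package's
# documented behaviour.
guess_plot_type <- function(data, dep, group, direction = "independent") {
  dep_type <- classify_var(data[[dep]])
  group_type <- classify_var(data[[group]])
  key <- paste(direction, group_type, dep_type, sep = "_")
  switch(key,
    independent_categorical_continuous = "violin plot",
    independent_continuous_continuous = "scatter plot",
    independent_categorical_categorical = "bar chart",
    repeated_categorical_continuous = "violin plot",
    repeated_categorical_categorical = "alluvial diagram",
    "no plot chosen in this sketch"
  )
}

# Both mpg and cyl are numeric in mtcars, so the sketch picks a scatter plot.
guess_plot_type(mtcars, dep = "mpg", group = "cyl")
```

Encoding the decision as a lookup on a single key keeps every supported combination visible in one place, and the final unnamed `switch()` branch acts as the fallback for anything else.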

Usage

statsplot2(
  data,
  dep,
  group,
  grvar = NULL,
  direction = "independent",
  distribution = "p",
  alluvsty = "t1",
  excl = FALSE,
  sampleLarge = TRUE
)

Arguments

data

The data as a data frame.

dep

The dependent variable (y-axis, 1st measurement). Can be continuous or categorical.

group

The grouping variable (x-axis, 2nd measurement). Can be continuous or categorical.

grvar

Optional grouping variable for creating grouped plots across multiple panels.

direction

Measurement design type. "independent" for between-subjects comparisons, "repeated" for within-subjects/repeated measures comparisons.

distribution

Statistical approach: "p" = parametric, "np" = nonparametric, "r" = robust, "bf" = Bayes factor.

alluvsty

Style for alluvial diagrams: "t1" = ggalluvial with stratum labels, "t2" = easyalluvial with automatic variable selection.

excl

If TRUE, excludes rows with missing values before analysis.

sampleLarge

If TRUE, automatically samples large datasets (>10,000 rows) to 5,000 rows for improved performance.
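The effect of `excl` and `sampleLarge` can be approximated in base R. The sketch below is an assumption about the behaviour described above, not the package's code: `prepare_data()` is a hypothetical helper, and both the order of the two steps and the choice to drop rows with a missing value in any column are guesses.

```r
# Hypothetical helper mimicking the documented excl and sampleLarge options.
# threshold and target mirror the numbers quoted in the argument description.
prepare_data <- function(data, excl = FALSE, sampleLarge = TRUE,
                         threshold = 10000, target = 5000) {
  if (excl) {
    # Assumption: drop rows with a missing value in any column.
    data <- data[stats::complete.cases(data), , drop = FALSE]
  }
  if (sampleLarge && nrow(data) > threshold) {
    # Simple random sample without replacement, original row order kept.
    keep <- sort(sample.int(nrow(data), size = target))
    data <- data[keep, , drop = FALSE]
  }
  data
}

set.seed(1)
big <- data.frame(
  x = rnorm(12000),
  g = sample(c("a", "b", NA), 12000, replace = TRUE)
)

nrow(prepare_data(big))                       # sampled down to 5000 rows
nrow(prepare_data(big, sampleLarge = FALSE))  # all 12000 rows kept
anyNA(prepare_data(big, excl = TRUE))         # FALSE: incomplete rows removed
```

Because sampling is random, results can differ between runs unless a seed is set beforehand.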

Value

A results object containing:

results$todo: HTML
results$ExplanationMessage: preformatted text
results$plot: an image

Examples

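# Automatic plot selection based on variable types.
# Note: cyl is stored as numeric in mtcars, so both variables are
# treated as continuous and a scatter plot is selected.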
statsplot2(
    data = mtcars,
    dep = "mpg",
    group = "cyl",
    direction = "independent",
    distribution = "p"
)
#> 
#>  AUTOMATIC PLOT SELECTION BASED ON VARIABLE TYPES
#> You have selected to use a scatter plot to examine the relationship between cyl (continuous) and mpg (continuous).
#> 
#> Clinical Interpretation: This scatter plot examines the linear relationship between cyl and mpg. The trend line shows the association with confidence bands (gray area). Positive slopes indicate that higher cyl values are associated with higher mpg values.
#> 
#> Parametric Approach: Assumes normally distributed data. Best for continuous variables with bell-shaped distributions.
#> 
#> ⚠️ STATISTICAL WARNINGS:
#> With smaller samples, check data distribution visually. Consider nonparametric approach if data appears skewed.
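The output above reports `cyl` as continuous because `mtcars` stores it as a numeric column. If a group comparison is wanted instead, one option is to convert the column to a factor before calling the function. The sketch below only prepares the data. The commented call assumes that `statsplot2()` treats factor columns as categorical, which this page does not confirm.

```r
# cyl is numeric in the built-in mtcars data set.
class(mtcars$cyl)

# Work on a copy so the original data set stays unchanged.
mtcars_cat <- mtcars
mtcars_cat$cyl <- factor(mtcars_cat$cyl)

class(mtcars_cat$cyl)   # "factor"
levels(mtcars_cat$cyl)  # "4" "6" "8"

# Assumed usage, not run here:
# statsplot2(data = mtcars_cat, dep = "mpg", group = "cyl")
```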


# Repeated measures with alluvial diagram
# (survey_data is a placeholder; substitute your own data frame)
statsplot2(
    data = survey_data,
    dep = "condition_baseline",
    group = "condition_followup",
    direction = "repeated",
    alluvsty = "t1"
)
#> Error: object 'survey_data' not found
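The error above occurs because `survey_data` is not defined in the session. To try the repeated measures example, a data frame with two categorical columns is needed. The sketch below builds a synthetic one. The values and the level names are made up for illustration and carry no clinical meaning.

```r
set.seed(42)

# Hypothetical severity levels, used only for this illustration.
cond_levels <- c("mild", "moderate", "severe")

# Synthetic stand-in for survey_data: one row per subject, with the same
# categorical variable measured at baseline and at follow-up.
survey_data <- data.frame(
  condition_baseline = factor(
    sample(cond_levels, 60, replace = TRUE), levels = cond_levels
  ),
  condition_followup = factor(
    sample(cond_levels, 60, replace = TRUE), levels = cond_levels
  )
)

# The cross-tabulation holds the baseline-to-follow-up transition counts
# that an alluvial diagram would draw as flows.
table(survey_data$condition_baseline, survey_data$condition_followup)
```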

# Enhanced error messages provide contextual guidance:
# - Variable names and types in error messages
# - Specific data requirement feedback
# - Package installation instructions when needed
# - Actionable suggestions for unsupported combinations