Creates river plots, alluvial diagrams, and stream charts to visualize flows, transitions, and temporal changes in categorical data. Supports individual entity tracking, aggregate trends, and multi-stage pathways. Suited to patient journeys, treatment progressions, categorical changes over time, and flow analysis.

Usage

riverplot(
  data,
  id = NULL,
  time = NULL,
  strata = list(),
  weight = NULL,
  plotType = "alluvial",
  fillType = "first",
  dataFormat = "auto",
  sortStreams = TRUE,
  labelNodes = TRUE,
  curveType = "cubic",
  showCounts = FALSE,
  showPercentages = FALSE,
  showLegend = TRUE,
  legendPosition = "right",
  colorScheme = "default",
  customColors = "",
  nodeWidth = 0.1,
  nodeGap = 0.05,
  flowAlpha = 0.7,
  mytitle = "",
  xtitle = "",
  ytitle = "",
  originaltheme = FALSE,
  fontSize = 12,
  plotWidth = 10,
  plotHeight = 8,
  enableDiagnostics = FALSE,
  edgeGradient = TRUE,
  nodeStyle = "regular",
  edgeStyle = "gradient",
  gravity = "center",
  addMidPoints = FALSE,
  reorderEdges = FALSE,
  curveGranularity = 100,
  backgroundLabels = FALSE,
  exportRiverplotObject = FALSE,
  clinicalPreset = "none",
  smart_detection = TRUE,
  enhanced_validation = TRUE,
  multi_format_support = FALSE,
  adaptive_styling = TRUE,
  quality_optimization = FALSE,
  cross_reference_mode = FALSE,
  high_contrast_mode = FALSE,
  large_dataset_mode = FALSE,
  enable_caching = TRUE,
  detailed_progress = FALSE
)

Arguments

data

The data as a data frame.

id

Optional identifier for tracking individual entities (patients, cases, customers) through transitions. Required for individual-level flow tracking. Leave empty for aggregate flow visualization.

time

Variable representing time points or sequential stages (visit dates, follow-up periods). Required for longitudinal data with a single strata variable. Leave empty when using wide-format data with multiple strata variables.

strata

Variables containing the categories that flow between stages. For longitudinal data: single variable with categories (treatment_response, disease_stage). For wide format: multiple variables representing sequential stages (baseline_status, month3_status, month6_status). Each should be a factor with meaningful category names.

weight

Optional numeric variable that sets flow width in proportion to its values. Examples: patient count, cost, severity score, revenue. If not provided, every row contributes equally, so flows represent counts/frequencies. Use it to emphasize the quantitative importance of different pathways.
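The difference between unweighted and weighted flows can be illustrated in base R. This is only a sketch: the data frame `pathways` and its columns are hypothetical, and the tabulation shown here is not necessarily how riverplot computes flow widths internally.

```r
# Hypothetical pre-aggregated data: each row is a pathway with a patient count
pathways <- data.frame(
  baseline      = c("Stable", "Stable", "Progressed"),
  month3        = c("Stable", "Improved", "Stable"),
  patient_count = c(40, 10, 5)
)

# Unweighted: every row counts once
unweighted <- table(pathways$baseline)

# Weighted: each row contributes its patient_count
weighted <- xtabs(patient_count ~ baseline, data = pathways)

as.numeric(unweighted["Stable"])  # 2 rows start in "Stable"
as.numeric(weighted["Stable"])    # 50 patients start in "Stable"
```

With data like this, passing the count column as `weight` is what lets the two "Stable" rows represent 50 patients rather than 2 observations.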

plotType

Type of flow visualization. 'alluvial' creates flowing streams with curved connections, ideal for transitions. 'sankey' creates directed flow diagrams with straighter connections, good for process flows. 'stream' shows aggregate category trends over time. 'flow' tracks individual entities through time periods.

fillType

Determines how colors are assigned to flows. 'first' colors flows based on initial category (tracking origins). 'last' colors by final category (tracking destinations). 'frequency' uses flow volume for colors (emphasizes major pathways).

dataFormat

Format of input data. 'long' format has rows for each time point with single strata variable. 'wide' format has multiple strata variables as columns representing different stages. 'auto' detects format based on variables provided.
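The two layouts can be illustrated with a small base R sketch. The data frame and column names (`patient_id`, `month`, `status`) are made up for illustration, and the `reshape()` call is just one way to move between layouts; it does not describe how riverplot detects or converts formats.

```r
# Hypothetical long-format data: one row per patient per time point
long_data <- data.frame(
  patient_id = rep(1:3, each = 3),
  month      = rep(c(0, 3, 6), times = 3),
  status     = c("Stable", "Stable", "Improved",
                 "Progressed", "Stable", "Stable",
                 "Stable", "Improved", "Improved"),
  stringsAsFactors = TRUE
)

# Same information in wide format: one row per patient, one column per stage
wide_data <- reshape(
  long_data,
  idvar     = "patient_id",
  timevar   = "month",
  v.names   = "status",
  direction = "wide"
)

names(wide_data)  # "patient_id" "status.0" "status.3" "status.6"
```

Under these assumptions, the long layout would be described with `time = "month"` and `strata = "status"`, while the wide layout would use `strata = c("status.0", "status.3", "status.6")` and no `time` variable.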

sortStreams

Sort alluvial streams by frequency/size. Larger flows are positioned more prominently, making major pathways easier to identify. Recommended for highlighting dominant patterns.

labelNodes

Display category labels on nodes (vertical bars). Labels identify what each segment represents. Recommended unless space is limited or labels overlap.

curveType

Shape of connecting curves between stages. 'cardinal' creates smooth flowing curves (most aesthetic). 'sin' creates sinusoidal curves (CRAN riverplot style). 'riverplot' uses the advanced curve generation from CRAN riverplot package.

showCounts

Display numerical counts or values on nodes. Shows exact frequencies or sums for each category at each stage. Useful for precise interpretation but may clutter visualization with many categories.

showPercentages

Display percentages on nodes or flows. Shows relative proportions within each stage or flow. Complements absolute counts for easier interpretation.
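As a rough illustration of "proportions within each stage", the base R sketch below computes row-wise percentages from a stage-by-category table. The data and column names are hypothetical, and the exact rounding and placement used by riverplot may differ.

```r
# Hypothetical long-format data: four patients observed at month 0 and month 3
long_data <- data.frame(
  month  = c(0, 0, 0, 0, 3, 3, 3, 3),
  status = c("Stable", "Stable", "Stable", "Progressed",
             "Stable", "Improved", "Improved", "Progressed")
)

# Counts per stage (rows) and category (columns)
counts <- table(long_data$month, long_data$status)

# Percentages within each stage: every row sums to 100
percentages <- round(100 * prop.table(counts, margin = 1), 1)

percentages["0", "Stable"]    # 75
percentages["3", "Improved"]  # 50
```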

showLegend

Display color legend explaining category mappings. Recommended when colors represent meaningful categories. Can be hidden to save space if category labels are clearly visible.

legendPosition

Position of the color legend relative to the plot.

colorScheme

Color palette for categories. Choose based on data type and aesthetic preference.

customColors

Comma-separated hex colors for a custom color scheme (e.g., "#1f77b4,#ff7f0e,#2ca02c"). Only used when colorScheme is set to the custom option.
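Before passing a color string, it can help to check that it splits into valid hex codes. The following base R sketch does that; the variable names are illustrative, and the 6-digit pattern is an assumption about what counts as a valid entry, not the package's own validation.

```r
# Candidate value for customColors (note the stray space after the first comma)
custom_colors <- "#1f77b4, #ff7f0e,#2ca02c"

# Split on commas and trim surrounding whitespace
parsed <- trimws(strsplit(custom_colors, ",", fixed = TRUE)[[1]])

# Check each entry against a 6-digit hex pattern
is_hex <- grepl("^#[0-9A-Fa-f]{6}$", parsed)
if (!all(is_hex)) {
  stop("Invalid hex color(s): ", paste(parsed[!is_hex], collapse = ", "))
}

# Rebuild a clean comma-separated string
clean_colors <- paste(parsed, collapse = ",")
clean_colors  # "#1f77b4,#ff7f0e,#2ca02c"
```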

nodeWidth

Width of nodes (vertical bars) as proportion of plot width. Adjust for better balance between nodes and flows.

nodeGap

Gap between categories within each node as proportion of node height. Smaller values create more compact nodes.

flowAlpha

Transparency of flow connections (0 = fully transparent, 1 = opaque). Lower values help see overlapping flows.

mytitle

Main title for the plot. Leave empty for no title.

xtitle

Label for x-axis (time/stage axis). Leave empty for default label.

ytitle

Label for y-axis (frequency/count axis). Leave empty for default label.

originaltheme

Use original ggalluvial package theme instead of minimal theme. May provide better default styling for some plot types.

fontSize

Base font size for labels and text in the plot.

plotWidth

Width of the plot in inches for export.

plotHeight

Height of the plot in inches for export.

enableDiagnostics

Display diagnostic information about data processing, flow calculations, and potential issues. Useful for troubleshooting complex datasets.

edgeGradient

Apply color gradients to edges for smoother visual transitions. Based on CRAN riverplot package gradient capabilities.

nodeStyle

Visual style for nodes. 'regular' shows filled rectangles, 'point' shows small circles, 'invisible' hides nodes but maintains connections.

edgeStyle

Visual style for edges. 'gradient' creates smooth color transitions, 'sin' varies edge thickness sinusoidally for aesthetic effect.

gravity

Vertical alignment strategy for nodes within each stage. Affects how nodes are positioned relative to each other.

addMidPoints

Add intermediate invisible nodes to create smoother edge paths. Helps reduce edge crossings and improve visual clarity (CRAN riverplot feature).

reorderEdges

Automatically attempt to disentangle and optimize edge routing to reduce crossings. Uses algorithms from CRAN riverplot package.

curveGranularity

Number of points used to generate smooth curves. Higher values create smoother curves but increase rendering time.

backgroundLabels

Add background boxes to labels for better readability, similar to CRAN riverplot's bglabel functionality.

exportRiverplotObject

Create and display a riverplot-compatible object structure for use with the CRAN riverplot package.

clinicalPreset

Apply predefined configurations optimized for common clinical scenarios. Each preset automatically configures plot type, styling, and display options for the selected clinical use case.

smart_detection

Automatically detect optimal data format and analysis approach based on provided variables.

enhanced_validation

Apply comprehensive data validation with user-friendly error messages and suggestions.

multi_format_support

Enable experimental support for additional data formats (source-target, multi-node).

adaptive_styling

Automatically adapt plot styling based on data characteristics and size.

quality_optimization

Apply advanced algorithms to optimize flow layout, reduce crossings, and improve readability.

cross_reference_mode

Show suggestions for cases where the alluvial function might be more appropriate for the data.

high_contrast_mode

Enable high contrast colors and larger fonts for better accessibility and readability.

large_dataset_mode

Enable memory-efficient processing for datasets with >10,000 observations. May reduce rendering detail for performance.

enable_caching

Cache computation results to improve performance with repeated analysis of the same data.

detailed_progress

Show detailed progress information during long-running operations. Useful for very large datasets.

Value

A results object containing:

results$todo                          a html
results$summary                       a html
results$reportSentence                a html
results$about_analysis                a html
results$plot                          an image
results$diagnostics                   a html
results$flowTable                     a table
results$stageTable                    a table
results$transitionMatrix              a table
results$riverplotObject               a html
results$riverplotCode                 a html
results$validation_report             a html
results$cross_reference_suggestions   a html
results$optimization_report           a html
results$clinical_insights             a html
results$enhanced_caveats              a html

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$flowTable$asDF

as.data.frame(results$flowTable)
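The column layout of `results$transitionMatrix` is not specified here. As a general illustration of what a transition table between two stages contains, the base R sketch below cross-tabulates two hypothetical stage columns and converts the result to a data frame; it is not the package's implementation.

```r
# Hypothetical wide-format data: status at two consecutive stages
wide_data <- data.frame(
  baseline_status = c("Stable", "Stable", "Progressed", "Stable"),
  month3_status   = c("Stable", "Improved", "Stable", "Stable")
)

# Cross-tabulate origin (rows) against destination (columns)
transitions <- table(
  from = wide_data$baseline_status,
  to   = wide_data$month3_status
)

transitions["Stable", "Stable"]  # 2 patients stayed "Stable"

# One row per from/to pair, with the count in column Freq
transitions_df <- as.data.frame(transitions)
```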

Examples

# \donttest{
# Basic temporal flow - Track treatment response changes over time
riverplot(
  data = patient_data,
  time = "follow_up_month",
  strata = "treatment_response",
  plotType = "alluvial",
  labelNodes = TRUE
)

# Individual patient tracking with ID
riverplot(
  data = longitudinal_data,
  id = "patient_id",
  time = "visit_date",
  strata = "disease_stage",
  plotType = "flow",
  fillType = "last",
  showCounts = TRUE
)

# Multi-stage pathway (wide format data)
riverplot(
  data = pathway_data,
  strata = c("baseline_status", "month3_status", "month6_status"),
  plotType = "sankey",
  fillType = "frequency",
  curveType = "cardinal"
)

# Weighted stream chart - Account for patient counts or values
riverplot(
  data = aggregate_data,
  time = "quarter",
  strata = "outcome_category",
  weight = "patient_count",
  plotType = "stream",
  showLegend = TRUE
)
# }