Automated exploratory data analysis (EDA) built on the DataExplorer package. This module provides a dataset overview, missing value analysis, correlation matrices, PCA visualization, and automated reporting. It is based on the autoEDA research of Staniak & Biecek (The R Journal, 2019).

Usage

autoeda(
  data,
  vars,
  analysis_type = "overview",
  target_var,
  include_plots = TRUE,
  missing_threshold = 5,
  correlation_method = "pearson",
  pca_components = 5,
  plot_theme = "clinical",
  output_format = "combined",
  eda_engine = "dataexplorer",
  advanced_options = FALSE,
  categorical_limit = 15,
  generate_report = FALSE
)

Arguments

data

The data as a data frame.

vars

Variables to include in the automated exploratory data analysis. All variable types are supported.

analysis_type

Type of automated EDA to perform. The default is "overview".

target_var

Optional target variable for supervised EDA. When supplied, it is used for the target vs predictors analysis.

include_plots

Include automated plots and visualizations in the output.

missing_threshold

Threshold percentage for highlighting variables with missing values.
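
To illustrate what a missing-value threshold of this kind does, here is a minimal base R sketch. It is not the module's internal code: `flag_missing()` and the `toy` data frame are hypothetical, and flagging variables strictly above the threshold is an assumption.

```r
# Hypothetical helper (not part of the package): percent missing per column,
# with a flag for columns whose missingness exceeds the threshold.
flag_missing <- function(data, missing_threshold = 5) {
  pct_missing <- colMeans(is.na(data)) * 100
  data.frame(
    variable = names(pct_missing),
    pct_missing = unname(pct_missing),
    flagged = unname(pct_missing > missing_threshold),
    stringsAsFactors = FALSE
  )
}

# Made-up example data: 1 of 10 ages and 2 of 10 stages are missing.
toy <- data.frame(
  age = c(34, NA, 51, 47, 29, 62, 58, 40, 36, 45),
  stage = c("I", "II", NA, NA, "III", "I", "II", "II", "I", "III"),
  sex = c("F", "M", "F", "M", "F", "M", "F", "M", "F", "M")
)

missing_summary <- flag_missing(toy, missing_threshold = 15)
print(missing_summary)
```

With a threshold of 15, only `stage` is flagged in this toy data; lowering the threshold to 5 would also flag `age`.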

correlation_method

Method for correlation analysis.
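
As a rough illustration of how the choice of method changes a correlation matrix, the sketch below uses base R's `cor()` on the built-in `mtcars` data. Only "pearson" appears in the usage above; whether the module also accepts "spearman" is an assumption made here for comparison.

```r
# Three numeric columns from the built-in mtcars data set.
numeric_vars <- mtcars[, c("mpg", "hp", "wt")]

# Pearson measures linear association.
cor_pearson <- cor(numeric_vars, method = "pearson",
                   use = "pairwise.complete.obs")

# Spearman is rank-based (assumed here only for comparison).
cor_spearman <- cor(numeric_vars, method = "spearman",
                    use = "pairwise.complete.obs")

print(round(cor_pearson, 2))
print(round(cor_spearman, 2))
```

`use = "pairwise.complete.obs"` keeps every pair of observations that is complete for the two variables being correlated, which matters when the data contain missing values.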

pca_components

Number of principal components to display in PCA analysis.
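
The effect of limiting the number of displayed components can be sketched with base R's `prcomp()`. This is an illustration under assumptions, not the module's implementation: the variable selection and the choice to center and scale are made up for the example.

```r
# Number of components to keep for display (the module's default is 5).
pca_components <- 2

# PCA on five numeric columns of the built-in mtcars data,
# centered and scaled so each variable contributes equally.
pca <- prcomp(mtcars[, c("mpg", "disp", "hp", "wt", "qsec")],
              center = TRUE, scale. = TRUE)

# Never request more components than the PCA produced.
n_keep <- min(pca_components, ncol(pca$x))

# Scores on the retained components, one row per observation.
scores <- pca$x[, seq_len(n_keep), drop = FALSE]

# Proportion of total variance explained by each retained component.
var_explained <- (pca$sdev^2 / sum(pca$sdev^2))[seq_len(n_keep)]

print(head(scores))
print(var_explained)
```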

plot_theme

Visual theme for automated plots.

output_format

Format for the automated EDA output.

eda_engine

Choose the exploratory data analysis engine. DataExplorer provides comprehensive automated reporting. ggEDA provides enhanced visualizations for clinical research.

advanced_options

Enable advanced features and detailed analysis.

categorical_limit

Maximum number of levels for categorical variables to include in analysis.
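
A level limit of this kind typically keeps identifier-like columns out of categorical summaries. The following base R sketch shows one way such a filter could work; `keep_low_cardinality()` and the `toy` data are hypothetical, and whether the module drops or merely skips high-cardinality variables is not stated here.

```r
# Hypothetical helper (not part of the package): keep numeric columns and
# categorical columns with at most `categorical_limit` distinct values.
keep_low_cardinality <- function(data, categorical_limit = 15) {
  is_cat <- vapply(data, function(x) is.factor(x) || is.character(x),
                   logical(1))
  n_levels <- vapply(data, function(x) length(unique(x[!is.na(x)])),
                     integer(1))
  data[, !is_cat | n_levels <= categorical_limit, drop = FALSE]
}

# Made-up example data: patient_id has 20 distinct values, grade has 2.
toy <- data.frame(
  patient_id = sprintf("P%02d", 1:20),
  grade = rep(c("low", "high"), 10),
  size_mm = seq(5, 100, by = 5),
  stringsAsFactors = FALSE
)

kept <- keep_low_cardinality(toy, categorical_limit = 15)
print(names(kept))
```

In this toy data, `patient_id` exceeds the limit and is excluded, while `grade` and the numeric `size_mm` are retained.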

generate_report

Generate a comprehensive automated EDA report with all analyses.

Value

A results object containing:

results$todo: Welcome message and instructions
results$overview: Comprehensive dataset introduction and summary
results$missing_analysis: Missing value patterns and recommendations
results$distributions: Univariate distribution analysis and plots
results$correlation_analysis: Correlation matrices and relationship analysis
results$pca_analysis: PCA results and component visualization
results$target_analysis: Target vs predictors relationship analysis
results$comprehensive_report: Complete automated exploratory data analysis report
results$plots: Collection of automated plots and charts
results$recommendations: Automated recommendations for further analysis

Examples

# \donttest{
# Example:
# 1. Load your data frame.
# 2. Select variables for analysis.
# 3. Choose analysis type (overview, missing, correlation, etc.)
# 4. Configure output options.
# 5. Run autoeda module for comprehensive automated analysis.
# }