Edge Cases and Robustness Testing Dataset
Source: R/data_toolssummary_docs.R
toolssummary_edge_cases.Rd
Specialized dataset containing edge cases, extreme values, special characters, and other challenging data patterns. It is designed to test the robustness of enhanced summary implementations, their error handling, and their graceful degradation with problematic data, relying on summarytools' comprehensive error management.
Format
A data frame with 150 observations and 12 variables:
- id
Character. Unique identifier (EDGE_001 to EDGE_150)
- scenario
Factor. Test scenario type (1-6)
- numeric_extreme
Numeric. Variable with extreme values, zeros, and systematic missing
- numeric_decimal
Numeric. Variable with high-precision decimals (8 decimal places)
- categorical_many
Factor. Categorical variable with 30 different levels
- categorical_few
Factor. Binary categorical variable ("A", "B")
- categorical_special
Factor. Categories with missing-like values ("N/A", "Missing", "")
- constant_numeric
Numeric. Constant value (all 42) for edge case testing
- constant_factor
Factor. Constant factor (all "SAME") for edge case testing
- text_variable
Character. Text with varying lengths and special characters
- date_variable
Date. Date variable spanning 2020-2025
- binary_numeric
Integer. Binary numeric variable (0, 1)
Details
This dataset is designed to stress-test enhanced summary implementations with edge cases and challenging data patterns that might cause failures in standard analysis approaches. It also tests the robustness and error handling of the summarytools integration.
Scenario Types:
Scenario 1: Normal case (baseline comparison)
Scenario 2: Extreme values (very large positive and negative numbers)
Scenario 3: Very small values (near-zero decimals)
Scenario 4: High-precision decimals and complex numbers
Scenario 5: Zero and edge numeric values
Scenario 6: Systematic missing values
Quality Challenges:
Extreme numeric ranges and precision requirements
Categorical variables with many levels (30 categories)
Special characters and empty strings in categorical data
Systematic missing patterns (every 6th and 7th observation)
Constant values across all observations
Text variables with varying lengths and special characters
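The challenges listed above can be illustrated on a small stand-in data frame. The sketch below uses base R only and does not load the real dataset: the `toy` data frame, its 42 rows, and the reading of "every 6th and 7th observation" as row indices divisible by 6 or 7 are all assumptions made for illustration.

```r
# Toy data frame imitating some of the challenge patterns.
# This is NOT the real dataset; column names are borrowed for readability.
n <- 42
toy <- data.frame(
  numeric_extreme     = c(1e12, -1e12, 0, seq_len(n - 3)),
  constant_numeric    = rep(42, n),
  categorical_special = factor(rep(c("N/A", "Missing", ""), length.out = n))
)

# Assumption: "every 6th and 7th observation" means row indices
# divisible by 6 or by 7.
idx <- seq_len(n)
toy$numeric_extreme[idx %% 6 == 0 | idx %% 7 == 0] <- NA

# Flag columns that hold at most one distinct non-missing value.
is_constant <- vapply(
  toy,
  function(x) length(unique(x[!is.na(x)])) <= 1,
  logical(1)
)

# Count true NA values per column.
n_missing <- colSums(is.na(toy))

# Count values that merely look like missing data but are not NA.
missing_like <- c("N/A", "Missing", "")
n_missing_like <- sum(toy$categorical_special %in% missing_like)
```

Checks of this kind make it easier to tell whether a summary function failed because of the data pattern itself or for an unrelated reason.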
summarytools Integration Testing:
dfSummary: Robustness with problematic data structures
freq: Handling of many categories and special characters
descr: Extreme value processing and missing data management
ctable: Edge case cross-tabulation scenarios
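For orientation, the four summarytools functions named above can also be called directly. The following is a minimal sketch on a hypothetical `toy` data frame (not the real dataset), assuming the summarytools package is installed; it is not how `toolssummary()` necessarily invokes them internally.

```r
# Hypothetical stand-in data; column names mirror the dataset for readability.
toy <- data.frame(
  numeric_extreme  = c(1e12, -1e12, 0, NA, 1e-8, seq_len(25)),
  categorical_many = factor(sprintf("Cat_%02d", 1:30)),
  categorical_few  = factor(rep(c("A", "B"), 15)),
  binary_numeric   = rep(c(0L, 1L, 1L), 10)
)

# Only run the summarytools calls when the package is available.
if (requireNamespace("summarytools", quietly = TRUE)) {
  # Whole-data-frame overview.
  overview <- summarytools::dfSummary(toy)

  # Frequency table for a factor with many levels.
  many_levels <- summarytools::freq(toy$categorical_many)

  # Descriptive statistics for a numeric vector with extreme values and an NA.
  extremes <- summarytools::descr(toy$numeric_extreme)

  # Cross-tabulation of two categorical variables.
  crosstab <- summarytools::ctable(
    toy$categorical_few,
    factor(toy$binary_numeric)
  )
}
```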
Recommended Usage Scenarios:
Robustness testing for all summary functions
Error handling validation
Special character and encoding testing
Missing data pattern analysis
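Error handling validation can be approached by wrapping each summary call so that warnings and errors are recorded instead of stopping the run. The sketch below uses base R only; `safe_summary()` is a hypothetical helper introduced for illustration and is not part of this package or of summarytools.

```r
# Hypothetical helper: apply `fun` to `x` and capture any warning or error
# instead of letting it propagate.
safe_summary <- function(x, fun = summary) {
  tryCatch(
    list(ok = TRUE, value = fun(x), message = NA_character_),
    warning = function(w) {
      list(ok = FALSE, value = NULL, message = conditionMessage(w))
    },
    error = function(e) {
      list(ok = FALSE, value = NULL, message = conditionMessage(e))
    }
  )
}

# Illustrative problem inputs (not taken from the real dataset).
inputs <- list(
  all_missing = rep(NA_real_, 5),
  constant    = rep(42, 5),
  text        = c("a", "", "50% & <tag>", "tab\there")
)

# mean() warns on character input, so the "text" entry is recorded as not ok.
results <- lapply(inputs, safe_summary, fun = function(x) mean(x))

# Collect the outcome per input for a quick pass/fail overview.
outcome <- vapply(results, function(r) r$ok, logical(1))
```

The same wrapper can be pointed at any summary function under test, which keeps one failing variable from hiding the behaviour of the others.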
Examples
if (FALSE) { # \dontrun{
# Load the dataset
data(toolssummary_edge_cases)

# Robustness testing with edge cases
result <- toolssummary(
  data = toolssummary_edge_cases,
  vars = c("numeric_extreme", "categorical_many", "text_variable"),
  useSummarytools = TRUE,
  showDfSummary = TRUE,
  showFreq = TRUE
)

# Scenario-based analysis
result_scenario <- toolssummary(
  data = toolssummary_edge_cases,
  vars = c("numeric_extreme", "categorical_special"),
  groupVar = "scenario",
  useSummarytools = TRUE,
  showCrosstabs = TRUE
)
} # }