Enhanced Factor Variable Analysis with BlueSky Integration
Source: R/enhancedfactorvariable.h.R
enhancedfactorvariable.Rd
Enhanced factor variable analysis toolkit with BlueSky integration for profiling categorical variables. It reports detailed factor level counts, a top-N level display, summary statistics, and visualizations. It is intended for exploring categorical pathology variables, biomarker categories, and clinical classification systems, with options for filtering and display.
Usage
enhancedfactorvariable(
data,
vars,
showOnlyTopFactors = TRUE,
maxTopFactors = 30,
sortingMethod = "freq_desc",
showDatasetOverview = TRUE,
showNominalSummary = TRUE,
showDetailedLevels = TRUE,
showCombinedAnalysis = FALSE,
includePercentages = TRUE,
includeValidPercentages = TRUE,
includeCumulativeStats = FALSE,
includeMissing = TRUE,
missingLabel = "<Missing>",
minimumCount = 1,
minimumPercentage = 0,
excludeRareFactors = FALSE,
rareFactorThreshold = 1,
rareFactorLabel = "Other",
factorComplexityAnalysis = FALSE,
levelBalanceAnalysis = FALSE,
bluesky_integration = TRUE,
comprehensive_output = FALSE,
clinical_interpretation = TRUE,
createVisualizations = TRUE,
plotTopFactorsOnly = TRUE,
plotOrientation = "horizontal",
plotTheme = "clinical",
outputFormat = "standard",
includeMethodology = FALSE
)
Arguments
- data
The data as a data frame.
- vars
Variables whose factor levels are analyzed in detail; non-factor variables are converted to factors if needed
- showOnlyTopFactors
Display only the most frequent factor levels instead of all levels
- maxTopFactors
Number of top factor levels to display when limiting output
- sortingMethod
Method for ordering factor levels in output tables
- showDatasetOverview
Display dataset summary with variable counts and observations
- showNominalSummary
Display summary statistics for all factor variables
- showDetailedLevels
Display detailed counts for each factor level
- showCombinedAnalysis
Display all factors in a single combined table
- includePercentages
Calculate and display percentages for factor levels
- includeValidPercentages
Calculate percentages excluding missing values
- includeCumulativeStats
Calculate and display cumulative frequencies and percentages
- includeMissing
Include missing values (NA) in factor level counts
- missingLabel
Label to display for missing factor levels
- minimumCount
Hide factor levels with counts below this threshold
- minimumPercentage
Hide factor levels with percentages below this threshold
- excludeRareFactors
Group infrequent factor levels into a single category (labeled by rareFactorLabel, "Other" by default)
- rareFactorThreshold
Percentage threshold below which factor levels are considered rare
- rareFactorLabel
Label for grouped rare factor levels
- factorComplexityAnalysis
Analyze factor complexity with entropy and diversity measures
- levelBalanceAnalysis
Analyze balance between factor levels with statistical measures
- bluesky_integration
Use BlueSky R statistical environment features and algorithms
- comprehensive_output
Include comprehensive statistical details and diagnostics
- clinical_interpretation
Provide clinical context for factor variable analysis results
- createVisualizations
Generate plots for factor variable analysis
- plotTopFactorsOnly
Limit plots to top factor levels only
- plotOrientation
Orientation for factor level bar plots
- plotTheme
Visual theme for factor analysis plots
- outputFormat
Output format style for tables and results
- includeMethodology
Include detailed methodology and references in output
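The counting options above (includePercentages, includeValidPercentages, includeCumulativeStats, includeMissing, excludeRareFactors) can be illustrated with a small base-R sketch. This is not the function's implementation: `level_table` is a hypothetical helper, and the choice to judge rarity against valid (non-missing) observations is an assumption made for illustration.

```r
# Hypothetical helper, not part of the documented function.
# Assumes x contains at least one non-missing value.
level_table <- function(x, rare_threshold = 1, rare_label = "Other",
                        missing_label = "<Missing>") {
  x <- as.character(x)
  n_total <- length(x)
  n_valid <- sum(!is.na(x))

  # Percentage of valid observations per level (table() drops NA by default)
  valid_pct <- table(x) / n_valid * 100

  # Lump levels rarer than the threshold into a single label (assumed rule)
  rare <- names(valid_pct)[valid_pct < rare_threshold]
  x[x %in% rare] <- rare_label

  # Count levels and sort by descending frequency
  counts <- sort(table(x), decreasing = TRUE)
  out <- data.frame(
    level = names(counts),
    count = as.integer(counts),
    stringsAsFactors = FALSE
  )
  out$percent <- out$count / n_total * 100        # share of all rows
  out$valid_percent <- out$count / n_valid * 100  # share of non-missing rows
  out$cumulative_percent <- cumsum(out$valid_percent)

  # Append a row for missing values, if any
  n_missing <- n_total - n_valid
  if (n_missing > 0) {
    out <- rbind(out, data.frame(
      level = missing_label,
      count = n_missing,
      percent = n_missing / n_total * 100,
      valid_percent = NA_real_,
      cumulative_percent = NA_real_,
      stringsAsFactors = FALSE
    ))
  }
  out
}

grade <- c("A", "A", "A", "A", "B", "B", "B", "C", NA, NA)
level_table(grade, rare_threshold = 20)
```

In this sketch, percent uses all rows as the denominator while valid_percent uses only non-missing rows, which mirrors the distinction between includePercentages and includeValidPercentages described above.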
Value
A results object containing:
results$results$instructions | an HTML output
results$results$datasetOverview | Summary of dataset characteristics for factor analysis
results$results$nominalSummary | Summary statistics for all factor variables
results$results$detailedLevelsAnalysis | Detailed counts and percentages for each factor level
results$results$combinedFactorAnalysis | All factor variables combined in a single analysis table
results$results$complexityAnalysis | Complexity metrics for factor variables, including entropy and diversity
results$results$levelBalanceAnalysis | Balance metrics for factor level distributions
results$results$comprehensiveAnalysisSummary | Enhanced statistical summary with BlueSky integration
results$results$clinicalInterpretationGuide | an HTML output
results$results$methodsExplanation | an HTML output
results$results$factorDistributionPlot | Overview of factor level distributions across all variables
results$results$topFactorsPlot | Bar plot showing the most frequent factor levels
results$results$complexityPlot | Visualization of factor complexity metrics
results$results$balanceAnalysisPlot | Visualization of factor level balance across variables
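The complexity and balance outputs mention entropy and diversity measures without defining them. As a rough illustration only, the sketch below computes Shannon entropy and a normalized evenness score in base R. `level_entropy` is a hypothetical helper, and the measures the function actually reports may differ.

```r
# Hypothetical helper; the documented function may use different measures.
level_entropy <- function(x) {
  p <- as.numeric(prop.table(table(x)))  # level proportions, NA excluded
  p <- p[p > 0]                          # drop unused factor levels
  k <- length(p)
  entropy <- -sum(p * log2(p))           # Shannon entropy in bits
  # Evenness: entropy relative to its maximum log2(k); 1 means balanced levels
  evenness <- if (k > 1) entropy / log2(k) else NA_real_
  c(levels = k, entropy = entropy, evenness = evenness)
}

level_entropy(c("a", "a", "b", "b"))
level_entropy(c("a", "a", "a", "b"))
```

Higher entropy indicates more levels or a more even spread across them, while evenness near 0 flags a variable dominated by one level.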