
Comprehensive feature quality assessment for clinical and pathology research data. Provides data quality control checks, including missing data analysis, outlier detection, distribution analysis, correlation assessment, and variance evaluation. Reports multiple quality metrics together with clinical interpretation and actionable recommendations. Intended for verifying data integrity before statistical analysis and for supporting the data quality requirements of clinical research regulations.

Usage

featurequality(
  data,
  features,
  group_var = NULL,
  analysis_scope = "comprehensive",
  missing_data_analysis = TRUE,
  distribution_analysis = TRUE,
  outlier_detection = TRUE,
  outlier_method = "multiple",
  outlier_threshold = 3,
  correlation_analysis = TRUE,
  correlation_threshold = 0.8,
  variance_analysis = TRUE,
  low_variance_threshold = 0.01,
  normality_testing = TRUE,
  normality_method = "multiple",
  skewness_analysis = TRUE,
  feature_importance = FALSE,
  importance_method = "random_forest",
  data_transformation = TRUE,
  quality_score = TRUE,
  detailed_plots = TRUE,
  plot_distributions = TRUE,
  plot_correlations = TRUE,
  plot_outliers = TRUE,
  plot_missing = TRUE,
  export_recommendations = FALSE,
  clinical_context = TRUE,
  batch_processing = TRUE,
  confidence_level = 0.95,
  random_seed = 123
)

Arguments

data

.

features

.

group_var

.

analysis_scope

.

missing_data_analysis

.

distribution_analysis

.

outlier_detection

.

outlier_method

.

outlier_threshold

.

correlation_analysis

.

correlation_threshold

.

variance_analysis

.

low_variance_threshold

.

normality_testing

.

normality_method

.

skewness_analysis

.

feature_importance

.

importance_method

.

data_transformation

.

quality_score

.

detailed_plots

.

plot_distributions

.

plot_correlations

.

plot_outliers

.

plot_missing

.

export_recommendations

.

clinical_context

.

batch_processing

.

confidence_level

.

random_seed

.
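Since the arguments above carry no descriptions, the following base R sketch illustrates one plausible reading of the threshold defaults shown in Usage. It is not the function's implementation: the z-score rule for `outlier_threshold`, the Pearson correlation for `correlation_threshold`, the raw-variance cutoff for `low_variance_threshold`, and the simulated `toy` data frame are all assumptions made for illustration.

```r
# Illustrative sketch only; every rule below is an assumption,
# not the documented behaviour of featurequality().
set.seed(123)  # mirrors the random_seed default

n <- 100
toy <- data.frame(
  biomarker1 = rnorm(n, mean = 50, sd = 10),
  age        = rnorm(n, mean = 60, sd = 12),
  constant   = rep(1, n)                      # zero-variance feature
)
toy$biomarker2 <- 2 * toy$biomarker1 + rnorm(n, sd = 1)  # strongly correlated
toy$biomarker1[c(3, 17)] <- NA                # inject missing values
toy$age[5] <- 200                             # inject an implausible age

# Missing data: proportion of NA values per feature
missing_prop <- colMeans(is.na(toy))

# Outliers (assumed rule): absolute z-score above outlier_threshold
outlier_threshold <- 3
z_age <- (toy$age - mean(toy$age, na.rm = TRUE)) / sd(toy$age, na.rm = TRUE)
outlier_rows <- which(abs(z_age) > outlier_threshold)

# Correlation (assumed rule): |Pearson r| above correlation_threshold
correlation_threshold <- 0.8
cors <- cor(toy[, c("biomarker1", "biomarker2", "age")],
            use = "pairwise.complete.obs")
high <- which(abs(cors) > correlation_threshold & upper.tri(cors),
              arr.ind = TRUE)
high_pairs <- data.frame(
  feature_a = rownames(cors)[high[, "row"]],
  feature_b = colnames(cors)[high[, "col"]],
  r         = cors[high]
)

# Variance (assumed rule): sample variance below low_variance_threshold
low_variance_threshold <- 0.01
variances <- vapply(toy, var, numeric(1), na.rm = TRUE)
low_var <- names(variances)[variances < low_variance_threshold]
```

Inspecting `missing_prop`, `outlier_rows`, `high_pairs`, and `low_var` shows which features each rule would flag. A variance cutoff on the raw scale depends on the units of each feature, so the actual function may well scale features first.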

Value

A results object containing:

results$instructions                    a html
results$progress                        a html
results$quality_summary                 a table
results$missing_data_analysis           a table
results$distribution_analysis           a table
results$outlier_analysis                a table
results$correlation_analysis            a table
results$variance_analysis               a table
results$normality_analysis              a table
results$feature_importance              a table
results$transformation_recommendations  a table
results$overall_recommendations         a html
results$clinical_interpretation         a html
results$distribution_plot               an image
results$correlation_plot                an image
results$outlier_plot                    an image
results$missing_plot                    an image
results$quality_dashboard               an image

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$quality_summary$asDF

as.data.frame(results$quality_summary)

Examples

data('clinical_data')

featurequality(
    data = clinical_data,
    features = c("biomarker1", "biomarker2", "age", "severity"),
    outlier_detection = TRUE,
    correlation_analysis = TRUE
)
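The normality, skewness, and transformation options in the signature can be pictured with a small base R sketch. This is a hedged illustration rather than the function's method: the Shapiro-Wilk test, the moment-based `skewness()` helper, the `log1p` transformation, and the reading of `confidence_level = 0.95` as a 0.05 significance level are all assumptions, and the data are simulated.

```r
# Illustrative sketch only; the test, the helper, and the transformation
# are assumptions, not the documented behaviour of featurequality().
set.seed(123)

x_normal <- rnorm(200, mean = 50, sd = 10)   # roughly symmetric feature
x_skewed <- rexp(200, rate = 0.1)            # right-skewed feature

# Hypothetical helper: moment-based sample skewness
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Assumed link between confidence_level and the test's significance level
confidence_level <- 0.95
alpha <- 1 - confidence_level

sw_normal <- shapiro.test(x_normal)
sw_skewed <- shapiro.test(x_skewed)

summary_tbl <- data.frame(
  feature   = c("x_normal", "x_skewed"),
  skewness  = c(skewness(x_normal), skewness(x_skewed)),
  shapiro_p = c(sw_normal$p.value, sw_skewed$p.value)
)
summary_tbl$reject_normality <- summary_tbl$shapiro_p < alpha

# One candidate transformation for a right-skewed, non-negative feature
skew_after_log <- skewness(log1p(x_skewed))

print(summary_tbl)
```

Comparing `skew_after_log` with the skewness of `x_skewed` gives a quick check on whether a log-type transformation is worth recommending for a given feature.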