Outlier detection using multiple statistical methods from the easystats performance package. The function supports univariate methods (z-scores, IQR, equal-tailed and highest density intervals), multivariate methods (Mahalanobis distance, MCD, OPTICS, LOF), and composite scoring across several algorithms. It complements the existing data quality assessment modules and is intended for quality control and preprocessing of clinical research data.
Value
A jamovi analysis object containing the outlier detection results: tables, plots, and interpretation text, depending on the selected options.
Details
The outlier detection module supports four main categories of methods:
Univariate Methods:
Robust Z-Score (MAD-based): Uses median absolute deviation for robust standardization
Standard Z-Score: Classical z-score based on mean and standard deviation
Interquartile Range (IQR): Tukey's method using quartiles and IQR multiplier
Equal-Tailed Interval (ETI): Flags values outside an interval that leaves equal probability in each tail
Highest Density Interval (HDI): Flags values outside the narrowest interval containing the specified probability mass
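As a rough illustration of why the robust z-score and IQR rules are listed alongside the classical z-score, the sketch below applies all three to a small made-up vector using base R only. The data, the helper `iqr_flags()`, and the variable names are hypothetical and are not taken from the module's internals.

```r
# Hypothetical lab values with one implausible entry (illustrative data only)
x <- c(13.2, 14.1, 12.8, 13.9, 14.4, 13.5, 12.9, 14.0, 13.7, 31.0)

# Robust z-score: centre on the median and scale by the MAD
# (stats::mad() applies the 1.4826 consistency constant by default)
robust_z <- (x - median(x)) / mad(x)

# Classical z-score: the extreme value inflates both the mean and the SD
classic_z <- (x - mean(x)) / sd(x)

# Tukey-style IQR fences with an adjustable multiplier k (hypothetical helper)
iqr_flags <- function(v, k = 1.5) {
  q <- quantile(v, c(0.25, 0.75), names = FALSE)
  fence <- k * (q[2] - q[1])
  v < q[1] - fence | v > q[2] + fence
}

flag_robust  <- abs(robust_z) > 3.29
flag_classic <- abs(classic_z) > 3.29
flag_iqr     <- iqr_flags(x, k = 1.7)

# Compare the three rules side by side
data.frame(x, flag_robust, flag_classic, flag_iqr)
```

In this small sample the classical z-score does not reach 3.29 for the last value because that value itself inflates the standard deviation, while the MAD-based score and the IQR fences both flag it.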
Multivariate Methods:
Mahalanobis Distance: Classical multivariate distance accounting for covariance
Robust Mahalanobis Distance: Mahalanobis distance computed from a robust estimate of location and covariance
Minimum Covariance Determinant (MCD): Distances based on the covariance of the most concentrated subset of observations
OPTICS Clustering: Density-based clustering approach
Local Outlier Factor (LOF): Local density deviation method
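The classical Mahalanobis approach in the list above can be sketched with `stats::mahalanobis()` and a chi-squared cutoff. This is a minimal base-R illustration on made-up data, not the module's implementation; the column names `marker_a` and `marker_b` and the 0.999 cutoff level are assumptions chosen for the example.

```r
# Two strongly related hypothetical markers, plus one row (row 21) that is
# unremarkable on each marker separately but breaks the joint pattern
x <- cbind(
  marker_a = c(1:20, 5),
  marker_b = c(1:20 + rep(c(0.5, -0.5), 10), 16)
)

# Squared Mahalanobis distances from the classical mean and covariance
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# Under multivariate normality d2 is approximately chi-squared
# with df equal to the number of variables
cutoff  <- qchisq(0.999, df = ncol(x))
flagged <- d2 > cutoff

# Column-wise z-scores for comparison: no single marker looks extreme
max_abs_z <- max(abs(scale(x)))

which(flagged)
```

Only the last row exceeds the cutoff, although none of its column-wise z-scores is large. This is the kind of joint outlier that univariate rules cannot see.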
Composite Methods: Combine multiple algorithms for robust detection with adjustable thresholds
All Methods: Comprehensive analysis using all available techniques
Method Selection Guidelines
Univariate: When variables are analyzed independently and a simple interpretation is needed
Multivariate: When relationships between variables matter and outliers may only be visible jointly
Composite: When robust detection across different data patterns is needed
All: For comprehensive analysis and method comparison
Threshold Recommendations
Z-Score: 3.29 (two-sided 99.9% coverage under normality, so roughly 0.1% of values are flagged)
IQR Multiplier: 1.7 (more conservative than Tukey's 1.5)
Confidence Level: 0.999 (99.9% for interval methods)
Composite Threshold: 0.5 (outliers detected by ≥50% of methods)
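The relationship between these defaults can be checked with a few lines of base R. The sketch below derives the z-score cutoff from the confidence level and shows one plausible reading of the composite threshold, namely the share of methods that flag an observation. The `flags` matrix is invented for illustration, and the module's actual scoring may differ in detail.

```r
# A 0.999 two-sided confidence level corresponds to a z cutoff of about 3.29
conf_level <- 0.999
z_cut <- qnorm(1 - (1 - conf_level) / 2)

# Hypothetical per-method flags for five observations (rows)
flags <- cbind(
  zscore_robust = c(FALSE, TRUE, FALSE, TRUE,  FALSE),
  iqr           = c(FALSE, TRUE, FALSE, FALSE, FALSE),
  mahalanobis   = c(FALSE, TRUE, TRUE,  FALSE, FALSE)
)

# Composite score: proportion of methods that flag each observation
composite_score <- rowMeans(flags)

# An observation counts as an outlier when at least half the methods agree
is_outlier <- composite_score >= 0.5

round(z_cut, 2)
composite_score
```

With three methods, a threshold of 0.5 requires agreement from at least two of them, so observations flagged by a single method are retained.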
Clinical Applications
Laboratory Data: CBC, chemistry panels, liver function tests
Anthropometric Data: Height, weight, BMI measurements
Physiological Data: Blood pressure, heart rate, temperature
Biomarker Data: Protein levels, genetic markers, metabolites
Quality Control: Data entry errors, instrument malfunctions
Output Components
Outlier Table: Detailed results with outlier scores and classifications
Method Comparison: Performance across different detection algorithms
Exclusion Summary: Recommendations for data cleaning procedures
Visualization: Plots showing outlier patterns and distributions
Interpretation: Detailed guidance on results and methodology
Statistical Considerations
Sample Size: Minimum 30 observations recommended for robust results
Distribution: Robust methods handle non-normal distributions better
Missing Data: Complete-case analysis is performed automatically (rows with missing values in the selected variables are excluded)
Correlations: Multivariate methods account for variable relationships
False Positives: Conservative thresholds reduce over-detection
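Because complete-case handling changes which rows are analysed, it can help to check how many observations remain before interpreting the results. The following base-R sketch uses a made-up data frame. The 30-observation check mirrors the recommendation above, and the variable names are hypothetical.

```r
# Hypothetical data with missing values in different rows
dat <- data.frame(
  hemoglobin = c(13.2, NA, 14.1, 12.8),
  glucose    = c(5.4, 5.9, NA, 6.1)
)

# Keep only rows with no missing value in any selected variable
keep      <- complete.cases(dat)
analysed  <- dat[keep, , drop = FALSE]
n_dropped <- sum(!keep)

# Compare the remaining sample size with the recommended minimum
enough_data <- nrow(analysed) >= 30

rownames(analysed)  # original row labels of the retained cases
```

Keeping the original row labels makes it possible to map flagged observations back to the source data after rows have been dropped.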
References
Lüdecke, D., Ben-Shachar, M., Patil, I., Waggoner, P., & Makowski, D. (2021). performance: An R Package for Assessment, Comparison and Testing of Statistical Models. Journal of Open Source Software, 6(60), 3139. https://doi.org/10.21105/joss.03139
Rousseeuw, P. J., & Hubert, M. (2018). Anomaly detection by robust statistics. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 8(2), e1236.
Breunig, M. M., Kriegel, H.-P., Ng, R. T., & Sander, J. (2000). LOF: Identifying density-based local outliers. ACM SIGMOD Record, 29(2), 93-104.
See also
check_outliers (performance package) for the underlying outlier detection functions
Super classes
jmvcore::Analysis -> ClinicoPath::outlierdetectionBase -> outlierdetectionClass
Methods
Inherited methods
jmvcore::Analysis$.createImage()
jmvcore::Analysis$.createImages()
jmvcore::Analysis$.createPlotObject()
jmvcore::Analysis$.load()
jmvcore::Analysis$.render()
jmvcore::Analysis$.save()
jmvcore::Analysis$.savePart()
jmvcore::Analysis$.setCheckpoint()
jmvcore::Analysis$.setParent()
jmvcore::Analysis$.setReadDatasetHeaderSource()
jmvcore::Analysis$.setReadDatasetSource()
jmvcore::Analysis$.setResourcesPathSource()
jmvcore::Analysis$.setStatePathSource()
jmvcore::Analysis$addAddon()
jmvcore::Analysis$asProtoBuf()
jmvcore::Analysis$asSource()
jmvcore::Analysis$check()
jmvcore::Analysis$init()
jmvcore::Analysis$optionsChangedHandler()
jmvcore::Analysis$postInit()
jmvcore::Analysis$print()
jmvcore::Analysis$readDataset()
jmvcore::Analysis$run()
jmvcore::Analysis$serialize()
jmvcore::Analysis$setError()
jmvcore::Analysis$setStatus()
jmvcore::Analysis$translate()
ClinicoPath::outlierdetectionBase$initialize()
Examples
if (FALSE) { # \dontrun{
  # Example 1: Basic univariate outlier detection
  # Load clinical data
  data(clinical_data)

  # Detect outliers using robust z-score
  outlierdetection(
    data = clinical_data,
    vars = c("hemoglobin", "glucose", "creatinine"),
    method_category = "univariate",
    univariate_methods = "zscore_robust",
    zscore_threshold = 3.29,
    show_outlier_table = TRUE,
    show_visualization = TRUE
  )

  # Example 2: Multivariate outlier detection
  # Detect multivariate outliers in biomarker data
  outlierdetection(
    data = biomarker_data,
    vars = c("protein_1", "protein_2", "protein_3"),
    method_category = "multivariate",
    multivariate_methods = "mahalanobis",
    show_method_comparison = TRUE,
    show_exclusion_summary = TRUE
  )

  # Example 3: Composite outlier detection
  # Robust detection using multiple methods
  outlierdetection(
    data = patient_data,
    vars = c("age", "weight", "height", "bmi"),
    method_category = "composite",
    composite_threshold = 0.6,
    show_outlier_table = TRUE,
    show_interpretation = TRUE
  )

  # Example 4: Comprehensive analysis
  # Compare all available methods
  outlierdetection(
    data = lab_data,
    vars = c("alt", "ast", "bilirubin", "albumin"),
    method_category = "all",
    show_method_comparison = TRUE,
    show_exclusion_summary = TRUE,
    show_visualization = TRUE
  )
} # }