Usage
pcacomponenttest(
data,
vars,
ncomp = 5,
nperm = 1000,
center = TRUE,
scale = TRUE,
conflevel = 0.95,
adjustmethod = "BH",
showpercent = TRUE,
colororiginal = "steelblue",
colorpermuted = "orange",
plotwidth = 600,
plotheight = 450
)
Arguments
- data
The data as a data frame.
- vars
Continuous variables to include in Principal Component Analysis. Select at least 3 numeric variables.
- ncomp
Number of principal components to test for significance (1 to 20). Testing will be performed for PC1 through PCncomp.
- nperm
Number of permutations used to generate the null distribution (100-10000). Higher values give more precise p-values but take longer. The smallest attainable p-value is 1/(nperm+1), so use nperm >= 1000 to resolve p < 0.001.
- center
Center variables to have mean = 0 before PCA. Recommended: TRUE for most analyses.
- scale
Scale variables to have standard deviation = 1 before PCA. Recommended: TRUE when variables have different units or scales.
- conflevel
Confidence level for confidence intervals (0.80-0.99). Default: 0.95 for 95% confidence intervals.
- adjustmethod
Method for adjusting p-values for multiple testing. BH (Benjamini-Hochberg) controls the false discovery rate.
- showpercent
Display variance accounted for (VAF) as a percentage (0-100) instead of a proportion (0-1).
- colororiginal
Color for the original VAF line/points. Use color names or hex codes.
- colorpermuted
Color for the permuted VAF line/points. Use color names or hex codes.
- plotwidth
Width of the plot in pixels.
- plotheight
Height of the plot in pixels.
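The nperm and adjustmethod arguments interact: nperm sets the smallest p-value a permutation test can report, and the adjustment then inflates the raw p-values to account for testing several components. The following is a minimal base R sketch of both effects. The raw p-values are made up for illustration, and it assumes (this page does not confirm it) that the adjustment behaves like base R's p.adjust().

```r
# Smallest attainable permutation p-value for several nperm settings,
# using the 1 / (nperm + 1) bound stated for the nperm argument.
nperm <- c(100, 1000, 10000)
min_p <- 1 / (nperm + 1)
print(data.frame(nperm = nperm, min_p = min_p))

# Benjamini-Hochberg adjustment with base R's p.adjust().
# These raw p-values are invented for illustration only.
raw_p <- c(PC1 = 0.001, PC2 = 0.004, PC3 = 0.030, PC4 = 0.200, PC5 = 0.600)
adj_p <- p.adjust(raw_p, method = "BH")
print(adj_p)
```

Adjusted p-values are never smaller than the raw ones, so a component that is borderline before adjustment may no longer be significant afterwards.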
Value
A results object containing:
results$todo | an html
results$results | Statistical significance of principal components based on permutation testing
results$vafplot | Visualization comparing original VAF to permuted null distribution
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$results$asDF
as.data.frame(results$results)
Details
Performs permutation-based significance testing to determine which principal components explain more variance than expected by chance. This provides an objective, hypothesis-tested approach to component retention.
The test is a nonparametric permutation test: the values of each variable are shuffled independently across observations, which breaks the correlation structure among variables while preserving each variable's own distribution. Repeating PCA on many shuffled data sets generates a null distribution of variance accounted for (VAF). Components whose VAF is significantly higher than the null distribution are retained.
This implements the method described in Buja & Eyuboglu (1992) and is based on the syndRomics package approach.
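The permutation procedure described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: the helper names vaf() and perm_vaf_test() are hypothetical, the p-value is a plain one-sided count with a +1 correction, and confidence intervals and multiple-testing adjustment are omitted.

```r
# Hypothetical helper (not part of the package): proportion of variance
# accounted for (VAF) by each principal component.
vaf <- function(x, center = TRUE, scale = TRUE) {
  sdev <- prcomp(x, center = center, scale. = scale)$sdev
  sdev^2 / sum(sdev^2)
}

# Hypothetical helper: permutation test for the first ncomp components.
perm_vaf_test <- function(x, ncomp = 3, nperm = 200) {
  x <- as.matrix(x)
  observed <- vaf(x)[seq_len(ncomp)]

  # Each row of `null` holds the VAF from one permuted data set.
  null <- matrix(NA_real_, nrow = nperm, ncol = ncomp)
  for (i in seq_len(nperm)) {
    # Shuffle every column independently: correlations are destroyed,
    # but each variable keeps its own marginal distribution.
    shuffled <- apply(x, 2, sample)
    null[i, ] <- vaf(shuffled)[seq_len(ncomp)]
  }

  # Count permuted VAFs at least as large as the observed VAF.
  exceed <- vapply(
    seq_len(ncomp),
    function(j) sum(null[, j] >= observed[j]),
    numeric(1)
  )

  data.frame(
    component = paste0("PC", seq_len(ncomp)),
    observed  = observed,
    null_mean = colMeans(null),
    # The +1 correction makes the smallest p-value 1 / (nperm + 1).
    p_value   = (exceed + 1) / (nperm + 1)
  )
}

set.seed(1)
vars <- c("mpg", "disp", "hp", "drat", "wt", "qsec")
res <- perm_vaf_test(mtcars[, vars], ncomp = 3, nperm = 200)
print(res)
```

Because each shuffled data set has essentially uncorrelated variables, its leading components account for only slightly more than an equal share of the variance. An observed component is flagged when its VAF sits in the upper tail of that null distribution.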
References
Buja A, Eyuboglu N. (1992). Remarks on Parallel Analysis. Multivariate Behavioral Research, 27(4):509-540.
Torres-Espin A, Chou A, Huie JR, et al. (2021). Reproducible analysis of disease space via principal components using the novel R package syndRomics. eLife, 10:e61812.
Examples
# Example with mtcars dataset
data("mtcars")

# Test significance of first 5 principal components
pcacomponenttest(
    data = mtcars,
    vars = c("mpg", "disp", "hp", "drat", "wt", "qsec"),
    ncomp = 5,
    nperm = 1000,
    center = TRUE,
    scale = TRUE,
    conflevel = 0.95,
    adjustmethod = "BH"
)