PCA Component Significance Test — pcacomponenttest • ClinicoPath

Usage

pcacomponenttest(
  data,
  vars,
  ncomp = 5,
  nperm = 1000,
  center = TRUE,
  scale = TRUE,
  conflevel = 0.95,
  adjustmethod = "BH",
  showpercent = TRUE,
  colororiginal = "steelblue",
  colorpermuted = "orange",
  plotwidth = 600,
  plotheight = 450
)

Arguments

data

The data as a data frame.

vars

Continuous variables to include in Principal Component Analysis. Select at least 3 numeric variables.

ncomp

Number of principal components to test for significance (1 to 20). Testing will be performed for PC1 through PCncomp.

nperm

Number of permutations used to generate the null distribution (100 to 10000). Higher values give finer-grained p-values but take longer. The minimum attainable p-value is 1/(nperm+1), so reporting p<0.001 requires nperm>=1000.

center

Center variables to have mean = 0 before PCA. Recommended: TRUE for most analyses.

scale

Scale variables to have standard deviation = 1 before PCA. Recommended: TRUE when variables have different units or scales.

conflevel

Confidence level for confidence intervals (0.80-0.99). Default: 0.95 (95%).

adjustmethod

Method for adjusting p-values for multiple testing. BH (Benjamini-Hochberg) controls the false discovery rate.

showpercent

Display variance accounted for (VAF) as a percentage (0-100) instead of a proportion (0-1).

colororiginal

Color for the original VAF line/points. Use color names or hex codes.

colorpermuted

Color for the permuted VAF line/points. Use color names or hex codes.

plotwidth

Width of the plot in pixels.

plotheight

Height of the plot in pixels.
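The interaction between nperm and adjustmethod can be illustrated with base R alone. The sketch below tabulates the 1/(nperm+1) floor for a few permutation counts and then applies a Benjamini-Hochberg adjustment with stats::p.adjust. The raw p-values are made up for illustration, and the use of p.adjust here is an assumption about how such an adjustment could be computed, not a statement about the package internals.

```r
# Smallest attainable permutation p-value for several values of nperm,
# using the 1/(nperm + 1) floor described for the nperm argument.
nperm <- c(100, 1000, 10000)
min_p <- 1 / (nperm + 1)
print(data.frame(nperm = nperm, min_p = signif(min_p, 3)))

# Hypothetical raw p-values for PC1..PC5 (invented for this illustration).
raw_p <- c(0.001, 0.004, 0.030, 0.200, 0.600)

# Benjamini-Hochberg adjustment, the method named by adjustmethod = "BH".
adj_p <- p.adjust(raw_p, method = "BH")
print(data.frame(component = paste0("PC", seq_along(raw_p)),
                 raw_p = raw_p,
                 adj_p = adj_p))
```

With nperm = 100 the floor sits above 0.001, so no component could reach p<0.001 regardless of how strong its signal is. Adjusted p-values are never smaller than the raw ones.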

Value

A results object containing:

`results$todo`	an HTML element
`results$results`	Statistical significance of principal components based on permutation testing
`results$vafplot`	Visualization comparing original VAF to the permuted null distribution

Tables can be converted to data frames with asDF or as.data.frame. For example: `results$results$asDF` or `as.data.frame(results$results)`.

Details

Performs permutation-based significance testing to determine which principal components explain more variance than expected by random chance. This provides an objective, hypothesis-tested approach to component retention.

The test uses nonparametric permutation where variables are shuffled independently to break data structure, generating a null distribution of variance accounted for (VAF). Components with VAF significantly higher than the null distribution are retained.

This implements the method described in Buja & Eyuboglu (1992) and is based on the syndRomics package approach.

References

Buja A, Eyuboglu N. (1992). Remarks on Parallel Analysis. Multivariate Behavioral Research, 27(4):509-540.

Torres-Espin A, Chou A, Huie JR, et al. (2021). Reproducible analysis of disease space via principal components using the novel R package syndRomics. eLife, 10:e61812.

Examples

# Load the package that provides pcacomponenttest()
library(ClinicoPath)

# Example with mtcars dataset
data("mtcars")

# Test significance of first 5 principal components
pcacomponenttest(
  data = mtcars,
  vars = c("mpg", "disp", "hp", "drat", "wt", "qsec"),
  ncomp = 5,
  nperm = 1000,
  center = TRUE,
  scale = TRUE,
  conflevel = 0.95,
  adjustmethod = "BH"
)
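To make the permutation idea concrete, here is a minimal base R sketch of the procedure described on this page: each variable is shuffled independently, PCA is rerun, and the observed VAF of each component is compared with its permuted counterpart. This is not the package's implementation. The helper vaf(), the object names, the one-sided p-value formula with its +1 correction, and the small nperm are all assumptions chosen for illustration.

```r
# Minimal sketch of a permutation test for PCA component VAF.
# Not the package's code; all names below are hypothetical.
set.seed(42)

vars <- c("mpg", "disp", "hp", "drat", "wt", "qsec")
x <- mtcars[, vars]
ncomp <- 5
nperm <- 200  # kept small so the sketch runs quickly

# Proportion of variance accounted for (VAF) by the first ncomp components,
# computed on centered and scaled variables.
vaf <- function(d) {
  sdev <- prcomp(d, center = TRUE, scale. = TRUE)$sdev
  (sdev^2 / sum(sdev^2))[seq_len(ncomp)]
}

observed <- vaf(x)

# Shuffle every column independently to break the correlation structure,
# then recompute VAF. perm_vaf has one row per permutation.
perm_vaf <- t(replicate(nperm, vaf(as.data.frame(lapply(x, sample)))))

# One-sided p-value with a +1 correction (assumed form), so the smallest
# possible value is 1 / (nperm + 1).
exceed <- colSums(sweep(perm_vaf, 2, observed, ">="))
p_raw <- (exceed + 1) / (nperm + 1)
p_adj <- p.adjust(p_raw, method = "BH")

print(data.frame(component = paste0("PC", seq_len(ncomp)),
                 vaf = round(observed, 3),
                 mean_perm_vaf = round(colMeans(perm_vaf), 3),
                 p_raw = round(p_raw, 4),
                 p_adj = round(p_adj, 4)))
```

Shuffling columns independently preserves each variable's marginal distribution while removing the correlations between variables, which is what lets the permuted VAF serve as a null reference for each component.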