
Fits a high-dimensional Cox proportional hazards regression with a regularization penalty (LASSO, Ridge, Elastic Net, or Adaptive LASSO) for survival data with many predictors. It is intended for settings where the number of variables is large relative to the number of observations, as in genomic, proteomic, or other high-dimensional clinical datasets. There, penalization limits overfitting and, for the LASSO-type penalties, performs variable selection.
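As a rough illustration of how the listed penalties relate to one another, the sketch below evaluates an elastic net penalty in base R. It assumes the parameterization used by glmnet, `lambda * (alpha * sum(|beta|) + (1 - alpha) / 2 * sum(beta^2))`; whether highdimcox uses exactly this form internally is not confirmed by this page. The function name `elastic_net_penalty` and its `weights` argument are hypothetical and exist only for this example.

```r
# Minimal sketch, not the package's implementation.
# Assumes glmnet's elastic net parameterization.
# `weights` is a hypothetical per-coefficient weight, as used by
# adaptive-LASSO-style penalties; it defaults to equal weights.
elastic_net_penalty <- function(beta, lambda, alpha,
                                weights = rep(1, length(beta))) {
  l1_part <- alpha * abs(beta)            # LASSO component
  l2_part <- (1 - alpha) / 2 * beta^2     # ridge component
  lambda * sum(weights * (l1_part + l2_part))
}

beta <- c(2, -1, 0)

elastic_net_penalty(beta, lambda = 1, alpha = 1)    # LASSO:       3
elastic_net_penalty(beta, lambda = 1, alpha = 0)    # ridge:       2.5
elastic_net_penalty(beta, lambda = 1, alpha = 0.5)  # elastic net: 2.75
```

Setting `alpha = 1` with unequal `weights` gives a weighted L1 penalty of the kind used by the adaptive LASSO.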

Usage

highdimcox(
  data,
  elapsedtime,
  outcome,
  predictors,
  outcomeLevel = "1",
  regularization_method = "elastic_net",
  alpha_value = 0.5,
  cv_method = "cv_1se",
  cv_folds = 10,
  variable_selection = "none",
  stability_selection = FALSE,
  bootstrap_iterations = 500,
  stability_threshold = 0.8,
  show_regularization_path = TRUE,
  show_cv_plot = TRUE,
  show_variable_importance = TRUE,
  show_coefficients_table = TRUE,
  show_model_diagnostics = TRUE,
  showSummaries = FALSE,
  showExplanations = FALSE
)

Arguments

data

The data as a data frame

elapsedtime

Time variable for survival analysis

outcome

Event indicator variable

predictors

High-dimensional predictor variables

outcomeLevel

Level of outcome variable indicating event

regularization_method

Regularization method for high-dimensional Cox regression

alpha_value

Alpha parameter for elastic net (0=ridge, 1=lasso)

cv_method

Cross-validation method for lambda selection

cv_folds

Number of cross-validation folds

variable_selection

Additional variable selection method after regularization

stability_selection

Perform stability selection for variable importance

bootstrap_iterations

Number of bootstrap iterations for stability selection

stability_threshold

Threshold for stability selection

show_regularization_path

Display regularization path plot

show_cv_plot

Display cross-validation error plot

show_variable_importance

Display variable importance plot

show_coefficients_table

Display selected coefficients table

show_model_diagnostics

Display model diagnostic plots

showSummaries

Generate natural language summaries

showExplanations

Show methodology explanations
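To make the tuning arguments more concrete, the following sketch fits a cross-validated elastic net Cox model directly with glmnet on simulated data. It assumes that `alpha_value` and `cv_folds` correspond to glmnet's `alpha` and `nfolds`, and that `cv_method = "cv_1se"` corresponds to glmnet's `lambda.1se` rule. These mappings are assumptions and are not confirmed by this page. The simulated data and all variable names are hypothetical.

```r
library(glmnet)

set.seed(42)

# Simulated high-dimensional data: more predictors than observations.
n <- 100
p <- 200
x <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("gene", seq_len(p))))

# Only the first five predictors affect the hazard.
beta <- c(rep(1, 5), rep(0, p - 5))
time <- rexp(n, rate = exp(drop(x %*% beta)))
status <- rbinom(n, 1, 0.7)  # 1 = event, 0 = censored

# glmnet's Cox family accepts a two-column matrix named "time" and "status".
y <- cbind(time = time, status = status)

# alpha = 0.5 mixes the ridge (0) and LASSO (1) penalties.
cv_fit <- cv.glmnet(x, y, family = "cox", alpha = 0.5, nfolds = 10)

cv_fit$lambda.min  # lambda with the lowest cross-validated error
cv_fit$lambda.1se  # largest lambda within one standard error of that minimum

# Predictors with non-zero coefficients at the more conservative lambda.
coefs <- coef(cv_fit, s = "lambda.1se")
selected <- rownames(coefs)[coefs[, 1] != 0]
selected
```

The one-standard-error rule trades a small amount of cross-validated fit for a sparser model, which is often preferred when the goal is a short list of candidate predictors.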

Value

A results object containing:

results$todo: an HTML output
results$modelSummary: an HTML output
results$selectedVariables: a table
results$regularizationMetrics: a table
results$stabilityResults: a table
results$dimensionalityReduction: a table
results$regularizationPath: an image
results$cvPlot: an image
results$variableImportance: an image
results$modelDiagnostics: an image
results$stabilityPlot: an image
results$analysisSummary: an HTML output
results$methodExplanation: an HTML output

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$selectedVariables$asDF

as.data.frame(results$selectedVariables)

Examples

# Example 1: LASSO regularization for gene expression data
library(survival)
library(glmnet)

highdimcox(
    data = genomic_survival_data,
    elapsedtime = "time",
    outcome = "status",
    outcomeLevel = "1",
    predictors = c("gene1", "gene2", "gene3", "...gene1000"),  # "...gene1000" stands in for the remaining gene columns
    regularization_method = "lasso",
    cv_folds = 10
)

# Example 2: Elastic Net with stability selection
highdimcox(
    data = protein_data,
    elapsedtime = "survival_time",
    outcome = "event",
    outcomeLevel = "1",
    predictors = c("protein1", "protein2", "...protein500"),  # "...protein500" stands in for the remaining protein columns
    regularization_method = "elastic_net",
    alpha_value = 0.5,
    stability_selection = TRUE,
    bootstrap_iterations = 500
)
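
The stability selection options in Example 2 can be pictured with the sketch below, which refits an elastic net Cox model on bootstrap resamples and records how often each predictor receives a non-zero coefficient. It uses glmnet directly and is not the package's implementation. Resampling with replacement, the fixed reference lambda, and all names here are assumptions made for illustration. Only 50 resamples are used to keep the run short.

```r
library(glmnet)

set.seed(1)

# Simulated data; the first three predictors carry signal.
n <- 100
p <- 50
x <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("protein", seq_len(p))))
beta <- c(rep(1, 3), rep(0, p - 3))
time <- rexp(n, rate = exp(drop(x %*% beta)))
status <- rbinom(n, 1, 0.7)
y <- cbind(time = time, status = status)

# Reference lambda chosen once by cross-validation on the full data.
lambda_ref <- cv.glmnet(x, y, family = "cox", alpha = 0.5)$lambda.min

n_boot <- 50      # stands in for bootstrap_iterations
threshold <- 0.8  # stands in for stability_threshold

selected <- matrix(FALSE, n_boot, p, dimnames = list(NULL, colnames(x)))
for (b in seq_len(n_boot)) {
  idx <- sample(n, replace = TRUE)  # bootstrap resample of the rows
  fit <- glmnet(x[idx, ], y[idx, ], family = "cox", alpha = 0.5)
  # Record which coefficients are non-zero at the reference lambda.
  selected[b, ] <- coef(fit, s = lambda_ref)[, 1] != 0
}

# Proportion of resamples in which each predictor was selected.
selection_freq <- colMeans(selected)
stable_vars <- names(selection_freq)[selection_freq >= threshold]
stable_vars
```

Predictors that are selected in most resamples are less likely to be artifacts of one particular sample, which is the motivation for thresholding on selection frequency.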