Performs penalized Cox proportional hazards regression using regularization methods (LASSO, Ridge, Elastic Net) for high-dimensional survival data. This method is particularly useful when the number of variables is large relative to the number of observations, or when multicollinearity is present. The regularization helps with variable selection and prevents overfitting.
Usage
penalizedcox(
  data,
  elapsedtime = NULL,
  tint = FALSE,
  dxdate = NULL,
  fudate = NULL,
  timetypedata = "ymd",
  timetypeoutput = "months",
  outcome = NULL,
  outcomeLevel,
  predictors = NULL,
  penalty_type = "lasso",
  alpha = 0.5,
  lambda_selection = "1se",
  lambda_custom = 0.01,
  cv_folds = 10,
  cv_type = "deviance",
  variable_selection = TRUE,
  standardize = TRUE,
  include_intercept = FALSE,
  bootstrap_validation = FALSE,
  bootstrap_samples = 100,
  predict_risk = TRUE,
  survival_curves = FALSE,
  risk_groups = "3",
  coefficient_plot = TRUE,
  cv_plot = TRUE,
  variable_importance = FALSE,
  lambda_sequence = "",
  max_variables = 100,
  convergence_threshold = 1e-07,
  show_coefficients = TRUE,
  show_model_metrics = TRUE,
  show_lambda_path = FALSE,
  showSummaries = FALSE,
  showExplanations = FALSE,
  addRiskScore = FALSE,
  addRiskGroup = FALSE
)
Arguments
- data
The dataset for analysis, provided as a data frame. Should contain survival variables and predictor variables for penalized regression.
- elapsedtime
The numeric variable representing follow-up time until the event or censoring.
- tint
If TRUE, survival time is calculated from the diagnosis and follow-up dates instead of being read from elapsedtime.
- dxdate
Date of diagnosis or start of follow-up. Required if tint = TRUE.
- fudate
Follow-up date or date of last observation. Required if tint = TRUE.
- timetypedata
Specifies the format of date variables in the input data.
- timetypeoutput
The units in which survival time is reported in the output.
- outcome
The outcome variable indicating event status (e.g., death, recurrence).
- outcomeLevel
The level of outcome considered as the event.
- predictors
Variables to include in the penalized Cox regression model.
- penalty_type
Type of penalty to apply. LASSO performs variable selection, Ridge shrinks coefficients, Elastic Net combines both.
- alpha
Mixing parameter for Elastic Net. Alpha=1 is LASSO, alpha=0 is Ridge. Used only when penalty_type = "elastic_net".
- lambda_selection
Method for selecting the regularization parameter lambda.
- lambda_custom
Custom lambda value when lambda_selection = "custom".
- cv_folds
Number of folds for cross-validation to select optimal lambda.
- cv_type
Cross-validation error measure for model selection.
- variable_selection
Extract and display selected variables (non-zero coefficients).
- standardize
Whether to standardize predictor variables before fitting.
- include_intercept
Whether to include an intercept in the model. Usually FALSE, because a Cox model has no separate intercept term: it is absorbed into the baseline hazard.
- bootstrap_validation
Perform bootstrap validation of the penalized model.
- bootstrap_samples
Number of bootstrap samples for validation.
- predict_risk
Calculate linear predictors (risk scores) for each observation.
- survival_curves
Generate survival curves stratified by risk score groups.
- risk_groups
Number of risk groups for survival curve stratification.
- coefficient_plot
Display coefficient paths showing shrinkage across lambda values.
- cv_plot
Display cross-validation error plot for lambda selection.
- variable_importance
Display variable importance based on coefficient magnitudes.
- lambda_sequence
Custom sequence of lambda values (comma-separated). If empty, glmnet will choose automatically.
- max_variables
Maximum number of variables to include in the model path.
- convergence_threshold
Convergence threshold for coordinate descent algorithm.
- show_coefficients
Display table of selected variables and their coefficients.
- show_model_metrics
Display model performance metrics including deviance and C-index.
- show_lambda_path
Display detailed information about lambda selection process.
- showSummaries
Display natural language summaries alongside tables and plots for interpretation of penalized Cox regression results.
- showExplanations
Display detailed explanations of penalized Cox regression methods and interpretation guidelines.
- addRiskScore
Add calculated linear predictor (risk score) as new variable to dataset.
- addRiskGroup
Add risk group classification as new variable to dataset.
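The penalty and lambda arguments above can be illustrated directly with glmnet, which the lambda_sequence description names as the underlying fitting engine. The following is a minimal sketch, not the function's actual implementation: the helper penalty_to_alpha() is hypothetical, the option string "ridge" is assumed, and the mapping of cv_folds and cv_type to nfolds and type.measure is an assumption.

```r
library(survival)  # provides the lung example data
library(glmnet)

# Hypothetical helper: translate penalty_type into glmnet's alpha.
# alpha = 1 is LASSO, alpha = 0 is Ridge; the user-supplied alpha is
# only used for Elastic Net. The "ridge" option name is assumed.
penalty_to_alpha <- function(penalty_type, alpha = 0.5) {
  switch(penalty_type,
    lasso = 1,
    ridge = 0,
    elastic_net = alpha,
    stop("unknown penalty_type: ", penalty_type)
  )
}

set.seed(42)  # cross-validation folds are assigned at random
vars <- c("age", "sex", "ph.ecog", "ph.karno")
d <- na.omit(survival::lung[, c("time", "status", vars)])

x <- as.matrix(d[, vars])
# Two-column response expected by family = "cox"; status 2 is the event
y <- cbind(time = d$time, status = as.numeric(d$status == 2))

cv_fit <- cv.glmnet(
  x, y,
  family = "cox",
  alpha = penalty_to_alpha("lasso"),
  nfolds = 10,                # assumed counterpart of cv_folds
  type.measure = "deviance",  # assumed counterpart of cv_type
  standardize = TRUE
)

# "min" picks the lambda with the lowest cross-validated error;
# "1se" picks the largest lambda within one standard error of it,
# which gives a sparser model.
cv_fit$lambda.min
cv_fit$lambda.1se

# Selected variables are those with non-zero coefficients
b <- as.matrix(coef(cv_fit, s = "lambda.1se"))
selected <- rownames(b)[b[, 1] != 0]
selected
```

Because lambda.1se applies stronger shrinkage than lambda.min, it can leave very few (or no) predictors in a small model such as this one; switching to s = "lambda.min" retains more variables.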
Value
A results object containing:
results$todo | a html
results$modelSummary | a html
results$coefficientTable | a table
results$performanceTable | a table
results$crossValidationSummary | a html
results$lambdaTable | a table
results$variableImportanceTable | a table
results$riskScoreTable | a table
results$riskGroupTable | a table
results$bootstrapTable | a table
results$coefficientPlot | an image
results$cvPlot | an image
results$importancePlot | an image
results$survivalPlot | an image
results$analysisSummary | a html
results$methodExplanation | a html
results$regularizationExplanation | a html
results$variableSelectionExplanation | a html
results$crossValidationExplanation | a html
results$performanceExplanation | a html
results$calculatedtime | an output
results$riskScoreOutput | an output
results$riskGroupOutput | an output
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$coefficientTable$asDF
as.data.frame(results$coefficientTable)
Examples
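# Example 1: LASSO penalized Cox regression
library(survival)
library(glmnet)
# Assumption: lung_data is a copy of survival::lung, whose columns match
# the variable names used below (status 2 = event)
lung_data <- survival::lung
penalizedcox(
  data = lung_data,
  elapsedtime = "time",
  outcome = "status",
  outcomeLevel = "2",
  predictors = c("age", "sex", "ph.ecog", "ph.karno"),
  penalty_type = "lasso",
  cv_folds = 10
)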
# Example 2: Elastic Net with variable selection
# genomic_data is a placeholder for a user-supplied data frame with
# columns survival_time, event, and gene1 to gene4
penalizedcox(
  data = genomic_data,
  elapsedtime = "survival_time",
  outcome = "event",
  outcomeLevel = "1",
  predictors = c("gene1", "gene2", "gene3", "gene4"),
  penalty_type = "elastic_net",
  alpha = 0.5,
  lambda_selection = "1se",
  variable_selection = TRUE
)
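The risk score and risk group outputs (predict_risk, risk_groups, addRiskScore, addRiskGroup) can be pictured with the sketch below, which computes linear predictors from a glmnet Cox fit and splits them into three groups. This is an illustration under stated assumptions, not the function's documented behavior: the rank-based, equal-size grouping rule and the group labels are assumptions, since the source does not say how the cut points are chosen.

```r
library(survival)  # provides the lung example data
library(glmnet)

set.seed(42)  # cross-validation folds are assigned at random
vars <- c("age", "sex", "ph.ecog", "ph.karno")
d <- na.omit(survival::lung[, c("time", "status", vars)])

x <- as.matrix(d[, vars])
y <- cbind(time = d$time, status = as.numeric(d$status == 2))

cv_fit <- cv.glmnet(x, y, family = "cox", alpha = 1, nfolds = 10)

# Linear predictor (log relative hazard) for each observation;
# higher values indicate higher estimated risk
risk_score <- as.numeric(
  predict(cv_fit, newx = x, s = "lambda.min", type = "link")
)

# Assumed grouping rule: three roughly equal-size groups by rank.
# Ranking with ties broken by order keeps the split well defined even
# if many observations share the same score.
n_groups <- 3
risk_group <- cut(
  rank(risk_score, ties.method = "first"),
  breaks = n_groups,
  labels = c("Low", "Intermediate", "High")
)

# Attach both as new columns, as addRiskScore and addRiskGroup describe
d$risk_score <- risk_score
d$risk_group <- risk_group
table(d$risk_group)
```

Grouping on ranks rather than on raw quantile cut points avoids an error from duplicated break values when the penalty shrinks most coefficients to zero and many scores coincide.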