Performs penalized Cox proportional hazards regression using regularization methods (LASSO, Ridge, Elastic Net) for high-dimensional survival data. This method is particularly useful when the number of variables is large relative to the number of observations, or when multicollinearity is present. Regularization shrinks coefficients toward zero, which reduces overfitting; the LASSO and Elastic Net penalties can also set coefficients exactly to zero, which performs variable selection.
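To make the three penalty types concrete, here is a minimal base R sketch of the Elastic Net penalty term. It assumes the parameterisation used by glmnet (the function name `elastic_net_penalty` and the example coefficients are illustrative, not part of this package); other software may scale the terms differently.

```r
# Elastic Net penalty, assuming the glmnet parameterisation:
#   lambda * ( alpha * sum(|beta|) + (1 - alpha) / 2 * sum(beta^2) )
# alpha = 1 gives the LASSO penalty, alpha = 0 gives the Ridge penalty.
elastic_net_penalty <- function(beta, lambda, alpha) {
  stopifnot(lambda >= 0, alpha >= 0, alpha <= 1)
  lambda * (alpha * sum(abs(beta)) + (1 - alpha) / 2 * sum(beta^2))
}

# Illustrative coefficient vector (made up for this sketch)
beta <- c(0.8, -0.3, 0)

lasso <- elastic_net_penalty(beta, lambda = 0.1, alpha = 1)
ridge <- elastic_net_penalty(beta, lambda = 0.1, alpha = 0)
enet  <- elastic_net_penalty(beta, lambda = 0.1, alpha = 0.5)
```

The penalty is added to the negative log partial likelihood, so a larger lambda pulls every coefficient harder toward zero.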
Usage
penalizedcox(
  data,
  elapsedtime = NULL,
  tint = FALSE,
  dxdate = NULL,
  fudate = NULL,
  timetypedata = "ymd",
  timetypeoutput = "months",
  outcome = NULL,
  outcomeLevel,
  predictors = NULL,
  penalty_type = "lasso",
  alpha = 0.5,
  lambda_selection = "1se",
  lambda_custom = 0.01,
  cv_folds = 10,
  cv_type = "deviance",
  variable_selection = TRUE,
  standardize = TRUE,
  include_intercept = FALSE,
  bootstrap_validation = FALSE,
  bootstrap_samples = 100,
  predict_risk = TRUE,
  survival_curves = FALSE,
  risk_groups = "3",
  coefficient_plot = TRUE,
  cv_plot = TRUE,
  variable_importance = FALSE,
  lambda_sequence = "",
  max_variables = 100,
  convergence_threshold = 1e-07,
  show_coefficients = TRUE,
  show_model_metrics = TRUE,
  show_lambda_path = FALSE,
  showSummaries = FALSE,
  showExplanations = FALSE,
  addRiskScore = FALSE,
  addRiskGroup = FALSE
)
Arguments
- data
- The dataset for analysis, provided as a data frame. Should contain survival variables and predictor variables for penalized regression. 
- elapsedtime
- The numeric variable representing follow-up time until the event or censoring. 
- tint
- If TRUE, survival time is calculated from the diagnosis and follow-up dates instead of being read from elapsedtime. 
- dxdate
- Date of diagnosis or start of follow-up. Required if tint = TRUE. 
- fudate
- Follow-up date or date of last observation. Required if tint = TRUE. 
- timetypedata
- Specifies the format of date variables in the input data. 
- timetypeoutput
- The units in which survival time is reported in the output. 
- outcome
- The outcome variable indicating event status (e.g., death, recurrence). 
- outcomeLevel
- The level of outcome considered as the event. 
- predictors
- Variables to include in the penalized Cox regression model. 
- penalty_type
- Type of penalty to apply. LASSO performs variable selection, Ridge shrinks coefficients, Elastic Net combines both. 
- alpha
- Mixing parameter for Elastic Net, between 0 and 1. alpha = 1 is LASSO, alpha = 0 is Ridge. Used only when penalty_type = "elastic_net". 
- lambda_selection
- Method for selecting the regularization parameter lambda. 
- lambda_custom
- Custom lambda value when lambda_selection = "custom". 
- cv_folds
- Number of folds for cross-validation to select optimal lambda. 
- cv_type
- Cross-validation error measure for model selection. 
- variable_selection
- Extract and display selected variables (non-zero coefficients). 
- standardize
- Whether to standardize predictor variables before fitting. 
- include_intercept
- Whether to include an intercept in the model. Cox models have no separate intercept because it is absorbed into the baseline hazard, so this is normally FALSE. 
- bootstrap_validation
- Perform bootstrap validation of the penalized model. 
- bootstrap_samples
- Number of bootstrap samples for validation. 
- predict_risk
- Calculate linear predictors (risk scores) for each observation. 
- survival_curves
- Generate survival curves stratified by risk score groups. 
- risk_groups
- Number of risk groups for survival curve stratification. 
- coefficient_plot
- Display coefficient paths showing shrinkage across lambda values. 
- cv_plot
- Display cross-validation error plot for lambda selection. 
- variable_importance
- Display variable importance based on coefficient magnitudes. 
- lambda_sequence
- Custom sequence of lambda values (comma-separated). If empty, glmnet will choose automatically. 
- max_variables
- Maximum number of variables to include in the model path. 
- convergence_threshold
- Convergence threshold for coordinate descent algorithm. 
- show_coefficients
- Display table of selected variables and their coefficients. 
- show_model_metrics
- Display model performance metrics including deviance and C-index. 
- show_lambda_path
- Display detailed information about lambda selection process. 
- showSummaries
- Display natural language summaries alongside tables and plots for interpretation of penalized Cox regression results. 
- showExplanations
- Display detailed explanations of penalized Cox regression methods and interpretation guidelines. 
- addRiskScore
- Add calculated linear predictor (risk score) as new variable to dataset. 
- addRiskGroup
- Add risk group classification as new variable to dataset. 
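Two of the derived quantities described above, follow-up time computed from dates (tint, dxdate, fudate) and risk groups cut from the linear predictor (risk_groups), can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: the helper names `followup_months` and `risk_group` are hypothetical, the month length of 30.4375 days is an assumed convention, and quantile-based cut points are one plausible way to form groups.

```r
# Follow-up time in months from two "ymd" date strings (hypothetical helper).
followup_months <- function(dxdate, fudate, format = "%Y-%m-%d") {
  dx <- as.Date(dxdate, format = format)
  fu <- as.Date(fudate, format = format)
  days <- as.numeric(difftime(fu, dx, units = "days"))
  days / 30.4375  # assumed average month length (365.25 / 12)
}

# Split a linear predictor into n groups at its quantiles (hypothetical helper).
# Note: cut() fails if tied values produce duplicate break points.
risk_group <- function(lp, n = 3) {
  breaks <- quantile(lp, probs = seq(0, 1, length.out = n + 1), na.rm = TRUE)
  cut(lp, breaks = breaks, include.lowest = TRUE,
      labels = paste0("G", seq_len(n)))
}

months <- followup_months("2020-01-15", "2021-01-15")

# Made-up risk scores for illustration
lp <- c(-1.2, -0.4, 0.1, 0.5, 0.9, 1.7)
groups <- risk_group(lp, n = 3)
```

With n = 3 the groups correspond to tertiles of the risk score, which matches the default risk_groups = "3" in spirit; the exact cut points the function uses may differ.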
Value
A results object containing:
| results$todo | an html |
| results$modelSummary | an html |
| results$coefficientTable | a table |
| results$performanceTable | a table |
| results$crossValidationSummary | an html |
| results$lambdaTable | a table |
| results$variableImportanceTable | a table |
| results$riskScoreTable | a table |
| results$riskGroupTable | a table |
| results$bootstrapTable | a table |
| results$coefficientPlot | an image |
| results$cvPlot | an image |
| results$importancePlot | an image |
| results$survivalPlot | an image |
| results$analysisSummary | an html |
| results$methodExplanation | an html |
| results$regularizationExplanation | an html |
| results$variableSelectionExplanation | an html |
| results$crossValidationExplanation | an html |
| results$performanceExplanation | an html |
| results$calculatedtime | an output |
| results$riskScoreOutput | an output |
| results$riskGroupOutput | an output |
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$coefficientTable$asDF
as.data.frame(results$coefficientTable)
Examples
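# Example 1: LASSO penalized Cox regression
# lung_data is assumed here to be a copy of survival::lung
# (status is coded 1 = censored, 2 = dead, hence outcomeLevel = "2")
library(survival)
library(glmnet)
lung_data <- survival::lung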
penalizedcox(
    data = lung_data,
    elapsedtime = "time",
    outcome = "status",
    outcomeLevel = "2",
    predictors = c("age", "sex", "ph.ecog", "ph.karno"),
    penalty_type = "lasso",
    cv_folds = 10
)
# Example 2: Elastic Net with variable selection
# genomic_data is a placeholder; substitute a data frame with these columns
penalizedcox(
    data = genomic_data,
    elapsedtime = "survival_time",
    outcome = "event",
    outcomeLevel = "1",
    predictors = c("gene1", "gene2", "gene3", "gene4"),
    penalty_type = "elastic_net",
    alpha = 0.5,
    lambda_selection = "1se",
    variable_selection = TRUE
)
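For comparison, a comparable LASSO Cox fit can be sketched with glmnet directly. This assumes the function relies on glmnet (as the lambda_sequence description suggests) and that arguments such as cv_folds, cv_type, standardize and convergence_threshold correspond to `nfolds`, `type.measure`, `standardize` and `thresh`; that mapping is an assumption, not something this page confirms.

```r
library(survival)
library(glmnet)

# glmnet does not accept missing values, so keep complete cases only.
vars <- c("age", "sex", "ph.ecog", "ph.karno")
d <- na.omit(survival::lung[, c("time", "status", vars)])

x <- as.matrix(d[, vars])
# lung codes status as 1 = censored, 2 = dead; glmnet expects 0/1.
y <- cbind(time = d$time, status = as.integer(d$status == 2))

set.seed(1)  # cross-validation folds are assigned at random
cvfit <- cv.glmnet(x, y, family = "cox", alpha = 1, nfolds = 10,
                   type.measure = "deviance", standardize = TRUE,
                   thresh = 1e-07)

# Coefficients at the "1se" lambda; non-zero rows are the selected variables.
beta <- as.matrix(coef(cvfit, s = "lambda.1se"))
selected <- rownames(beta)[beta[, 1] != 0]

# Linear predictor (risk score) for each observation.
risk <- as.numeric(predict(cvfit, newx = x, s = "lambda.1se", type = "link"))
```

The "1se" rule picks the largest lambda whose cross-validated error is within one standard error of the minimum, so it tends to keep fewer variables than the minimum-error lambda and may keep none at all on a small dataset.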