
Performs univariate survival analysis for multiple features to identify potential prognostic factors. This analysis runs a separate Cox proportional hazards model for each selected feature and ranks them by statistical significance, hazard ratio, or concordance index. Useful for biomarker screening, exploratory analysis, and feature selection before building multivariable models.
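
For orientation, the per-feature procedure described above can be approximated directly with the survival package. This is a minimal sketch under stated assumptions, not the function's actual implementation: the helper name fit_one, the restriction to numeric features, and the choice to subset colon to the death endpoint (etype == 2) are all introduced here for illustration.

```r
library(survival)

# colon holds two rows per patient (recurrence and death);
# keeping only the death endpoint is an assumption made for this sketch
deaths <- colon[colon$etype == 2, ]

# Hypothetical helper: fit one univariate Cox model and pull out key statistics
fit_one <- function(feature, data) {
  form <- as.formula(paste("Surv(time, status) ~", feature))
  s <- summary(coxph(form, data = data))
  data.frame(
    feature = feature,
    hr      = s$conf.int[1, "exp(coef)"],
    lower   = s$conf.int[1, "lower .95"],
    upper   = s$conf.int[1, "upper .95"],
    p       = s$coefficients[1, "Pr(>|z|)"],
    cindex  = unname(s$concordance["C"]),
    row.names = NULL
  )
}

features <- c("age", "nodes", "sex", "obstruct")  # numeric columns only
res <- do.call(rbind, lapply(features, fit_one, data = deaths))
res <- res[order(res$p), ]  # rank by p-value, smallest first
print(res)
```

Each row of res corresponds to one separate Cox model, which is why the results are best read as a screening step rather than as adjusted effects.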

Usage

survivalfeaturerank(
  data,
  survtime = NULL,
  event = NULL,
  eventLevel,
  features = NULL,
  rankBy = "pvalue",
  showCI = TRUE,
  adjustPValues = FALSE,
  adjustMethod = "fdr",
  showForestPlot = TRUE,
  forestStyle = "standard",
  showTopKM = FALSE,
  topN = 5,
  kmLayout = "separate",
  endplot = 60,
  byplot = 12,
  risktable = FALSE,
  pplot = TRUE,
  showSummary = TRUE,
  alphaLevel = 0.05,
  showFullTable = TRUE
)

Arguments

data

The dataset to be analyzed, provided as a data frame.

survtime

The numeric variable representing follow-up time until the event or censoring.

event

The event indicator variable (e.g., death, recurrence). Should be coded as 1 = event, 0 = censored, or as a factor where one level represents the event.

eventLevel

The level of the event variable that represents the event of interest. For numeric variables, typically "1". For factors, select the appropriate level.
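
One way to picture how event and eventLevel interact: the chosen level is compared against the event column to produce a 0/1 indicator. The sketch below uses invented toy values; how the function handles this internally is not documented here and may differ.

```r
# Toy event column coded as a factor (values invented for illustration)
status <- factor(c("Alive", "Dead", "Dead", "Alive", "Dead"))
event_level <- "Dead"  # plays the role of eventLevel

# 1 where the observation matches the event level, 0 otherwise (censored)
event_indicator <- as.integer(status == event_level)
print(event_indicator)

# The same comparison also covers numeric 0/1 coding with eventLevel = "1"
numeric_status <- c(0, 1, 1, 0, 1)
print(as.integer(numeric_status == "1"))
```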

features

Variables to be tested individually for association with survival. Each variable will be tested in a separate univariate Cox model. Can include both categorical and continuous variables.

rankBy

Criterion for ranking features:
- pvalue: Sorts by statistical significance (smallest p-value first)
- hazard: Sorts by effect size (hazard ratio furthest from 1)
- cindex: Sorts by discriminative ability (highest C-index first)
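
The three criteria can be sketched as orderings over a results table. The table below is invented toy data, rank_features is a hypothetical helper, and measuring "furthest from 1" as the absolute log hazard ratio is an assumption (it treats HR = 0.5 and HR = 2 as equally strong); the function itself may measure distance differently.

```r
# Toy results table; all numbers are invented for illustration
res <- data.frame(
  feature = c("a", "b", "c"),
  hr      = c(0.50, 1.20, 1.80),
  p       = c(0.030, 0.200, 0.001),
  cindex  = c(0.60, 0.64, 0.58)
)

# Hypothetical helper mirroring the three rankBy options
rank_features <- function(res, rank_by = c("pvalue", "hazard", "cindex")) {
  rank_by <- match.arg(rank_by)
  ord <- switch(rank_by,
    pvalue = order(res$p),                                # smallest p first
    hazard = order(abs(log(res$hr)), decreasing = TRUE),  # largest |log HR| first
    cindex = order(res$cindex, decreasing = TRUE)         # highest C-index first
  )
  res[ord, ]
}

print(rank_features(res, "pvalue")$feature)
print(rank_features(res, "hazard")$feature)
print(rank_features(res, "cindex")$feature)
```

The same features can land in a different order under each criterion, so the choice of rankBy should follow the question being asked (evidence of association, size of effect, or discrimination).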

showCI

If true, displays 95% confidence intervals for the hazard ratios in the results table.

adjustPValues

If true, applies multiple testing correction to p-values. Recommended when testing many features to control false discovery rate.

adjustMethod

Method for p-value adjustment. FDR (Benjamini-Hochberg) is recommended for exploratory analysis. Bonferroni is most conservative.
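
To see how the two named corrections differ, base R's p.adjust can be applied to a vector of raw p-values. The p-values below are invented, and it is an assumption that the function's adjustment behaves like p.adjust; the sketch only illustrates the general behavior of the methods.

```r
# Invented raw p-values from five hypothetical univariate models
p_raw <- c(0.001, 0.008, 0.02, 0.045, 0.30)

p_fdr  <- p.adjust(p_raw, method = "fdr")         # Benjamini-Hochberg
p_bonf <- p.adjust(p_raw, method = "bonferroni")  # multiplies by the number of tests, capped at 1

print(data.frame(p_raw, p_fdr, p_bonf))

# Count how many features stay below a 0.05 threshold under each approach
alpha <- 0.05
print(c(raw = sum(p_raw < alpha), fdr = sum(p_fdr < alpha), bonferroni = sum(p_bonf < alpha)))
```

Bonferroni-adjusted values are never smaller than the Benjamini-Hochberg ones, which is why it retains fewer features as the number of tests grows.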

showForestPlot

If true, generates a forest plot showing hazard ratios and confidence intervals for all features, sorted by the ranking criterion.

forestStyle

Visual style for the forest plot.

showTopKM

If true, generates Kaplan-Meier survival curves for the top-ranked features. Useful for visualizing the survival differences for the most important predictors.

topN

Number of top-ranked features for which to generate Kaplan-Meier plots. Only used if showTopKM is TRUE (the "Show Kaplan-Meier Plots for Top Features" option).

kmLayout

Layout for multiple Kaplan-Meier plots. Separate shows one plot per feature; faceted shows all plots in a grid.

endplot

Maximum follow-up time to display on Kaplan-Meier plots.

byplot

Interval for time axis labels on plots.

risktable

If true, displays the number at risk below Kaplan-Meier plots.

pplot

If true, displays log-rank test p-values on Kaplan-Meier plots.
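
The quantities behind the Kaplan-Meier options (time-axis limit, tick interval, numbers at risk, log-rank p-value) can be reproduced with the survival package. This is a sketch only: the variable names end_time and tick_by are hypothetical stand-ins for endplot and byplot, the values are chosen because colon records time in days, and the subset to the death endpoint is an assumption.

```r
library(survival)

deaths <- colon[colon$etype == 2, ]  # one row per patient (assumption)

end_time <- 1800  # analogous to endplot, in days
tick_by  <- 365   # analogous to byplot, in days

# Kaplan-Meier estimate stratified by one feature
km <- survfit(Surv(time, status) ~ sex, data = deaths)

# Log-rank test comparing the groups
lr <- survdiff(Surv(time, status) ~ sex, data = deaths)
p_logrank <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)

# Number at risk per group at each axis tick
ticks <- seq(0, end_time, by = tick_by)
at_risk <- summary(km, times = ticks)
risk_table <- data.frame(
  strata = at_risk$strata,
  time   = at_risk$time,
  n.risk = at_risk$n.risk
)

print(risk_table)
print(p_logrank)
```

Note that endplot and byplot must be given in the same time unit as survtime; the defaults of 60 and 12 suit follow-up recorded in months.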

showSummary

If true, displays summary statistics including total number of features tested, number of significant features, and interpretation guidance.

alphaLevel

Significance level for highlighting significant features in the results table.

showFullTable

If true, shows results for all tested features. If false, shows only statistically significant features.

Value

A results object containing:

results$instructions        a html
results$summaryText         a preformatted
results$rankingTable        a table
results$forestPlot          an image
results$topFeaturesHeading  a html
results$kmPlot1             an image
results$kmPlot2             an image
results$kmPlot3             an image
results$kmPlot4             an image
results$kmPlot5             an image
results$kmPlot6             an image
results$kmPlot7             an image
results$kmPlot8             an image
results$kmPlot9             an image
results$kmPlot10            an image
results$exportRanking       an output
results$interpretation      a html

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$rankingTable$asDF

as.data.frame(results$rankingTable)

Examples

# Example 1: Basic feature ranking
library(survival)
data(cancer, package = "survival")  # makes colon (and related datasets) available

survivalfeaturerank(
    data = colon,
    survtime = "time",
    event = "status",
    eventLevel = "1",
    features = c("sex", "obstruct", "perfor", "adhere", "nodes", "differ"),
    rankBy = "pvalue"
)

# Example 2: Rank by hazard ratio with plots
survivalfeaturerank(
    data = colon,
    survtime = "time",
    event = "status",
    eventLevel = "1",
    features = c("sex", "obstruct", "perfor", "age", "nodes"),
    rankBy = "hazard",
    showForestPlot = TRUE,
    showTopKM = TRUE,
    topN = 3
)

# Example 3: Rank by C-index for predictive power
survivalfeaturerank(
    data = colon,
    survtime = "time",
    event = "status",
    eventLevel = "1",
    features = c("age", "nodes", "extent", "surg"),
    rankBy = "cindex",
    showCI = TRUE,
    adjustPValues = TRUE,
    adjustMethod = "fdr"
)