
Harrell's concordance index (C-index) for evaluating the discrimination ability of survival prediction models. The C-index is the proportion of all comparable patient pairs in which the model ranks survival times correctly; it generalizes the area under the ROC curve (AUC) to censored survival data. Values range from 0 to 1: 0.5 indicates no discrimination (chance-level ranking), 1.0 indicates perfect discrimination, and values below 0.5 indicate that the predictor ranks patients in the wrong direction. As a rule of thumb, 0.7-0.8 is considered good and >0.8 excellent. Unlike time-dependent ROC analysis, which evaluates discrimination at specific time points, the C-index provides a global measure across the entire follow-up period. The function supports time-dependent concordance for dynamic predictions, competing risks extensions, stratified analysis, and comparison between models. It is intended for validating Cox regression models, machine learning survival predictions, and prognostic scores. The C-index accounts for censoring through appropriate pair weighting and can be decomposed to identify which risk groups contribute most to discrimination.
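
The pair-counting idea behind the C-index can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper `harrell_c()` and the toy data are hypothetical, the loop is O(n^2), and pairs with tied times are simply treated as not comparable.

```r
# Harrell's C by brute-force pair counting (illustrative sketch only).
# time:  follow-up time
# event: 1 = event observed, 0 = censored
# risk:  predictor where higher values mean higher risk
harrell_c <- function(time, event, risk) {
  concordant <- 0
  tied <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    if (event[i] != 1) next              # the shorter time must be an observed event
    for (j in seq_len(n)) {
      if (time[i] < time[j]) {           # i had the event before j was last seen
        comparable <- comparable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1   # higher risk failed earlier
        } else if (risk[i] == risk[j]) {
          tied <- tied + 1               # tied predictors count as half
        }
      }
    }
  }
  (concordant + 0.5 * tied) / comparable
}

# Hypothetical toy data: 5 patients, 2 of them censored
time  <- c(5, 8, 12, 20, 30)
event <- c(1, 0, 1, 1, 0)
risk  <- c(0.9, 0.4, 0.3, 0.5, 0.1)

c_index <- harrell_c(time, event, risk)  # 6 of 7 comparable pairs are concordant
```

Only pairs whose shorter time is an observed event are counted, which is how censoring enters the unweighted Harrell estimate.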

Usage

concordanceindex(
  data,
  time,
  event,
  event_code,
  predictor,
  reverse_direction = FALSE,
  predictor_formula = "",
  cindex_method = "harrell",
  time_dependent = FALSE,
  evaluation_times = "12, 36, 60",
  confidence_intervals = FALSE,
  ci_method = "bootstrap",
  bootstrap_samples = 500,
  confidence_level = 0.95,
  compare_models = FALSE,
  additional_predictors,
  model_names = "Model 1, Model 2, Model 3",
  compare_test = FALSE,
  competing_risks = FALSE,
  cause_specific = FALSE,
  decompose_cindex = FALSE,
  risk_groups = 3,
  somers_d = FALSE,
  goodman_kruskal_gamma = FALSE,
  stratified_cindex = FALSE,
  stratify_by,
  plot_cindex_over_time = FALSE,
  plot_model_comparison = FALSE,
  plot_risk_group_kaplan_meier = FALSE,
  plot_decomposition = FALSE,
  clinical_application = "general",
  show_interpretation = TRUE,
  external_validation = FALSE,
  restricted_time = FALSE,
  max_time = 60,
  handle_ties = "average",
  missing_handling = "complete",
  random_seed = 123
)

Arguments

data

The data as a data frame.

time

Time-to-event or censoring variable (days, months, years). For time-dependent concordance, this is the follow-up time.

event

Binary event indicator (1 = event, 0 = censored). For competing risks, this can be a multi-level factor with different event types.

event_code

Which level represents the event of interest (e.g., death, progression).

predictor

Continuous predictor variable used for ranking patients. Can be a risk score, linear predictor from Cox model, or predicted probability. Higher values should indicate higher risk (or use reverse_direction option).

reverse_direction

If predictor is protective (lower values = higher risk), reverse the direction for concordance calculation. Example: survival probability where lower values indicate worse prognosis.

predictor_formula

Formula for a Cox model, used if the predictor is not provided directly. Example: "~ age + stage + grade". The model is fitted and its linear predictor is used as the predictor.
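
As a rough illustration of the "fit a Cox model, then use its linear predictor" idea, the sketch below uses the survival package directly. It is an assumption about the general approach, not a description of what concordanceindex() does internally; the `lung` data set and the covariates chosen here are for illustration only.

```r
library(survival)

# `lung` ships with the survival package; status is coded 1 = censored, 2 = dead.
fit <- coxph(Surv(time, status == 2) ~ age + sex + ph.ecog, data = lung)

# Linear predictor: higher values mean a higher estimated hazard,
# so it can be used directly as a risk score.
lp <- predict(fit, type = "lp")

# Concordance of the fitted model as reported by survival::concordance()
cfit <- concordance(fit)
c_index <- cfit$concordance
```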

cindex_method

Method for C-index calculation. Harrell's is the standard method; Uno's uses inverse probability of censoring weighting; Gönen-Heller is a bias-free estimator for proportional hazards models (it does not require follow-up information).

time_dependent

Calculate time-dependent C-index at specific time horizons. Evaluates discrimination for predictions at particular time points rather than globally across follow-up.

evaluation_times

Comma-separated time points for time-dependent C-index evaluation. Example: "12, 24, 36, 60" for 1, 2, 3, and 5 years in months.
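
A comma-separated string such as the default can be turned into a numeric vector with base R. This is only a sketch of one plausible parsing step; how the function parses the string internally is not documented here.

```r
evaluation_times <- "12, 36, 60"

# Split on commas, strip whitespace, and convert to numeric
times <- as.numeric(trimws(strsplit(evaluation_times, ",", fixed = TRUE)[[1]]))

# Keep valid, positive, unique time points in increasing order
times <- sort(unique(times[!is.na(times) & times > 0]))
```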

confidence_intervals

Calculate confidence intervals for C-index using bootstrap or asymptotic standard errors.

ci_method

Method for confidence interval estimation. Bootstrap is more accurate but computationally intensive.

bootstrap_samples

Number of bootstrap samples for CI estimation.

confidence_level

Confidence level for interval estimation.
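
A percentile bootstrap interval for the C-index might look like the following base R sketch. The helper `harrell_c()`, the simulated data, and the choice of a percentile interval are all assumptions made for illustration; the package may compute its intervals differently.

```r
# Vectorised Harrell's C: pair (i, j) is comparable when i has an observed
# event and a strictly shorter time than j (illustrative helper).
harrell_c <- function(time, event, risk) {
  comparable <- outer(time, time, "<") & (event == 1)
  risk_diff  <- outer(risk, risk, "-")
  sum(comparable * ((risk_diff > 0) + 0.5 * (risk_diff == 0))) / sum(comparable)
}

# Simulated data (hypothetical): higher risk leads to shorter survival
set.seed(123)
n <- 80
risk      <- rnorm(n)
true_time <- rexp(n, rate = exp(0.8 * risk))
cens_time <- rexp(n, rate = 0.3)
time      <- pmin(true_time, cens_time)
event     <- as.integer(true_time <= cens_time)

bootstrap_samples <- 500
confidence_level  <- 0.95

# Resample patients with replacement and recompute the C-index each time
boot_c <- replicate(bootstrap_samples, {
  idx <- sample.int(n, replace = TRUE)
  harrell_c(time[idx], event[idx], risk[idx])
})

alpha    <- 1 - confidence_level
ci       <- quantile(boot_c, c(alpha / 2, 1 - alpha / 2), names = FALSE)
estimate <- harrell_c(time, event, risk)
```

Setting the seed before resampling makes the interval reproducible.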

compare_models

Compare C-index across multiple prediction models to identify best discriminating model.

additional_predictors

Additional predictor variables (risk scores) for model comparison.

model_names

Comma-separated names for models being compared.

compare_test

Perform statistical test for differences in C-index between models.

competing_risks

Use competing risks framework for C-index calculation when multiple event types can occur. Requires cause-specific hazard approach.

cause_specific

For competing risks, calculate cause-specific C-index for event of interest treating other events as censoring.

decompose_cindex

Decompose C-index to show contribution from different risk strata. Identifies which patient groups contribute most to discrimination.

risk_groups

Number of risk groups for C-index decomposition (e.g., 3 = low/medium/high, 4 = quartiles, 10 = deciles).
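
Splitting a continuous predictor into equally sized risk groups can be sketched with quantile breaks in base R. The simulated predictor and the group labels are hypothetical, and quantile-based cut points are an assumption suggested by the "quartiles" and "deciles" wording above.

```r
set.seed(123)
predictor   <- rnorm(200)   # hypothetical continuous risk score
risk_groups <- 3            # 3 = low / medium / high

# Cut points at equally spaced quantiles of the predictor
breaks <- quantile(predictor, probs = seq(0, 1, length.out = risk_groups + 1))

# include.lowest = TRUE keeps the minimum value inside the first group
group <- cut(predictor, breaks = breaks, include.lowest = TRUE,
             labels = paste0("G", seq_len(risk_groups)))

table(group)
```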

somers_d

Calculate Somers' D rank correlation (D = 2*(C-index - 0.5)). Ranges from -1 to +1, interpretable as correlation between predictor and outcome.
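
The relation between Somers' D and the C-index stated above can be checked with a small helper; `somers_d()` is a hypothetical name used only for this illustration.

```r
# Somers' D as a linear rescaling of the C-index
somers_d <- function(c_index) 2 * (c_index - 0.5)

d_values <- somers_d(c(0, 0.5, 0.75, 1))
# C = 0    -> D = -1   (perfectly reversed ranking)
# C = 0.5  -> D =  0   (no discrimination)
# C = 0.75 -> D =  0.5
# C = 1    -> D =  1   (perfect discrimination)
```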

goodman_kruskal_gamma

Calculate Goodman-Kruskal gamma statistic for ordinal association. Related to C-index but uses different pair weighting.

stratified_cindex

Calculate C-index stratified by important subgroups to assess consistency across populations.

stratify_by

Variable for stratified analysis (e.g., treatment arm, center, age group).

plot_cindex_over_time

Plot time-dependent C-index across follow-up period. Shows how discrimination changes with prediction horizon.

plot_model_comparison

Bar plot comparing C-index across multiple models with confidence intervals.

plot_risk_group_kaplan_meier

Display Kaplan-Meier curves stratified by risk groups defined by predictor quantiles. Visual separation between the curves indicates discrimination.

plot_decomposition

Visualize contribution of different risk strata to overall C-index.

clinical_application

Clinical application context for interpretation guidance.

show_interpretation

Provide interpretation of C-index with clinical context and recommendations.

external_validation

Indicate that this is an external validation (the model was developed in a different cohort).

restricted_time

Restrict C-index calculation to specific follow-up period to avoid issues with long-term censoring.

max_time

Maximum follow-up time for restricted C-index calculation. Pairs beyond this time are not considered.
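
One way to restrict the calculation is to count only pairs whose earlier event occurs at or before max_time. The sketch below assumes that reading; `restricted_c()` and the toy data are hypothetical, and the package's exact rule for excluding pairs may differ.

```r
# Harrell-type C restricted to pairs whose earlier (event) time is <= max_time
restricted_c <- function(time, event, risk, max_time) {
  # Row i is the earlier patient: must have an observed event by max_time
  comparable <- outer(time, time, "<") & (event == 1) & (time <= max_time)
  risk_diff  <- outer(risk, risk, "-")
  sum(comparable * ((risk_diff > 0) + 0.5 * (risk_diff == 0))) / sum(comparable)
}

# Hypothetical toy data: 5 patients, 2 of them censored
time  <- c(5, 8, 12, 20, 30)
event <- c(1, 0, 1, 1, 0)
risk  <- c(0.9, 0.4, 0.3, 0.5, 0.1)

c_unrestricted <- restricted_c(time, event, risk, max_time = Inf)  # 6 of 7 pairs
c_restricted   <- restricted_c(time, event, risk, max_time = 15)   # 5 of 6 pairs
```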

handle_ties

Method for handling tied predictor values in concordance calculation.

missing_handling

Method for handling missing predictor or outcome data.

random_seed

Random seed for bootstrap and other stochastic procedures.

Value

A results object containing:

results$instructions          an html
results$cindexSummary         Overall C-index with confidence intervals
results$somersD               Somers' D rank correlation
results$timeDependentCindex   C-index evaluated at specific time points
results$modelComparison       Comparison of C-index across multiple models
results$pairwiseTests         Statistical tests comparing C-index between models
results$decomposition         Contribution of risk strata to overall C-index
results$stratifiedCindex      C-index by subgroup
results$cindexOverTimePlot    Time-dependent C-index across follow-up
results$modelComparisonPlot   Bar plot comparing C-index across models
results$riskGroupKMPlot       Survival curves stratified by predictor quantiles
results$decompositionPlot     Contribution of risk strata to discrimination
results$interpretation        an html

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$cindexSummary$asDF

as.data.frame(results$cindexSummary)

Examples

result <- concordanceindex(
    data = validation_data,
    time = "follow_up_months",
    event = "death",
    predictor = "risk_score"
)
#> Error in concordanceindex(data = validation_data, time = "follow_up_months",     event = "death", predictor = "risk_score"): argument "stratify_by" is missing, with no default