Skip to contents

Performs LASSO-penalized logistic regression for variable selection in binary classification problems. Ideal for diagnostic pathology studies that build classifiers (e.g., tumor type A vs B) with automatic feature selection.

Usage

lassologistic(
  data,
  outcome,
  outcomeLevel,
  explanatory,
  penalty = "lasso",
  alpha = 0.5,
  lambda = "lambda.1se",
  nfolds = 10,
  random_seed = 123456,
  standardize = TRUE,
  suitabilityCheck = TRUE,
  bootstrapValidation = FALSE,
  bootstrapN = 200,
  cv_plot = TRUE,
  coef_plot = TRUE,
  roc_plot = TRUE,
  scoringSystem = FALSE,
  scoringMethod = "schneeweiss",
  scoringMaxPoints = 10,
  scoreLookupTable = TRUE,
  showSummary = FALSE,
  showExplanations = FALSE,
  showMethodologyNotes = FALSE,
  includeClinicalGuidance = FALSE,
  showVariableImportance = FALSE,
  showModelComparison = FALSE
)

Arguments

data

The data as a data frame.

outcome

Binary outcome variable to classify. Can be factor or numeric with exactly two observed values.

outcomeLevel

Level of outcome considered as the positive class (event). For factor outcomes, if left empty the second level is used.

explanatory

Variables to be considered for selection. At least 2 required. Constant variables are removed automatically.

penalty

Type of regularization penalty. LASSO (L1) produces sparse models by setting coefficients to zero. Ridge (L2) shrinks but retains all. Elastic Net combines both.

alpha

Mixing parameter for elastic net. Only used when penalty is elasticnet.

lambda

Method for selecting the optimal lambda from cross-validation. 1SE rule tends to select fewer variables (more conservative).

nfolds

Number of folds for cross-validation. Automatically reduced when sample size is limited.

random_seed

Random seed for reproducible cross-validation fold assignment.

standardize

Whether to standardize predictor variables before fitting.

suitabilityCheck

Check sample size, events-per-variable ratio, multicollinearity, and whether regularization is needed.

bootstrapValidation

Perform bootstrap optimism-corrected validation (Harrell method). Reports corrected AUC, calibration slope, and Brier score.

bootstrapN

Number of bootstrap iterations for internal validation.

cv_plot

Show cross-validation deviance vs lambda plot.

coef_plot

Show bar plot of selected variable coefficients.

roc_plot

Show ROC curve with AUC and confidence interval.

scoringSystem

Convert model coefficients to an integer-point scoring system. Multiple methods available following published standards.

scoringMethod

Method for converting regression coefficients to integer points. Beta10: multiply by 10, round (Zhang et al. 2017). Schneeweiss: divide by smallest absolute coefficient (Mehta et al. 2016). Sullivan: reference-variable scaling (Sullivan et al. 2004, Framingham method). Compare: generate all three and show performance comparison.

scoringMaxPoints

Maximum points assigned to the strongest predictor (Beta10 method).

scoreLookupTable

Generate a lookup table mapping each possible total score to predicted probability of the positive class.

showSummary

Display a natural-language summary of the main results.

showExplanations

Display explanations of LASSO logistic regression methodology.

showMethodologyNotes

Show technical notes about regularization and validation.

includeClinicalGuidance

Include guidance for clinical interpretation of results.

showVariableImportance

Display variable importance rankings across lambda values.

showModelComparison

Compare LASSO vs standard logistic using selected variables.

Value

A results object containing:

results$todoa html
results$suitabilityReporta html
results$modelSummarya table
results$coefficientsa table
results$performancea table
results$scoringTablea table
results$scoringPerformancea table
results$methodComparisona table
results$lookupTablea table
results$validationTablea table
results$cv_plotan image
results$coef_plotan image
results$roc_plotan image
results$predictionsan output
results$summaryTexta html
results$lassoExplanationa html
results$methodologyNotesa html
results$clinicalGuidancea html
results$variableImportancea table
results$modelComparisona table

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$modelSummary$asDF

as.data.frame(results$modelSummary)

Examples

# \donttest{
lassologistic(data = data, outcome = "diagnosis",
    outcomeLevel = "PanNEC", explanatory = vars(p53, Rb1, Ki67, SSTR2A))
#> Error: Argument 'data' must be a data frame
# }