Usage

biomarkerdiscovery(
  data,
  outcome_var,
  biomarker_vars,
  clinical_vars,
  batch_var,
  patient_id,
  time_var,
  event_var,
  discovery_method = "elastic_net",
  outcome_type = "binary",
  data_type = "genomics",
  data_preprocessing = TRUE,
  normalization_method = "z_score",
  batch_correction = FALSE,
  batch_method = "combat",
  filter_low_variance = TRUE,
  variance_threshold = 0.1,
  feature_selection_method = "univariate_stats",
  n_features_select = 50,
  fdr_threshold = 0.05,
  correlation_threshold = 0.9,
  validation_method = "cv_10fold",
  train_proportion = 0.7,
  hyperparameter_tuning = TRUE,
  n_bootstrap_samples = 1000,
  random_seed = 42,
  biomarker_ranking = TRUE,
  stability_analysis = TRUE,
  clinical_performance = TRUE,
  signature_development = TRUE,
  pathway_analysis = FALSE,
  interpretability = TRUE,
  shap_analysis = TRUE,
  lime_analysis = FALSE,
  feature_interaction = TRUE,
  partial_dependence = TRUE,
  biomarker_networks = FALSE,
  cutpoint_optimization = TRUE,
  risk_stratification = TRUE,
  nomogram_development = TRUE,
  decision_curve_analysis = TRUE,
  external_validation = TRUE,
  biomarker_generalizability = TRUE,
  robustness_testing = TRUE,
  quality_control = TRUE,
  outlier_detection = TRUE,
  missing_data_analysis = TRUE,
  detailed_results = TRUE,
  biomarker_report = TRUE,
  export_biomarkers = FALSE,
  save_signature = FALSE,
  regulatory_documentation = TRUE
)

Arguments

data

The data as a data frame

outcome_var

Primary outcome variable for biomarker discovery

biomarker_vars

Potential biomarker variables (genes, proteins, metabolites, etc.)

clinical_vars

Clinical variables to include in the analysis (age, stage, etc.)

batch_var

Batch or study identifier for batch effect correction

patient_id

Patient identifier for tracking

time_var

Time to event variable for survival biomarker analysis

event_var

Event indicator for survival biomarker analysis

discovery_method

Method for biomarker discovery and selection

outcome_type

Type of outcome variable

data_type

Type of biomarker data being analyzed

data_preprocessing

Perform data preprocessing and normalization

normalization_method

Method for data normalization

batch_correction

Perform batch effect correction

batch_method

Method for batch effect correction

filter_low_variance

Remove features with low variance

variance_threshold

Minimum variance threshold for feature filtering

feature_selection_method

Method for initial feature selection

n_features_select

Maximum number of features to select for analysis

fdr_threshold

False discovery rate threshold for multiple testing correction

correlation_threshold

Correlation threshold for removing highly correlated features

validation_method

Method for model validation

train_proportion

Proportion of data for training (default 0.7, i.e. 70%)

hyperparameter_tuning

Perform hyperparameter optimization

n_bootstrap_samples

Number of bootstrap samples for confidence intervals

random_seed

Random seed for reproducibility

biomarker_ranking

Rank biomarkers by importance and clinical relevance

stability_analysis

Assess biomarker selection stability across resampling

clinical_performance

Calculate clinical performance metrics for biomarkers

signature_development

Develop multi-biomarker signatures

pathway_analysis

Perform pathway enrichment analysis for discovered biomarkers

interpretability

Generate interpretability analysis using SHAP/LIME

shap_analysis

Generate SHAP values for biomarker explanation

lime_analysis

Generate LIME explanations for individual predictions

feature_interaction

Analyze interactions between biomarkers

partial_dependence

Generate partial dependence plots for key biomarkers

biomarker_networks

Analyze biomarker co-expression and interaction networks

cutpoint_optimization

Find optimal cutpoints for biomarker classification

risk_stratification

Create risk stratification based on biomarker signatures

nomogram_development

Develop clinical nomogram incorporating biomarkers

decision_curve_analysis

Assess clinical utility using decision curve analysis

external_validation

Prepare biomarkers for external validation

biomarker_generalizability

Assess biomarker generalizability across populations

robustness_testing

Test biomarker robustness to data perturbations

quality_control

Comprehensive quality control for biomarker data

outlier_detection

Detect and handle outliers in biomarker data

missing_data_analysis

Analyze and handle missing biomarker data

detailed_results

Include comprehensive analysis results

biomarker_report

Generate comprehensive biomarker discovery report

export_biomarkers

Export list of discovered biomarkers

save_signature

Save trained biomarker signature model

regulatory_documentation

Include documentation for regulatory submission
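
For orientation, the filtering arguments (variance_threshold, correlation_threshold, fdr_threshold) can be pictured with base R. The sketch below is an assumption about what such filters typically do, not the package's internal implementation; the simulated data and the object names (x, response, redundant, selected) are hypothetical and exist only for illustration.

```r
# Illustrative sketch only: base-R analogues of the filtering arguments.
# This is NOT the package's internal code; data and names are hypothetical.
set.seed(42)  # mirrors random_seed = 42

n <- 100
x <- data.frame(
  gene1    = rnorm(n),
  gene2    = rnorm(n, sd = 0.1),  # variance near 0.01, below the threshold
  protein1 = rnorm(n)
)
x$gene3  <- x$gene1 + rnorm(n, sd = 0.05)  # nearly duplicates gene1
response <- rbinom(n, 1, 0.5)              # simulated binary outcome

# 1. Low-variance filter (cf. variance_threshold = 0.1):
#    keep features whose sample variance reaches the threshold.
feature_var <- vapply(x, var, numeric(1))
x_var <- x[, feature_var >= 0.1, drop = FALSE]

# 2. Correlation filter (cf. correlation_threshold = 0.9):
#    for each highly correlated pair, drop the later column.
cmat <- abs(cor(x_var))
cmat[lower.tri(cmat, diag = TRUE)] <- 0
redundant <- apply(cmat, 2, function(col) any(col > 0.9))
x_cor <- x_var[, !redundant, drop = FALSE]

# 3. Univariate tests with Benjamini-Hochberg adjustment
#    (cf. fdr_threshold = 0.05).
p_raw <- vapply(x_cor, function(v) t.test(v ~ response)$p.value, numeric(1))
p_adj <- p.adjust(p_raw, method = "BH")
selected <- names(p_adj)[p_adj <= 0.05]
```

Because the outcome here is simulated independently of the features, selected will usually be empty; the point is the order of the steps (variance filter, then correlation filter, then FDR-controlled testing), which is one plausible reading of the arguments above.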

Value

A results object containing:

results$discovery_overview           a table
results$data_summary                 a table
results$quality_control_summary      a table
results$outlier_analysis             a table
results$feature_selection_summary    a table
results$selected_biomarkers          a table
results$discovery_performance        a table
results$signature_performance        a table
results$biomarker_ranking            a table
results$stability_analysis_results   a table
results$shap_biomarker_importance    a table
results$biomarker_interactions       a table
results$optimal_cutpoints            a table
results$risk_stratification_results  a table
results$decision_curve_results       a table
results$pathway_enrichment           a table
results$cross_validation_results     a table
results$generalizability_assessment  a table
results$clinical_interpretation      a table
results$regulatory_summary           a table
results$biomarker_importance_plot    an image
results$roc_comparison_plot          an image
results$shap_summary_plot            an image
results$shap_dependence_plot         an image
results$stability_plot               an image
results$biomarker_correlation_plot   an image
results$risk_stratification_plot     an image
results$decision_curve_plot          an image
results$pathway_network_plot         an image
results$biomarker_distribution_plot  an image
results$nomogram_plot                an image

Tables can be converted to data frames with asDF or as.data.frame. For example:

results$discovery_overview$asDF

as.data.frame(results$discovery_overview)

Comprehensive biomarker discovery and validation platform using machine learning algorithms with interpretability analysis. Includes feature selection, biomarker ranking, pathway analysis, and clinical validation metrics. Designed for omics data analysis with regulatory compliance features for biomarker development and clinical translation.

Examples

data('biomarker_data')

biomarkerdiscovery(
  data = biomarker_data,
  outcome_var = "response",
  biomarker_vars = c("gene1", "gene2", "protein1"),
  discovery_method = "elastic_net",
  validation_method = "bootstrap",
  interpretability = TRUE
)