Biomarker Discovery Platform with ML Interpretability
Source: R/biomarkerdiscovery.h.R
biomarkerdiscovery.Rd

Usage
biomarkerdiscovery(
  data,
  outcome_var,
  biomarker_vars,
  clinical_vars,
  batch_var,
  patient_id,
  time_var,
  event_var,
  discovery_method = "elastic_net",
  outcome_type = "binary",
  data_type = "genomics",
  data_preprocessing = TRUE,
  normalization_method = "z_score",
  batch_correction = FALSE,
  batch_method = "combat",
  filter_low_variance = TRUE,
  variance_threshold = 0.1,
  feature_selection_method = "univariate_stats",
  n_features_select = 50,
  fdr_threshold = 0.05,
  correlation_threshold = 0.9,
  validation_method = "cv_10fold",
  train_proportion = 0.7,
  hyperparameter_tuning = TRUE,
  n_bootstrap_samples = 1000,
  random_seed = 42,
  biomarker_ranking = TRUE,
  stability_analysis = TRUE,
  clinical_performance = TRUE,
  signature_development = TRUE,
  pathway_analysis = FALSE,
  interpretability = TRUE,
  shap_analysis = TRUE,
  lime_analysis = FALSE,
  feature_interaction = TRUE,
  partial_dependence = TRUE,
  biomarker_networks = FALSE,
  cutpoint_optimization = TRUE,
  risk_stratification = TRUE,
  nomogram_development = TRUE,
  decision_curve_analysis = TRUE,
  external_validation = TRUE,
  biomarker_generalizability = TRUE,
  robustness_testing = TRUE,
  quality_control = TRUE,
  outlier_detection = TRUE,
  missing_data_analysis = TRUE,
  detailed_results = TRUE,
  biomarker_report = TRUE,
  export_biomarkers = FALSE,
  save_signature = FALSE,
  regulatory_documentation = TRUE
)

Arguments
- data
The data, as a data frame
- outcome_var
Primary outcome variable for biomarker discovery
- biomarker_vars
Potential biomarker variables (genes, proteins, metabolites, etc.)
- clinical_vars
Clinical variables to include in the analysis (age, stage, etc.)
- batch_var
Batch or study identifier for batch effect correction
- patient_id
Patient identifier for tracking
- time_var
Time to event variable for survival biomarker analysis
- event_var
Event indicator for survival biomarker analysis
- discovery_method
Method for biomarker discovery and selection
- outcome_type
Type of outcome variable
- data_type
Type of biomarker data being analyzed
- data_preprocessing
Perform data preprocessing and normalization
- normalization_method
Method for data normalization
- batch_correction
Perform batch effect correction
- batch_method
Method for batch effect correction
- filter_low_variance
Remove features with low variance
- variance_threshold
Minimum variance threshold for feature filtering
- feature_selection_method
Method for initial feature selection
- n_features_select
Maximum number of features to select for analysis
- fdr_threshold
False discovery rate threshold for multiple testing correction
- correlation_threshold
Correlation threshold for removing highly correlated features
- validation_method
Method for model validation
- train_proportion
Proportion of data used for training (default 0.7, i.e. 70%)
- hyperparameter_tuning
Perform hyperparameter optimization
- n_bootstrap_samples
Number of bootstrap samples for confidence intervals
- random_seed
Random seed for reproducibility
- biomarker_ranking
Rank biomarkers by importance and clinical relevance
- stability_analysis
Assess biomarker selection stability across resampling
- clinical_performance
Calculate clinical performance metrics for biomarkers
- signature_development
Develop multi-biomarker signatures
- pathway_analysis
Perform pathway enrichment analysis for discovered biomarkers
- interpretability
Generate interpretability analysis using SHAP/LIME
- shap_analysis
Generate SHAP values for biomarker explanation
- lime_analysis
Generate LIME explanations for individual predictions
- feature_interaction
Analyze interactions between biomarkers
- partial_dependence
Generate partial dependence plots for key biomarkers
- biomarker_networks
Analyze biomarker co-expression and interaction networks
- cutpoint_optimization
Find optimal cutpoints for biomarker classification
- risk_stratification
Create risk stratification based on biomarker signatures
- nomogram_development
Develop clinical nomogram incorporating biomarkers
- decision_curve_analysis
Assess clinical utility using decision curve analysis
- external_validation
Prepare biomarkers for external validation
- biomarker_generalizability
Assess biomarker generalizability across populations
- robustness_testing
Test biomarker robustness to data perturbations
- quality_control
Comprehensive quality control for biomarker data
- outlier_detection
Detect and handle outliers in biomarker data
- missing_data_analysis
Analyze and handle missing biomarker data
- detailed_results
Include comprehensive analysis results
- biomarker_report
Generate comprehensive biomarker discovery report
- export_biomarkers
Export list of discovered biomarkers
- save_signature
Save trained biomarker signature model
- regulatory_documentation
Include documentation for regulatory submission
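
To make the three filtering thresholds concrete, here is a minimal base R sketch of what variance_threshold, correlation_threshold, and fdr_threshold could mean in practice. It is not the package's internal code: the simulated data, the extra column gene3, the two-sample t-test, and the Benjamini-Hochberg adjustment are all assumptions chosen for illustration, and biomarkerdiscovery() may order or implement these steps differently.

```r
# Illustrative sketch only; NOT the implementation used by biomarkerdiscovery().
set.seed(42)  # mirrors the default random_seed = 42

n <- 100
X <- data.frame(
  gene1    = rnorm(n),
  gene2    = rnorm(n),
  protein1 = rnorm(n, sd = 0.1)  # variance near 0.01, below the 0.1 threshold
)
X$gene3 <- X$gene1 + rnorm(n, sd = 0.05)  # hypothetical near-duplicate of gene1
y <- rbinom(n, 1, plogis(2 * X$gene1))    # binary outcome driven by gene1

# 1. Low-variance filter (variance_threshold = 0.1)
feature_var <- vapply(X, var, numeric(1))
kept <- names(feature_var)[feature_var >= 0.1]

# 2. Correlation filter (correlation_threshold = 0.9):
#    for each highly correlated pair, drop the later column.
cmat <- abs(cor(X[kept]))
cmat[lower.tri(cmat, diag = TRUE)] <- 0
redundant <- colnames(cmat)[apply(cmat, 2, max) > 0.9]
kept <- setdiff(kept, redundant)

# 3. Univariate tests with FDR control (fdr_threshold = 0.05).
#    A Welch t-test and the BH adjustment are assumed here.
pvals <- vapply(
  kept,
  function(v) t.test(X[[v]][y == 1], X[[v]][y == 0])$p.value,
  numeric(1)
)
qvals <- p.adjust(pvals, method = "BH")
selected <- names(qvals)[qvals <= 0.05]

print(kept)
print(round(qvals, 4))
print(selected)
```

In this toy setup protein1 is removed by the variance filter and gene3 by the correlation filter, so only gene1 and gene2 reach the testing step. Running the filters before the tests keeps the number of hypotheses, and therefore the multiple-testing penalty, smaller.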

Value
A results object containing:
results$discovery_overview | a table
results$data_summary | a table
results$quality_control_summary | a table
results$outlier_analysis | a table
results$feature_selection_summary | a table
results$selected_biomarkers | a table
results$discovery_performance | a table
results$signature_performance | a table
results$biomarker_ranking | a table
results$stability_analysis_results | a table
results$shap_biomarker_importance | a table
results$biomarker_interactions | a table
results$optimal_cutpoints | a table
results$risk_stratification_results | a table
results$decision_curve_results | a table
results$pathway_enrichment | a table
results$cross_validation_results | a table
results$generalizability_assessment | a table
results$clinical_interpretation | a table
results$regulatory_summary | a table
results$biomarker_importance_plot | an image
results$roc_comparison_plot | an image
results$shap_summary_plot | an image
results$shap_dependence_plot | an image
results$stability_plot | an image
results$biomarker_correlation_plot | an image
results$risk_stratification_plot | an image
results$decision_curve_plot | an image
results$pathway_network_plot | an image
results$biomarker_distribution_plot | an image
results$nomogram_plot | an image
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$discovery_overview$asDF
as.data.frame(results$discovery_overview)

Comprehensive biomarker discovery and validation platform using machine learning algorithms with interpretability analysis. Includes feature selection, biomarker ranking, pathway analysis, and clinical validation metrics. Designed for omics data analysis with regulatory compliance features for biomarker development and clinical translation.

Examples
data('biomarker_data')

biomarkerdiscovery(
  data = biomarker_data,
  outcome_var = "response",
  biomarker_vars = c("gene1", "gene2", "protein1"),
  discovery_method = "elastic_net",
  validation_method = "bootstrap",
  interpretability = TRUE
)