Essential IHC expression analysis with clustering and H-score calculation. Designed for routine clinical use with simplified options.
Usage
ihcbasic(
data,
markers,
id = NULL,
computeHScore = FALSE,
clusterMethod = "hierarchical",
nClusters = 3,
distanceMetric = "gower",
linkageMethod = "ward.D2",
standardizeData = TRUE,
showDendrogram = TRUE,
showHeatmap = TRUE,
silhouetteAnalysis = TRUE,
scoringScale = "standard"
)
Arguments
- data
the data as a data frame
- markers
IHC marker variables (categorical 0-3 or continuous H-scores)
- id
Optional case/sample identifier for labeling
- computeHScore
Generate H-score statistics (mean, median, range) for each marker. The H-score weights the percentage of positive cells by staining intensity (1 x % weak + 2 x % moderate + 3 x % strong), giving a single measure of protein expression that ranges from 0 to 300.
- clusterMethod
Method for clustering samples by IHC expression patterns. Hierarchical clustering builds tree-like relationships and works well for small to medium datasets. K-means is faster for large datasets but assumes spherical clusters. PAM (Partitioning Around Medoids) is more robust to outliers.
- nClusters
Number of distinct IHC expression patterns to identify. Start with 2-3 patterns for initial exploration. More patterns may reveal tumor heterogeneity but require larger sample sizes (rule of thumb: at least 3-5 samples per pattern).
- distanceMetric
How to measure similarity between samples. Gower distance works with different IHC scoring scales (0-3, H-scores, binary) and is recommended for most analyses. Jaccard distance only considers presence/absence of staining.
- linkageMethod
How to form groups in hierarchical clustering. Ward's method creates well-balanced groups and is recommended for most IHC analyses. Complete linkage creates tight, compact clusters. Average linkage merges groups by their mean pairwise distance and is less sensitive to outliers than complete linkage.
- standardizeData
Recommended when markers use different scales (e.g., mixing 0-3 scores with H-scores). Puts all markers on the same scale so no single marker dominates the analysis. Disable for binary data or when all markers use the same scale.
- showDendrogram
Shows the hierarchical relationship between samples as a tree diagram. Helps visualize how expression patterns relate to each other and identify the optimal number of groups.
- showHeatmap
Visual representation of IHC staining patterns across all samples. Each row represents a patient sample, each column represents a marker. Colors indicate expression levels, with samples grouped by similar patterns.
- silhouetteAnalysis
Recommended: Evaluates how well separated the expression patterns are. Silhouette scores range from -1 to 1; higher scores indicate that samples sit closer to their own pattern than to neighboring ones, so the patterns are less likely to be due to noise. Clinical relevance still has to be judged separately.
- scoringScale
The IHC scoring system used in your data. Standard 0-3 scale is most common (0=no staining, 1=weak, 2=moderate, 3=strong staining). Binary scoring records only presence/absence. H-score combines intensity and percentage of positive cells (range 0-300).
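The options above can be illustrated with a minimal sketch in base R plus the `cluster` package. It is not the internal implementation of `ihcbasic`; it only shows the general technique the arguments describe: an H-score computed from intensity percentages, Gower distance, Ward linkage, a cut into `nClusters` groups, and a silhouette check. The helper `h_score`, the data frame `ihc`, and the marker names are hypothetical.

```r
library(cluster)  # daisy() and silhouette()

# Hypothetical helper: H-score from the percentage of cells at each intensity.
# 1 * %weak + 2 * %moderate + 3 * %strong, so the result lies in 0-300.
h_score <- function(pct_weak, pct_moderate, pct_strong) {
  stopifnot(all(pct_weak + pct_moderate + pct_strong <= 100))
  1 * pct_weak + 2 * pct_moderate + 3 * pct_strong
}

# Hypothetical data: 9 cases, two markers on the 0-3 scale, one H-score marker.
ihc <- data.frame(
  marker_a = c(3, 3, 2, 0, 0, 1, 3, 0, 2),
  marker_b = c(3, 2, 2, 0, 1, 0, 3, 0, 1),
  marker_h = h_score(
    pct_weak     = c(10, 20, 30, 10, 20, 10, 10, 30, 20),
    pct_moderate = c(10, 10, 20, 30, 30, 40, 10, 20, 20),
    pct_strong   = c( 0,  0, 10, 50, 40, 40,  5, 40, 10)
  )
)

# Gower distance rescales each numeric column by its range,
# so the 0-300 marker does not dominate the 0-3 markers.
d <- daisy(ihc, metric = "gower")

# Hierarchical clustering with Ward linkage, then cut into 3 groups.
tree <- hclust(d, method = "ward.D2")
clusters <- cutree(tree, k = 3)

# Average silhouette width: closer to 1 means better-separated groups.
sil <- silhouette(clusters, d)
avg_sil <- mean(sil[, "sil_width"])

print(table(clusters))
print(avg_sil)
```

Re-running the last steps with different values of `k` and comparing the average silhouette width is one common way to choose the number of patterns.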
Value
A results object containing:
results$instructions | an html
results$clinicalSummary | Plain-language summary with clinical context
results$reportText | Pre-formatted text for clinical reports
results$clusterSummary | a table
results$hscoreTable | a table
results$silhouetteTable | a table
results$markerSummary | a table
results$dendrogramPlot | an image
results$heatmapPlot | an image
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$clusterSummary$asDF
as.data.frame(results$clusterSummary)
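A hedged end-to-end sketch of a call follows. The data frame `ihc_data` and its column names are hypothetical, and passing `markers` and `id` as character column names is an assumption, not something the signature above confirms. The call is guarded so the snippet still runs when the package that provides `ihcbasic` is not attached.

```r
# Hypothetical data: one row per case, markers scored on the 0-3 scale.
ihc_data <- data.frame(
  case_id  = paste0("C", 1:8),
  marker_a = c(3, 3, 2, 0, 0, 1, 3, 0),
  marker_b = c(3, 2, 2, 0, 1, 0, 3, 0),
  marker_c = c(0, 1, 0, 3, 3, 2, 0, 3)
)

# Run the analysis only if ihcbasic() is available in this session.
if (exists("ihcbasic", mode = "function")) {
  results <- ihcbasic(
    data = ihc_data,
    # Assumption: variables are given as column names.
    markers = c("marker_a", "marker_b", "marker_c"),
    id = "case_id",
    nClusters = 2
  )

  # Convert the cluster summary table to a plain data frame.
  cluster_df <- as.data.frame(results$clusterSummary)
  print(cluster_df)
}
```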