Essential immunohistochemistry (IHC) expression analysis with clustering and H-score calculation. Designed for routine clinical use with simplified options.
Usage
ihcbasic(
  data,
  markers,
  id = NULL,
  computeHScore = FALSE,
  clusterMethod = "hierarchical",
  nClusters = 3,
  distanceMetric = "gower",
  linkageMethod = "ward.D2",
  standardizeData = TRUE,
  showDendrogram = TRUE,
  showHeatmap = TRUE,
  silhouetteAnalysis = TRUE,
  scoringScale = "standard"
)
Arguments
- data
- The data as a data frame, with one row per case or sample.
- markers
- IHC marker variables (categorical 0-3 or continuous H-scores) 
- id
- Optional case/sample identifier for labeling 
- computeHScore
- Generate H-score statistics (mean, median, range) for each marker. An H-score weights each staining intensity (1 = weak, 2 = moderate, 3 = strong) by the percentage of cells staining at that intensity and sums the results, giving a single measure of protein expression in the range 0-300.
- clusterMethod
- Method for clustering samples by IHC expression patterns. Hierarchical clustering builds tree-like relationships and works well for small to medium datasets. K-means is faster for large datasets but assumes spherical clusters. PAM (Partitioning Around Medoids) is more robust to outliers. 
- nClusters
- Number of distinct IHC expression patterns to identify. Start with 2-3 patterns for initial exploration. More patterns may reveal tumor heterogeneity but require larger sample sizes (rule of thumb: at least 3-5 samples per pattern). 
- distanceMetric
- How to measure similarity between samples. Gower distance works with different IHC scoring scales (0-3, H-scores, binary) and is recommended for most analyses. Jaccard distance only considers presence/absence of staining. 
- linkageMethod
- How to form groups in hierarchical clustering. Ward's method creates well-balanced groups and is recommended for most IHC analyses. Complete linkage creates tight, compact clusters. Average linkage provides a moderate approach between the two. 
- standardizeData
- Standardize marker values before clustering so that all markers are on the same scale and no single marker dominates the analysis. Recommended when markers use different scales (e.g., mixing 0-3 scores with H-scores). Disable for binary data or when all markers already use the same scale.
- showDendrogram
- Shows the hierarchical relationship between samples as a tree diagram. Helps visualize how expression patterns relate to each other and identify the optimal number of groups. 
- showHeatmap
- Visual representation of IHC staining patterns across all samples. Each row represents a patient sample, each column represents a marker. Colors indicate expression levels, with samples grouped by similar patterns. 
- silhouetteAnalysis
- Evaluates how well separated the expression patterns are (recommended). Silhouette widths range from -1 to 1: higher values indicate tighter, better-separated patterns, while values near 0 suggest overlapping groups that may reflect noise rather than distinct patterns.
- scoringScale
- The IHC scoring system used in your data. Standard 0-3 scale is most common (0=no staining, 1=weak, 2=moderate, 3=strong staining). Binary scoring records only presence/absence. H-score combines intensity and percentage of positive cells (range 0-300). 
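The clustering-related arguments above can be illustrated with a minimal sketch in base R plus the `cluster` package. This is not the internal implementation of `ihcbasic()`, only an approximation of the steps the arguments describe: computing an H-score, taking Gower distances over markers on mixed scales, Ward linkage, cutting the tree into a fixed number of patterns, and checking separation with silhouette widths. The helper `h_score()`, the data, and the marker names are all made up for illustration.

```r
# Illustrative sketch only; it does not claim to mirror ihcbasic() internals.
library(cluster)  # daisy() for Gower distance, silhouette() for separation

# Hypothetical helper: H-score = 1 * %weak + 2 * %moderate + 3 * %strong (0-300).
h_score <- function(pct_weak, pct_moderate, pct_strong) {
  stopifnot(all(pct_weak + pct_moderate + pct_strong <= 100))
  1 * pct_weak + 2 * pct_moderate + 3 * pct_strong
}

# Made-up data: 8 cases, two markers scored 0-3 and one marker as an H-score.
ihc <- data.frame(
  ER   = c(3, 3, 2, 3, 0, 0, 1, 0),
  PR   = c(3, 2, 3, 3, 0, 1, 0, 0),
  HER2 = h_score(
    pct_weak     = c(10, 20, 10,  0, 10, 20,  0, 10),
    pct_moderate = c( 0,  0, 10, 10, 30, 20, 30, 20),
    pct_strong   = c( 0,  0,  0,  0, 60, 50, 70, 60)
  )
)

# Gower distance rescales each marker by its range, so the 0-300 H-score
# does not dominate the 0-3 scores.
d <- daisy(ihc, metric = "gower")

# Hierarchical clustering with Ward linkage, cut into two patterns.
hc <- hclust(d, method = "ward.D2")
clusters <- cutree(hc, k = 2)

# Silhouette widths: values near 1 mean well-separated patterns.
sil <- silhouette(clusters, d)
avg_sil <- mean(sil[, "sil_width"])

print(table(clusters))
print(round(avg_sil, 2))
```

Calling `plot(hc)` on the same object draws the dendrogram, which is one way to judge whether the chosen number of patterns is reasonable.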
Value
A results object containing:
| results$instructions | an HTML element |
| results$clinicalSummary | Plain-language summary with clinical context |
| results$reportText | Pre-formatted text for clinical reports |
| results$clusterSummary | a table |
| results$hscoreTable | a table |
| results$silhouetteTable | a table |
| results$markerSummary | a table |
| results$dendrogramPlot | an image |
| results$heatmapPlot | an image |
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$clusterSummary$asDF
as.data.frame(results$clusterSummary)
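
A hedged usage sketch follows. The data frame and its column names are made up. It also assumes that `markers` and `id` accept column names as character strings, which this page does not state explicitly. The call is guarded so the snippet still runs when the package providing `ihcbasic()` is not loaded.

```r
# Made-up example data; column names are illustrative only.
ihc_data <- data.frame(
  case_id = paste0("C", 1:8),
  ER   = c(3, 3, 2, 3, 0, 0, 1, 0),
  PR   = c(3, 2, 3, 3, 0, 1, 0, 0),
  Ki67 = c(1, 1, 2, 1, 3, 3, 2, 3)
)

# Run the analysis only if ihcbasic() is available in the current session.
if (exists("ihcbasic", mode = "function")) {
  results <- ihcbasic(
    data = ihc_data,
    markers = c("ER", "PR", "Ki67"),  # assumption: names passed as strings
    id = "case_id",                   # assumption: name passed as a string
    nClusters = 2,
    scoringScale = "standard"
  )

  # Convert a results table to a plain data frame for further processing.
  cluster_df <- as.data.frame(results$clusterSummary)
  print(cluster_df)
}
```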