Test Dataset for Chi-Square Post-Hoc Analysis
chisqposttest_test_data.Rd
A comprehensive test dataset specifically designed for testing the chisqposttest function. Contains multiple categorical variables with known associations of different strengths, edge cases, and missing data patterns.
Usage
data("chisqposttest_test_data")
Format
A data frame with 300 observations and 14 variables:
- PatientID
Patient identifier (1-300)
- Treatment
Treatment group: "Standard", "Experimental"
- Response
Treatment response: "No Response", "Response" (strongly associated with Treatment)
- Sex
Patient sex: "Male", "Female" (balanced)
- TumorGrade
Tumor grade: "Grade 1", "Grade 2", "Grade 3"
- TumorStage
Tumor stage: "Stage I", "Stage II", "Stage III" (moderately associated with TumorGrade)
- Institution
Hospital: "Hospital A", "Hospital B", "Hospital C", "Hospital D"
- QualityScore
Quality rating: "High", "Low" (weakly associated with Institution)
- RandomVar1
Random variable: "Group A", "Group B", "Group C" (no associations)
- RandomVar2
Random variable: "Type X", "Type Y" (no associations)
- RareCategory
Frequency category: "Common", "Uncommon", "Rare" (unbalanced)
- BinaryOutcome
Binary outcome: "Negative", "Positive" (associated with RareCategory)
- AgeGroup
Age category: "Young", "Middle", "Elderly"
- BiomarkerStatus
Biomarker status: "Negative", "Positive" (moderately associated with AgeGroup)
Details
This dataset contains several types of associations designed to test different aspects of chi-square post-hoc analysis:
Strong Associations:
Treatment -> Response: Clear treatment effect with odds ratio ~5
Moderate Associations:
TumorGrade -> TumorStage: Higher grades associated with advanced stages
AgeGroup -> BiomarkerStatus: Age-related biomarker expression pattern
Weak Associations:
Institution -> QualityScore: Institutional quality differences
RareCategory -> BinaryOutcome: Effect in rare category with small cell counts
No Associations:
RandomVar1 ⊥ RandomVar2: Independent random variables for null hypothesis testing
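The difference between a strong association and no association can be illustrated with a small base R simulation. This is only a sketch, not the script used to build the dataset: the seed, the response probabilities (chosen so that the true odds ratio is 5), and the helper cramers_v() are all assumptions introduced here for illustration.

```r
set.seed(42)  # arbitrary seed, used only for reproducibility
n <- 300

# Strong association: assumed response probabilities of 0.25 (Standard)
# and 0.625 (Experimental) give a true odds ratio of
# (0.625 / 0.375) / (0.25 / 0.75) = 5.
treatment <- sample(c("Standard", "Experimental"), n, replace = TRUE)
p_response <- ifelse(treatment == "Experimental", 0.625, 0.25)
response <- ifelse(runif(n) < p_response, "Response", "No Response")

# No association: two variables drawn independently of each other.
random1 <- sample(c("Group A", "Group B", "Group C"), n, replace = TRUE)
random2 <- sample(c("Type X", "Type Y"), n, replace = TRUE)

# Hypothetical helper: Cramer's V from the uncorrected chi-square statistic.
cramers_v <- function(x, y) {
  tab <- table(x, y)
  chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
  sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1)))
}

# Sample odds ratio from the 2x2 Treatment by Response table.
tab_strong <- table(Treatment = treatment, Response = response)
odds_ratio <- unname(
  (tab_strong["Experimental", "Response"] * tab_strong["Standard", "No Response"]) /
    (tab_strong["Experimental", "No Response"] * tab_strong["Standard", "Response"])
)

v_strong <- cramers_v(treatment, response)
v_null <- cramers_v(random1, random2)

print(tab_strong)
cat("Sample odds ratio:", round(odds_ratio, 2), "\n")
cat("Cramer's V (strong):", round(v_strong, 3), "\n")
cat("Cramer's V (null):  ", round(v_null, 3), "\n")
```

The sample odds ratio varies around the true value from run to run, so a simulated dataset of this size only approximates the target effect.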
The dataset includes approximately 5% missing data in the Treatment, Sex, and TumorGrade variables to test missing data handling options.
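How missing values change a contingency table can be sketched in base R without the packaged dataset. The example below is an assumption-laden illustration: it simulates two variables, sets exactly 15 of 300 values (5%) to NA in each, and compares the default table() behaviour with useNA = "ifany". It does not show how chisqposttest itself treats missing values.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility
n <- 300

sex <- sample(c("Male", "Female"), n, replace = TRUE)
treatment <- sample(c("Standard", "Experimental"), n, replace = TRUE)

# Set exactly 5% of each variable (15 values) to NA at random positions.
sex[sample(n, 15)] <- NA
treatment[sample(n, 15)] <- NA

# Default: rows with an NA in either variable are dropped from the table.
tab_complete <- table(Treatment = treatment, Sex = sex)

# Alternative: keep NA as its own level in both margins.
tab_with_na <- table(Treatment = treatment, Sex = sex, useNA = "ifany")

# Number of rows with both variables observed.
n_complete <- sum(complete.cases(treatment, sex))

print(tab_complete)
print(tab_with_na)
cat("Complete cases:", n_complete, "of", n, "\n")
```

Comparing the two tables shows how many observations a complete-case analysis discards before any test is run.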
Source
Simulated data created for testing purposes. Associations are based on realistic clinical scenarios, but the data are artificially generated.
See also
chisqposttest, histopathology
Examples
# Load the dataset
data(chisqposttest_test_data)

# Examine structure
str(chisqposttest_test_data)

# Example 1: Strong association (should be highly significant)
chisqposttest(
  data = chisqposttest_test_data,
  rows = "Treatment",
  cols = "Response",
  posthoc = "bonferroni"
)

# Example 2: Moderate association (should be significant with post-hoc differences)
chisqposttest(
  data = chisqposttest_test_data,
  rows = "TumorGrade",
  cols = "TumorStage",
  posthoc = "fdr"
)

# Example 3: No association (should be non-significant)
chisqposttest(
  data = chisqposttest_test_data,
  rows = "RandomVar1",
  cols = "RandomVar2",
  posthoc = "bonferroni"
)

# Example 4: Edge case with rare categories
chisqposttest(
  data = chisqposttest_test_data,
  rows = "RareCategory",
  cols = "BinaryOutcome",
  posthoc = "fdr"
)

# Example 5: Missing data handling
chisqposttest(
  data = chisqposttest_test_data,
  rows = "Treatment",
  cols = "Sex",
  excl = TRUE  # Exclude missing values
)
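
The "bonferroni" and "fdr" options used above correspond to standard multiple-comparison adjustments, which base R exposes through p.adjust(). The sketch below runs pairwise chi-square tests between the row levels of a simulated 3x3 table and adjusts the resulting p-values. It is an illustration under stated assumptions and not necessarily how chisqposttest computes its comparisons: the seed, the stage probabilities, and all object names are hypothetical.

```r
set.seed(7)  # arbitrary seed, used only for reproducibility
n <- 300

grade <- sample(c("Grade 1", "Grade 2", "Grade 3"), n, replace = TRUE)

# Hypothetical stage probabilities per grade (assumed for illustration):
# higher grades are more likely to present at advanced stages.
stages <- c("Stage I", "Stage II", "Stage III")
stage_probs <- list(
  "Grade 1" = c(0.6, 0.3, 0.1),
  "Grade 2" = c(0.3, 0.4, 0.3),
  "Grade 3" = c(0.1, 0.3, 0.6)
)
stage <- vapply(
  grade,
  function(g) sample(stages, 1, prob = stage_probs[[g]]),
  character(1),
  USE.NAMES = FALSE
)

tab <- table(Grade = grade, Stage = stage)

# One chi-square test per pair of row levels (each on a 2x3 subtable).
pairs <- combn(rownames(tab), 2, simplify = FALSE)
raw_p <- vapply(pairs, function(pr) chisq.test(tab[pr, ])$p.value, numeric(1))

results <- data.frame(
  comparison = vapply(pairs, paste, character(1), collapse = " vs "),
  p_raw = raw_p,
  p_bonferroni = p.adjust(raw_p, method = "bonferroni"),
  p_fdr = p.adjust(raw_p, method = "fdr")
)

print(results)
```

Bonferroni multiplies each p-value by the number of comparisons, whereas the FDR (Benjamini-Hochberg) adjustment is never more conservative, so its adjusted p-values are at most as large as the Bonferroni ones.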