HER2 observations: 2209
Cases: 296
Pathologists: 4
Following the 2023 ESMO guidelines and approval of trastuzumab deruxtecan (T-DXd) for HER2-low breast cancer, the distinction between HER2-negative (Score 0) and HER2-low (Score 1+) has gained critical clinical importance. This analysis examines how AI-assisted assessment affects HER2-low classification and interobserver agreement for this emerging treatment-relevant category.
HER2-Low: IHC Score 1+ OR IHC Score 2+ with negative FISH/ISH
Treatment Implications:
- T-DXd (Enhertu®) approved for HER2-low metastatic breast cancer
- Clinical impact: Expands targeted therapy options beyond traditional HER2-positive disease
- Prevalence: ~50-60% of “HER2-negative” cases are actually HER2-low
| Category | IHC Score | FISH | Traditional Therapy | New Option (T-DXd) |
|---|---|---|---|---|
| HER2-Negative | 0 | N/A | Chemo/endocrine only | ❌ Not eligible |
| HER2-Low | 1+ or 2+/FISH- | Negative (if 2+) | Chemo/endocrine only | ✅ T-DXd eligible |
| HER2-Positive | 3+ or 2+/FISH+ | Positive (if 2+) | Trastuzumab + chemo | ✅ Trastuzumab |
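The category rules in the table above can be written as a small lookup. This is a minimal sketch, not the scoring logic used in this analysis: the function name `classify_her2`, the `ish_positive` parameter, and the "Equivocal (FISH/ISH pending)" label for a 2+ score without a FISH/ISH result are all assumptions made for illustration.

```python
from typing import Optional


def classify_her2(ihc_score: int, ish_positive: Optional[bool] = None) -> str:
    """Map an IHC score (0-3) and an optional FISH/ISH result to a HER2 category.

    Hypothetical helper: names and the 'pending' label are illustrative only.
    """
    if ihc_score not in (0, 1, 2, 3):
        raise ValueError(f"IHC score must be 0, 1, 2, or 3; got {ihc_score!r}")
    if ihc_score == 0:
        return "HER2-Negative"
    if ihc_score == 1:
        return "HER2-Low"
    if ihc_score == 3:
        return "HER2-Positive"
    # Score 2+ is equivocal: the category depends on the FISH/ISH result.
    if ish_positive is None:
        return "Equivocal (FISH/ISH pending)"
    return "HER2-Positive" if ish_positive else "HER2-Low"


if __name__ == "__main__":
    for score, ish in [(0, None), (1, None), (2, False), (2, True), (3, None)]:
        print(score, ish, "->", classify_her2(score, ish))
```

Only the 2+ branch consults FISH/ISH, which mirrors the "(if 2+)" qualifier in the FISH column.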
Key Challenge: Distinguishing Score 0 from 1+ is subjective and prone to interobserver variability.
Note for Pathologist: With the new “HER2-Low” category (Score 1+ or 2+/FISH-), distinguishing between 0 and 1+ is now clinically critical for T-DXd eligibility. This analysis checks whether AI assistance improves agreement on this subtle distinction or introduces additional disagreement.
HER2 Score Distribution
Pre-AI vs Post-AI Assessment
| Phase | HER2 Score¹ | N | Total | Percentage (%) |
|---|---|---|---|---|
| Pre-AI | 0 | 345 | 1107 | 31.2 |
| Pre-AI | 1 | 438 | 1107 | 39.6 |
| Pre-AI | 2 | 192 | 1107 | 17.3 |
| Pre-AI | 3 | 132 | 1107 | 11.9 |
| Post-AI | 0 | 317 | 1102 | 28.8 |
| Post-AI | 1 | 493 | 1102 | 44.7 |
| Post-AI | 2 | 160 | 1102 | 14.5 |
| Post-AI | 3 | 132 | 1102 | 12.0 |
¹ Score 1+ represents the HER2-low category.

Complete paired assessments: 1073
HER2 Category Transition Matrix¹
Pre-AI (rows) to Post-AI (columns)
| Pre-AI Category | HER2-Negative (0) | HER2-Low (1+) | HER2-Positive (2+/3+) |
|---|---|---|---|
| HER2-Negative (0) | 302 | 24 | 0 |
| HER2-Low (1+) | 10 | 402 | 16 |
| HER2-Positive (2+/3+) | 0 | 44 | 275 |
¹ Diagonal cells = consistent classification.
HER2-Low Relevant Transitions
Impact on T-DXd eligibility
| Transition Type | N (paired assessments) | Percentage (%) |
|---|---|---|
| No change | 979 | 91.2 |
| Other transition | 60 | 5.6 |
| 0 → 1+ (Gained T-DXd eligibility) | 24 | 2.2 |
| 1+ → 0 (Lost T-DXd eligibility) | 10 | 0.9 |
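The transition counts and percentages can be re-derived from the 3×3 transition matrix. The sketch below copies the matrix values from this report; the function name `summarize_transitions` and the dictionary layout are illustrative choices, not the code that produced the tables.

```python
# Pre-AI (rows) to Post-AI (columns); values copied from the transition matrix.
# Row/column order: HER2-Negative (0), HER2-Low (1+), HER2-Positive (2+/3+).
MATRIX = [
    [302, 24, 0],
    [10, 402, 16],
    [0, 44, 275],
]


def summarize_transitions(matrix):
    """Return {transition type: (count, percentage of all paired assessments)}."""
    total = sum(sum(row) for row in matrix)
    unchanged = sum(matrix[i][i] for i in range(len(matrix)))
    gained = matrix[0][1]  # 0 -> 1+
    lost = matrix[1][0]    # 1+ -> 0
    other = total - unchanged - gained - lost
    counts = {
        "No change": unchanged,
        "Other transition": other,
        "0 -> 1+": gained,
        "1+ -> 0": lost,
    }
    return {name: (n, round(100 * n / total, 1)) for name, n in counts.items()}


if __name__ == "__main__":
    for name, (n, pct) in summarize_transitions(MATRIX).items():
        print(f"{name}: {n} ({pct}%)")
```

"Other transition" is computed as the remainder, so it covers the 1+ ↔ 2+/3+ moves (16 + 44) that do not cross the 0 / 1+ boundary.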
Clinical Interpretation:
KEY FINDINGS:
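- 24 paired assessments (2.2% of 1,073) transitioned from HER2-Negative (0) to HER2-Low (1+)
→ GAINED T-DXd eligibility with AI
- 10 paired assessments (0.9%) transitioned from HER2-Low (1+) to HER2-Negative (0)
→ LOST T-DXd eligibility with AI
- Net effect: 14 more assessments classified as T-DXd eligible post-AI (counts are pathologist-level assessments, so a single case can contribute more than once)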
CLINICAL IMPACT:
- T-DXd cost: ~$15,000/month (~$180,000/year)
- Each reclassification affects treatment access and healthcare costs
- Accuracy of 0 vs 1+ distinction is clinically critical
HER2 Category Agreement (3-Category: 0 / 1+ / 2+,3+)
Fleiss' Kappa for HER2-Negative vs HER2-Low vs HER2-Positive
| Phase | Fleiss' Kappa | N Cases | Δ Kappa |
|---|---|---|---|
| Pre-AI | 0.657 | 229 | NA |
| Post-AI | 0.713 | 226 | 0.056 |
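For reference, Fleiss' kappa compares the observed per-case agreement among a fixed number of raters with the agreement expected from the overall category proportions. The sketch below is a plain-Python version of the standard formula; the function name, the input layout (one row of category counts per case), and the toy data are assumptions for illustration and are not the study data or the code used for the table above.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] = number of raters who assigned case i to category j.
    Every case must be rated by the same number of raters (>= 2).
    Returns None when kappa is undefined (all ratings in one category).
    """
    if not counts:
        raise ValueError("need at least one case")
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])

    # Share of all ratings that fall into each category.
    p_cat = [sum(row[j] for row in counts) / (n_cases * n_raters)
             for j in range(n_cats)]
    # Agreement within each case: fraction of rater pairs that agree.
    p_case = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]

    p_obs = sum(p_case) / n_cases
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        return None
    return (p_obs - p_exp) / (1 - p_exp)


if __name__ == "__main__":
    # Hypothetical toy data: 4 raters, categories (0, 1+, 2+/3+).
    toy = [[4, 0, 0], [3, 1, 0], [0, 2, 2], [0, 0, 4]]
    print(round(fleiss_kappa(toy), 3))
```

Because the formula needs the same number of ratings per case, only cases scored by all four pathologists can enter the calculation, which may be why N Cases (229 and 226) is lower than the 296 cases in the dataset.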
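HER2-Low Agreement (Binary: 0 vs 1+ only)¹
Mean pairwise Cohen's Kappa for HER2-Negative vs HER2-Low distinction
| Phase | Mean Kappa | Min Kappa | Max Kappa | N Cases | Δ Kappa |
|---|---|---|---|---|---|
| Pre-AI | NaN | Inf | −Inf | 128 | NA |
| Post-AI | NaN | Inf | −Inf | 140 | NaN |
¹ Only cases scored as 0 or 1+ included (excludes 2+ and 3+).
Note: The NaN, Inf, and −Inf entries are not valid kappa estimates. This pattern is what results from taking the mean, minimum, and maximum of an empty set of values, which suggests that no pairwise kappa was successfully computed for this subset. The binary kappa should be treated as unavailable in this run.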
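One way to avoid NaN/Inf summary values is to compute each pairwise kappa only on cases that both raters scored as 0 or 1+, and to return an explicit "not available" result when no valid pair remains. The following is a minimal sketch under those assumptions; the function names and the `{rater: {case_id: score}}` input layout are hypothetical and do not reflect the pipeline behind the table above.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length label lists; None if undefined."""
    n = len(a)
    if n == 0:
        return None
    labels = set(a) | set(b)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    p_exp = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    if p_exp == 1.0:  # both raters used a single identical label throughout
        return None
    return (p_obs - p_exp) / (1 - p_exp)


def pairwise_kappa_summary(scores):
    """scores: {rater: {case_id: score}}.

    For each rater pair, keep only cases that BOTH scored as 0 or 1,
    then summarise the defined kappas. Returns None when there are none.
    """
    kappas = []
    for r1, r2 in combinations(sorted(scores), 2):
        shared = [c for c in scores[r1]
                  if c in scores[r2]
                  and scores[r1][c] in (0, 1)
                  and scores[r2][c] in (0, 1)]
        k = cohen_kappa([scores[r1][c] for c in shared],
                        [scores[r2][c] for c in shared])
        if k is not None:
            kappas.append(k)
    if not kappas:
        return None  # explicit "not available" instead of NaN / Inf / -Inf
    return {
        "mean": sum(kappas) / len(kappas),
        "min": min(kappas),
        "max": max(kappas),
        "n_pairs": len(kappas),
    }


if __name__ == "__main__":
    # Hypothetical toy scores for two raters on four cases.
    toy = {"A": {1: 0, 2: 0, 3: 1, 4: 1}, "B": {1: 0, 2: 1, 3: 1, 4: 1}}
    print(pairwise_kappa_summary(toy))
```

Returning `None` makes a failed computation visible to downstream tables, rather than letting placeholder values flow into the report.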
Specific Agreement for HER2-Low vs HER2-Negative
Positive and negative agreement rates
| Phase | Positive Agreement (Both say 1+)¹ | Negative Agreement (Both say 0)² | Overall Agreement |
|---|---|---|---|
| Pre-AI | 60.9% | 66.3% | 63.6% |
| Post-AI | 71.2% | 68.3% | 69.7% |
¹ Positive agreement = both raters agree on HER2-low (1+).
² Negative agreement = both raters agree on HER2-negative (0).
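A commonly used definition of specific agreement for a pair of raters is positive agreement = 2a / (2a + b + c) and negative agreement = 2d / (2d + b + c), where a and d are the concordant 1+ and 0 counts and b + c is the number of discordant cases. Whether this report used exactly these formulas is an assumption; the sketch below, with a hypothetical function name and toy counts, only illustrates the calculation.

```python
def specific_agreement(both_low, both_negative, discordant):
    """Positive, negative, and overall agreement for one rater pair.

    both_low      : cases both raters scored 1+ (a)
    both_negative : cases both raters scored 0  (d)
    discordant    : cases where the raters disagreed (b + c)
    """
    total = both_low + both_negative + discordant
    if total == 0:
        raise ValueError("no cases to compare")
    pos_den = 2 * both_low + discordant
    neg_den = 2 * both_negative + discordant
    return {
        "positive": 2 * both_low / pos_den if pos_den else None,
        "negative": 2 * both_negative / neg_den if neg_den else None,
        "overall": (both_low + both_negative) / total,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study data.
    print(specific_agreement(both_low=30, both_negative=50, discordant=20))
```

Unlike kappa, these rates are defined even when one category dominates, which makes them a useful companion metric for the 0 vs 1+ comparison.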
HER2-Low Transitions by Biopsy Type
Does specimen type affect reclassification rate?
| Biopsy Type | Gained HER2-Low (0 → 1+) | Lost HER2-Low (1+ → 0) | Net Change |
|---|---|---|---|
| Excision | 14 | 9 | 5 |
| Tru-cut | 10 | 1 | 9 |
T-DXd Eligibility Impact Assessment
Clinical and economic implications of HER2-low reclassification
| Impact Metric | Observed Value | Clinical Significance |
|---|---|---|
| Assessments gaining T-DXd eligibility (0 → 1+) | 24 | Expanded treatment access |
| Assessments losing T-DXd eligibility (1+ → 0) | 10 | Restricted treatment access |
| Net change in T-DXd eligible assessments | 14 | Net increase in eligible patients |
| Percentage of total assessments affected | 3.2% | Moderate reclassification rate |
| Estimated annual cost impact per case (T-DXd) | $180,000 | High cost per patient-year |
Based on these findings:
1. Quality Assurance for HER2-Low
- AI modestly affects HER2-low vs HER2-negative distinction
- 34 assessments reclassified across the 0 / 1+ boundary (3.2%)
- Recommendation: Manual review of borderline cases (very faint vs no staining)
2. Mandatory Confirmation
- Any AI-suggested change from 0 → 1+ or 1+ → 0 should trigger pathologist review
- Consider consensus scoring for T-DXd eligibility decisions
- Rationale: High cost of T-DXd (~$180K/year) justifies careful assessment
3. FISH Consideration
- HER2 2+ cases still require FISH confirmation
- AI does not eliminate need for reflex FISH testing
- No change to current FISH workflow
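The review and FISH rules in recommendations 2 and 3 could be encoded as a simple check on each pre/post score pair. This is a hypothetical sketch: the function name `follow_up_actions` and the action labels are invented for illustration and do not describe an existing workflow or system.

```python
def follow_up_actions(pre_score, post_score):
    """Suggest follow-up steps for one assessment (hypothetical rule set).

    pre_score / post_score: IHC scores (0-3) before and after AI assistance.
    """
    actions = []
    # A change that crosses the 0 / 1+ boundary alters T-DXd eligibility.
    if {pre_score, post_score} == {0, 1}:
        actions.append("pathologist review")
    # A final score of 2+ still needs FISH/ISH confirmation.
    if post_score == 2:
        actions.append("reflex FISH/ISH")
    return actions


if __name__ == "__main__":
    for pre, post in [(0, 1), (1, 0), (1, 1), (1, 2), (3, 3)]:
        print(pre, "->", post, follow_up_actions(pre, post))
```

Keeping the rule to the 0 / 1+ boundary limits the review workload to the transitions that change eligibility.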
Confusion matrices with precision, recall, and F1 scores are standard reporting metrics in comparable studies (Wu et al. 2023; Krishnamurthy et al. 2024). Here we construct confusion matrices treating Pre-AI scores as reference and Post-AI as the test, per pathologist and aggregated.
Confusion Matrix: HER2 0 vs 1+ (Pre-AI → Post-AI)
Pre-AI as reference, Post-AI as prediction
| Reference (Pre-AI) | Post-AI: HER2-Negative (0) | Post-AI: HER2-Low (1+) | Post-AI: HER2-Positive (2+/3+) |
|---|---|---|---|
| HER2-Negative (0) | 302 | 24 | 0 |
| HER2-Low (1+) | 10 | 402 | 0 |
| HER2-Positive (2+/3+) | 0 | 0 | 0 |
Precision, Recall, and F1: HER2 0 vs 1+ Classification
Pre-AI as reference standard
| Category | TP | FP | FN | Precision | Recall | F1¹ | Accuracy |
|---|---|---|---|---|---|---|---|
| HER2-Negative (0) | 302 | 10 | 24 | 0.968 | 0.926 | 0.947 | 0.954 |
| HER2-Low (1+) | 402 | 24 | 10 | 0.944 | 0.976 | 0.959 | 0.954 |
| HER2-Positive (2+/3+) | 0 | 0 | 0 | NA | NA | NA | 0.954 |
¹ Wu et al. (2023): HER2 F1 improved from 0.78 to 0.93 with AI; Krishnamurthy et al. (2024): agreement 69.7% → 77.2%.
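The per-category metrics follow directly from the 2×2 part of the confusion matrix: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is their harmonic mean. The sketch below recomputes them from the counts reported above; the function name and output layout are illustrative assumptions, not the original analysis code.

```python
LABELS = ["HER2-Negative (0)", "HER2-Low (1+)"]
# Rows = Pre-AI reference, columns = Post-AI prediction (counts from the report).
CONFUSION = [
    [302, 24],
    [10, 402],
]


def per_class_metrics(confusion, labels):
    """Return per-label TP/FP/FN, precision, recall, F1, and overall accuracy."""
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n)) / total
    out = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, reference differs
        fn = sum(confusion[k]) - tp                       # reference k, predicted differs
        precision = tp / (tp + fp) if tp + fp else None
        recall = tp / (tp + fn) if tp + fn else None
        if precision and recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = None
        out[label] = {
            "tp": tp, "fp": fp, "fn": fn,
            "precision": None if precision is None else round(precision, 3),
            "recall": None if recall is None else round(recall, 3),
            "f1": None if f1 is None else round(f1, 3),
            "accuracy": round(accuracy, 3),
        }
    return out


if __name__ == "__main__":
    for label, m in per_class_metrics(CONFUSION, LABELS).items():
        print(label, m)
```

Because Pre-AI scores serve as the reference, these values measure how closely Post-AI scores reproduce Pre-AI scores, not diagnostic accuracy against a ground truth.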
Per-Pathologist Precision/Recall/F1 for HER2 0 vs 1+
Pre-AI as reference, Post-AI as prediction
| Pathologist | Category | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|
| Pathologist 1 | HER2-Negative (0) | 0.986 | 0.864 | 0.921 | 0.937 |
| Pathologist 1 | HER2-Low (1+) | 0.908 | 0.991 | 0.947 | 0.937 |
| Pathologist 2 | HER2-Negative (0) | 0.866 | 0.892 | 0.879 | 0.921 |
| Pathologist 2 | HER2-Low (1+) | 0.948 | 0.934 | 0.941 | 0.921 |
| Pathologist 3 | HER2-Negative (0) | 1.000 | 0.952 | 0.975 | 0.966 |
| Pathologist 3 | HER2-Low (1+) | 0.898 | 1.000 | 0.946 | 0.966 |
| Pathologist 4 | HER2-Negative (0) | 1.000 | 1.000 | 1.000 | 1.000 |
| Pathologist 4 | HER2-Low (1+) | 1.000 | 1.000 | 1.000 | 1.000 |

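| Study | Method | Kappa (0 vs 1+) | Agreement Rate |
|---|---|---|---|
| Tarantino et al. (2020) | Manual IHC | 0.47 | 68% |
| Denkert et al. (2021) | Manual IHC | 0.52 | 73% |
| Fernandez et al. (2022) | Manual IHC | 0.51 | 70% |
| Our Study (Pre-AI) | Manual IHC | Not available (NaN) | Not available (NaN) |
| Our Study (Post-AI) | AI-assisted | Not available (NaN) | Not available (NaN) |
Interpretation: Published studies report moderate agreement for the HER2-low distinction (kappa 0.47 to 0.52). The corresponding values for this study were not computed (NaN), so neither a direct comparison nor the effect of AI assistance on this metric can be stated here. The specific agreement analysis reports overall agreement of 63.6% pre-AI and 69.7% post-AI for the same 0 vs 1+ distinction.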
Tarantino P, et al. HER2-Low Breast Cancer: Pathological and Clinical Landscape. J Clin Oncol. 2020;38(17):1951-1962.
Denkert C, et al. Clinical and molecular characteristics of HER2-low-positive breast cancer: pooled analysis of individual patient data from four prospective, neoadjuvant clinical trials. Lancet Oncol. 2021;22(8):1151-1161.
Modi S, et al. Trastuzumab Deruxtecan in Previously Treated HER2-Low Advanced Breast Cancer. N Engl J Med. 2022;387(1):9-20.
Fernandez AI, et al. Examination of Low ERBB2 Protein Expression in Breast Cancer Tissue. JAMA Oncol. 2022;8(4):1-4.
Cardoso F, et al. 5th ESO-ESMO international consensus guidelines for advanced breast cancer (ABC 5). Ann Oncol. 2020;31(12):1623-1649.
Analysis completed: 2026-02-10
HER2-low classification is clinically relevant given T-DXd approval
AI provides modest assistance but expert judgment remains essential