12  Ki-67 Threshold Analysis: Clinical Impact of AI-Induced Changes

12.1 Objective

Ki-67 proliferation index is a critical biomarker for breast cancer classification and treatment decisions, with clinical thresholds at:
- 10%: Distinguishing very low proliferation
- 20%: Luminal A vs Luminal B boundary (primary clinical decision point)
- 30%: High proliferation cutoff (chemotherapy consideration)

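Preliminary analysis shows that AI assistance systematically increases Ki-67 values by about 5.9 percentage points on average, with mean increases of roughly 7 to 8 points for cases starting between 10% and 50%, the span that contains both the 20% and 30% cutoffs. This analysis quantifies: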
1. How many cases cross clinical thresholds due to AI?
2. What is the magnitude and direction of threshold crossings?
3. How does this affect molecular subtype classification?
4. What are the treatment implications?

Note for Pathologist: Ki-67 is the most variable marker, and AI tends to “count higher” than humans. The critical question is whether this extra counting pushes patients over the 20% or 30% threshold unnecessarily, leading to potential overtreatment with chemotherapy, or whether it correctly identifies “hot spots” that were missed.

12.2 Setup

12.3 Load Data

Total cases with paired Ki-67 data: 294 
Mean Ki-67 change: 5.86 %
Median Ki-67 change: 5 %

12.4 Threshold Crossing Analysis

Define clinical thresholds and identify cases that cross them.

Ki-67 Threshold Crossing Analysis
Cases crossing clinical thresholds due to AI assistance

| Threshold (%) | Crossing Direction | N Cases | Proportion |
|---|---|---|---|
| 10 | Downward (Pre≥Thresh → Post<Thresh) | 1 | 0.3% |
| 10 | Upward (Pre<Thresh → Post≥Thresh) | 28 | 9.5% |
| 20 | Upward (Pre<Thresh → Post≥Thresh) | 41 | 13.9% |
| 30 | Downward (Pre≥Thresh → Post<Thresh) | 1 | 0.3% |
| 30 | Upward (Pre<Thresh → Post≥Thresh) | 32 | 10.9% |

Note for Pathologist: The threshold crossing table shows how many cases moved across the critical Ki-67 cutoffs (10%, 20%, 30%) after AI assistance. An “Upward” crossing means the value went from below the cutoff to at or above it. This matters clinically because crossing the 20% or 30% threshold can change a patient’s molecular subtype classification and, consequently, their treatment plan.
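The crossing definitions above can be sketched in a few lines. This is a minimal illustration, not the code behind the table: the function name `crossing_pattern` and the example pairs are hypothetical, and only the rule (below the cutoff means `< threshold`, at or above means `>= threshold`) comes from the table.

```python
THRESHOLDS = (10, 20, 30)  # clinical cutoffs from Section 12.1, in percent


def crossing_pattern(pre: float, post: float, threshold: float) -> str:
    """Label one case's movement relative to a single cutoff."""
    was_above = pre >= threshold
    is_above = post >= threshold
    if was_above and is_above:
        return "Stable Above"
    if not was_above and not is_above:
        return "Stable Below"
    return "Upward" if is_above else "Downward"


# Hypothetical (pre, post) Ki-67 pairs for illustration; not study data.
example_cases = [(18.0, 22.5), (31.0, 28.0), (8.0, 9.5), (19.0, 31.0)]
for pre, post in example_cases:
    labels = {t: crossing_pattern(pre, post, t) for t in THRESHOLDS}
    print(pre, post, labels)
```

The last example pair crosses both the 20% and the 30% cutoff in a single step, which is why a case can be counted under more than one threshold.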

12.4.1 Detailed Crossing Patterns by Threshold

Threshold Crossing Patterns by Ki-67 Threshold
Movement patterns relative to clinical cutoffs

| Threshold (%) | Crossing Pattern | N Cases | Mean Δ Ki-67 (%) | Median Δ Ki-67 (%) | Mean Crossing Magnitude (%) |
|---|---|---|---|---|---|
| 10 | Stable Above | 221 | 6.78 | 6.50 | |
| 10 | Stable Below | 44 | 1.61 | 1.50 | |
| 10 | Below → Above (Upward) | 28 | 5.50 | 5.38 | 5.50 |
| 10 | Above → Below (Downward) | 1 | −1.25 | −1.25 | 1.25 |
| 20 | Stable Above | 142 | 6.69 | 6.00 | |
| 20 | Stable Below | 111 | 2.99 | 2.75 | |
| 20 | Below → Above (Upward) | 41 | 10.72 | 10.00 | 10.72 |
| 30 | Stable Below | 165 | 4.32 | 3.50 | |
| 30 | Stable Above | 96 | 6.57 | 5.88 | |
| 30 | Below → Above (Upward) | 32 | 11.95 | 13.12 | 11.95 |
| 30 | Above → Below (Downward) | 1 | −3.00 | −3.00 | 3.00 |

12.5 Magnitude and Direction of Threshold Crossings

Waterfall plot showing all cases and highlighting threshold crossers.


Threshold Crossing Summary:
Cases crossing 20% threshold: 41
Cases crossing 30% threshold: 33
Cases crossing both: 7
Total cases crossing the 20% or 30% threshold: 67

Note for Pathologist: The waterfall plot sorts every case by the amount Ki-67 changed. Cases highlighted in red or orange crossed a clinical threshold. If the plot is heavily right-shifted (more positive changes), AI systematically increases Ki-67. The threshold-crossing cases are the ones most likely to have their treatment plan altered.
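The summary total follows from inclusion-exclusion on the 20% and 30% groups: cases that cross both cutoffs are subtracted once so they are not double-counted. The short check below uses only the counts reported above. The variable names are illustrative, and the observation that the 10% crossers are not part of this total is an inference from the arithmetic, not something the output states.

```python
TOTAL_CASES = 294   # paired cases, from Section 12.3

crossed_20 = 41     # cases crossing the 20% threshold
crossed_30 = 33     # cases crossing the 30% threshold (32 upward + 1 downward)
crossed_both = 7    # cases counted in both groups

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
crossed_either = crossed_20 + crossed_30 - crossed_both
share = 100 * crossed_either / TOTAL_CASES
print(crossed_either, round(share, 1))  # prints: 67 22.8
```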

12.6 Range-Specific Bias Analysis

Analyze whether AI bias varies by Ki-67 range (Bland-Altman by range).

Ki-67 Bias by Pre-AI Range
Does AI systematically shift values more in certain ranges?

| Ki-67 Range | N Cases | Mean Pre (%) | Mean Post (%) | Mean Bias (%) | SD Bias | Median Bias (%) | IQR Bias |
|---|---|---|---|---|---|---|---|
| Very Low (<10%) | 72 | 5.39 | 8.51 | 3.12 | 2.98 | 2.75 | 3.31 |
| Low (10-20%) | 80 | 14.18 | 21.02 | 6.84 | 5.12 | 6.88 | 6.81 |
| Moderate (20-30%) [Critical]¹ | 45 | 24.32 | 31.50 | 7.18 | 5.77 | 7.25 | 8.75 |
| High (30-50%) | 57 | 37.61 | 45.65 | 8.04 | 5.72 | 7.50 | 7.50 |
| Very High (≥50%) | 40 | 66.55 | 70.78 | 4.23 | 6.54 | 4.00 | 6.06 |

¹ Critical range: Luminal A/B boundary (20%) and high proliferation cutoff (30%)
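As a consistency check, the case-weighted average of the range-specific biases should reproduce the overall mean change reported in Section 12.3. The sketch below copies N and mean bias from the table above; the variable names are illustrative.

```python
# (range label, N cases, mean bias in percentage points), copied from the table
ranges = [
    ("Very Low (<10%)", 72, 3.12),
    ("Low (10-20%)", 80, 6.84),
    ("Moderate (20-30%)", 45, 7.18),
    ("High (30-50%)", 57, 8.04),
    ("Very High (>=50%)", 40, 4.23),
]

total_n = sum(n for _, n, _ in ranges)
# Weight each range's mean bias by its number of cases.
overall_bias = sum(n * bias for _, n, bias in ranges) / total_n
largest = max(ranges, key=lambda row: row[2])

print(total_n, round(overall_bias, 2), largest[0])  # prints: 294 5.86 High (30-50%)
```

The weighted mean matches the reported 5.86, and the largest range-specific mean bias in this table sits in the High (30-50%) range, slightly above the Moderate (20-30%) range.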

12.6.1 Visualization: Bias by Range


Statistical Test: Is bias proportional to Ki-67 level?
Correlation (Pre Ki-67 vs Change): 0.004 
P-value: 0.948 
→ No significant proportional bias
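A near-zero correlation rules out a linear trend only. The range table shows bias rising through the middle ranges and falling again above 50%, and a pattern of that shape can produce a correlation close to zero. The sketch below illustrates this with a hand-written Pearson coefficient on synthetic values; the data are made up for illustration and are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Synthetic inverted-U pattern: the change peaks mid-range and falls at both ends.
pre_ki67 = [10, 20, 30, 40, 50]
change = [3, 6, 7, 6, 3]
print(round(pearson_r(pre_ki67, change), 3))
```

Here the change clearly depends on the starting value, yet the linear correlation is zero, so the test above is best read alongside the range-specific table and not as a replacement for it.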

12.7 Passing-Bablok Regression: Systematic Bias Detection

Passing-Bablok regression is a non-parametric method for detecting proportional and constant bias.
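In this method a slope different from 1 points to proportional bias and an intercept different from 0 points to constant bias. The following is a minimal sketch of the point estimates only, assuming positively correlated measurements. It omits the confidence intervals and the linearity check that a full analysis would report, and the function name `passing_bablok` and the example values are hypothetical.

```python
from statistics import median


def passing_bablok(x, y):
    """Return (intercept, slope) point estimates; no confidence intervals."""
    slopes = []
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            if dx == 0:
                s = float("inf") if dy > 0 else float("-inf")
            else:
                s = dy / dx
            if s == -1:
                continue  # slopes of exactly -1 are excluded by the method
            slopes.append(s)
    slopes.sort()
    m = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset that shifts the median
    if m % 2 == 1:
        slope = slopes[(m - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[m // 2 - 1 + k] + slopes[m // 2 + k])
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope


# Synthetic example: a constant shift of +6 points with no proportional component.
pre = [5.0, 10.0, 20.0, 30.0, 50.0]
post = [11.0, 16.0, 26.0, 36.0, 56.0]
print(passing_bablok(pre, post))
```

For a purely constant shift like this one, the slope is 1 and the intercept equals the shift.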

12.8 Threshold Position Effect on Agreement

Do cases near thresholds have higher disagreement?

Agreement and Bias by Threshold Proximity
Do cases near clinical thresholds show different patterns?

| Proximity to Threshold | N Cases | Mean Pre-AI SD (%) | Mean Post-AI SD (%) | Mean Δ Ki-67 (%) | SD of Change | Δ Disagreement (Post-Pre)¹ |
|---|---|---|---|---|---|---|
| Away from thresholds | 191 | 3.53 | 4.06 | 5.02 | 5.30 | 0.53 |
| Borderline 20% (15-25%) | 58 | 3.74 | 3.78 | 7.01 | 5.34 | 0.04 |
| Borderline 30% (25-35%) | 45 | 4.67 | 5.02 | 7.94 | 5.61 | 0.35 |

¹ Negative change = AI reduces disagreement; Positive = AI increases disagreement
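The grouping and the Δ Disagreement column can be sketched as follows. The two borderline windows share the value 25%, and the table does not say how that boundary is handled. The rule below (the 30% window takes precedence, lower bounds inclusive, upper bounds exclusive) is an assumption, as is the function name `proximity_group`. The SD values are copied from the table.

```python
def proximity_group(pre: float) -> str:
    """Assign a pre-AI Ki-67 value (percent) to a proximity group."""
    # Assumption: a value of exactly 25 goes to the 30% window.
    if 25 <= pre < 35:
        return "Borderline 30% (25-35%)"
    if 15 <= pre < 25:
        return "Borderline 20% (15-25%)"
    return "Away from thresholds"


# (mean pre-AI SD, mean post-AI SD) per group, copied from the table above
sd_by_group = {
    "Away from thresholds": (3.53, 4.06),
    "Borderline 20% (15-25%)": (3.74, 3.78),
    "Borderline 30% (25-35%)": (4.67, 5.02),
}

# Delta disagreement is post-AI SD minus pre-AI SD; positive means more disagreement.
delta = {g: round(post - pre, 2) for g, (pre, post) in sd_by_group.items()}
print(delta)
```

All three differences are positive, so in this table the inter-observer SD is slightly higher after AI assistance in every group.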

12.9 Sensitivity Analysis: Alternative Threshold at 25%

What if the Luminal A/B cutoff were 25% instead of the 30% used in some guidelines? (For the primary 20% cutoff, see the crossing counts in Section 12.4.)

Reclassification with Alternative 25% Threshold
Comparison: Standard 30% vs Alternative 25% cutoff

| Reclassification Pattern | N Cases | Proportion |
|---|---|---|
| Luminal A → Luminal B | 38 | 12.9% |
| Luminal B → Luminal A | 1 | 0.3% |
| No change | 255 | 86.7% |

Comparison:
Reclassified cases with 30% threshold: 33 
Reclassified cases with 25% threshold: 39 
Difference: 6 
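A sensitivity analysis of this kind repeats the same count for each candidate cutoff. The sketch below shows one way to do that. The function name `reclassification_counts` and the example pairs are hypothetical, and the labels assume, as the table does, that values below the cutoff are Luminal A and values at or above it are Luminal B.

```python
def reclassification_counts(pairs, cutoff):
    """Count cases moving across `cutoff` in each direction."""
    a_to_b = sum(1 for pre, post in pairs if pre < cutoff <= post)
    b_to_a = sum(1 for pre, post in pairs if post < cutoff <= pre)
    return {
        "A_to_B": a_to_b,
        "B_to_A": b_to_a,
        "no_change": len(pairs) - a_to_b - b_to_a,
    }


# Hypothetical (pre, post) Ki-67 pairs for illustration; not study data.
pairs = [(18, 24), (22, 27), (26, 31), (28, 24), (12, 14)]
for cutoff in (20, 25, 30):
    print(cutoff, reclassification_counts(pairs, cutoff))
```

In the reported data the same tally gives 38 + 1 = 39 reclassified cases at 25%, against 32 + 1 = 33 at 30%.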

12.10 Molecular Subtype Reclassification Due to Ki-67 Changes

Link Ki-67 threshold crossings to molecular subtype changes.

Molecular Subtype Changes by Ki-67 Threshold Crossing
Association between threshold crossing and subtype reclassification

| Threshold Crossing | Subtype Stable | Subtype Changed | Total | % Changed |
|---|---|---|---|---|
| No crossing | 805 | 81 | 886 | 9.1 |
| Crossed 30% only | 29 | 80 | 109 | 73.4 |
| Crossed 20% only | 113 | 6 | 119 | 5.0 |
| Crossed both | 15 | 33 | 48 | 68.8 |
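The percentage column is the changed count divided by the row total. The check below recomputes it from the counts in the table. It also sums the rows: the total of 1,162 is larger than the 294 paired cases, so the rows presumably count individual reader assessments and not cases. That reading is an assumption, since the table does not state its unit.

```python
# (subtype stable, subtype changed) per crossing group, copied from the table above
rows = {
    "No crossing": (805, 81),
    "Crossed 30% only": (29, 80),
    "Crossed 20% only": (113, 6),
    "Crossed both": (15, 33),
}

pct_changed = {}
for label, (stable, changed) in rows.items():
    total = stable + changed
    pct_changed[label] = round(100 * changed / total, 1)
    print(label, total, pct_changed[label])

grand_total = sum(stable + changed for stable, changed in rows.values())
print("All rows:", grand_total)
```

Subtype changes concentrate in the groups that crossed the 30% cutoff, while crossing only the 20% cutoff changes the subtype in about 5% of rows.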

12.11 Clinical Decision Tree: Ki-67 → Subtype → Treatment

Visualize the clinical pathway from Ki-67 value to treatment decision.

Clinical Decision Tree: Ki-67 → Molecular Subtype → Treatment
Step 1: Ki-67 Assessment
  ├─ Ki-67 < 20%:
  │   ├─ If ER+ PR+ HER2-: Luminal A
  │   │   └─ Treatment: Endocrine therapy alone
  │   └─ If ER+ PR- HER2-: Luminal A (weak PR)
  │       └─ Treatment: Endocrine therapy ± chemotherapy (consider PR status)
  └─ Ki-67 ≥ 20% (or ≥ 30% in some guidelines):
      ├─ If ER+ HER2-: Luminal B
      │   └─ Treatment: Endocrine therapy + chemotherapy
      └─ Higher proliferation suggests more aggressive disease
Impact of AI-Induced Ki-67 Increase:
Treatment Decision Impact of AI-Induced Ki-67 Changes
Assuming ER+ HER2- disease and 20% Ki-67 threshold

| Treatment Decision Change | N Cases | % of Cases |
|---|---|---|
| Endocrine only → Endocrine + Chemo | 41 | 13.9 |
| No treatment change | 253 | 86.1 |

Key Clinical Implication:
→ AI predominantly causes treatment ESCALATION (adding chemotherapy)
  Risk: Potential overtreatment if AI systematically overestimates Ki-67
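The decision tree above can be transcribed as a small function to show how a Ki-67 shift propagates to the treatment label. This is an illustration of the simplified tree only. The function name and signature are hypothetical, the default cutoff of 20% follows the table's stated assumption, and receptor profiles the tree does not cover return `None`.

```python
from typing import Optional, Tuple


def subtype_and_treatment(
    ki67: float, er_pos: bool, pr_pos: bool, her2_pos: bool, cutoff: float = 20.0
) -> Optional[Tuple[str, str]]:
    """Map one case to (subtype, treatment) following the simplified tree."""
    if not er_pos or her2_pos:
        return None  # the tree only covers ER+ HER2- disease
    if ki67 >= cutoff:
        return ("Luminal B", "Endocrine therapy + chemotherapy")
    if pr_pos:
        return ("Luminal A", "Endocrine therapy alone")
    return ("Luminal A (weak PR)", "Endocrine therapy ± chemotherapy")


# Hypothetical ER+ PR+ HER2- case read as 18% before AI and 22% with AI.
before = subtype_and_treatment(18.0, er_pos=True, pr_pos=True, her2_pos=False)
after = subtype_and_treatment(22.0, er_pos=True, pr_pos=True, her2_pos=False)
print(before, "->", after)
```

Passing `cutoff=30.0` reproduces the alternative guideline mentioned in the tree, under which the same hypothetical case would stay Luminal A.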

12.12 Scatter Plots: Pre vs Post Ki-67 with Thresholds

Visual representation of individual case movements across thresholds.

12.13 Summary and Clinical Recommendations

===== KI-67 THRESHOLD ANALYSIS SUMMARY =====
1. SYSTEMATIC BIAS:
   Mean Ki-67 increase: 5.86 %
   Median Ki-67 increase: 5 %
   → AI systematically INCREASES Ki-67 estimates
2. THRESHOLD CROSSINGS:
   Cases crossing 20% threshold: 41 ( 13.9 %)
   Cases crossing 30% threshold: 33 ( 11.2 %)
3. RANGE-SPECIFIC EFFECTS:
   Mean bias in 20-30% range: 7.18 %
   → High impact in the Luminal A/B boundary zone, second only to the 30-50% range (8.04 %)
4. CLINICAL IMPLICATIONS:
   a) Molecular subtype classification:
      - Luminal A vs B distinction may be affected
      - Consider Ki-67 trend in context of other markers
   b) Treatment decisions:
      - Potential chemotherapy escalation in borderline cases
      - Exercise caution near 20% and 30% cutoffs
   c) Quality assurance:
      - Regular calibration checks for AI-assisted Ki-67
      - Human oversight especially for threshold-proximate cases
5. RECOMMENDATIONS:
   ✓ Use AI as decision support, not replacement
   ✓ Review all cases within ±5% of clinical thresholds manually
   ✓ Consider percentage point ranges (e.g., 18-22%) instead of hard cutoffs
   ✓ Integrate clinical context (tumor grade, stage, patient factors)
   ✓ Monitor AI performance with periodic consensus reviews
   ✓ Document when AI recommendations are overridden and reasons

12.14 Conclusion

This threshold analysis shows that AI-assisted Ki-67 assessment systematically increases proliferation estimates, with mean increases of about 7 to 8 percentage points for cases starting between 10% and 50%, a span that contains both the 20% and 30% cutoffs. Inter-observer disagreement did not decrease in the threshold-proximity analysis (Section 12.8), and the upward shift has significant implications for:

  1. Molecular subtype classification: Cases may shift from Luminal A to Luminal B
  2. Treatment decisions: Potential escalation to chemotherapy for borderline cases
  3. Quality assurance: Need for human oversight near clinical thresholds

The systematic upward bias suggests that AI algorithms may be calibrated to favor sensitivity over specificity in detecting high proliferation, or that the bias reflects training data characteristics. Clinicians should exercise particular caution when AI-derived Ki-67 values fall within ±5 percentage points of the 20% and 30% thresholds, and should integrate additional clinical and pathologic features before making treatment decisions.