
Original Research Open Access

Pairwise External Validation of Plasma Biomarker–Based Machine Learning Models for Amyloid PET Prediction: Implications for Calibration and Clinical Utility

  • 1University of Southern California, USA
  • 2Department of Immunology and Immune Therapeutics and Norris Comprehensive Cancer Center, Keck School of Medicine, Los Angeles, California 90033, USA
  • 3Thomas Lord Department of Computer Science, Viterbi School of Engineering, University of Southern California, Los Angeles, California 90089, USA

Corresponding Author

Ebrahim Zandi, zandi@usc.edu

Received Date: April 21, 2026

Accepted Date: June 02, 2026

Abstract

Blood-based biomarkers have demonstrated strong performance for identifying cerebral amyloid pathology within individual cohorts. However, their clinical utility depends on portability across populations and assay platforms. The impact of cross-cohort deployment on clinically actionable metrics such as negative predictive value remains insufficiently characterized. We analyzed data from two independent cohorts: the Alzheimer’s Disease Neuroimaging Initiative (n = 885) and the Anti-Amyloid Treatment in Asymptomatic Alzheimer’s Disease study (n = 822). Machine learning models were developed within each cohort to predict amyloid positron emission tomography status and continuous amyloid burden. Performance was evaluated using area under the receiver operating characteristic curve, accuracy, coefficient of determination, and root mean square error. Cross-cohort portability was assessed by pairwise external validation, using bidirectional transfer without retraining. Calibration, predictive values, and decision curve analysis were used to evaluate clinical utility. Within-cohort discrimination was high, with area under the curve reaching 0.917–0.918 in the Alzheimer’s Disease Neuroimaging Initiative and 0.870 in the Anti-Amyloid Treatment in Asymptomatic Alzheimer’s Disease cohort. Prediction of continuous amyloid burden was moderate (coefficient of determination up to 0.628 and 0.535, respectively). Cross-cohort deployment resulted in modest attenuation of discrimination but substantially greater degradation in clinically actionable performance. Negative predictive value declined from 0.831 to 0.644 when models trained in the Alzheimer’s Disease Neuroimaging Initiative were applied to the Anti-Amyloid Treatment in Asymptomatic Alzheimer’s Disease cohort, despite preserved discrimination. Calibration analyses demonstrated systematic probability misestimation, and decision curve analysis showed reduced net clinical benefit. Biomarker distributions differed across cohorts, consistent with dataset shift. Blood-based biomarker models retain discrimination across cohorts but exhibit clinically meaningful degradation in predictive value under real-world deployment conditions. Calibration instability and population differences critically affect rule-out performance. These findings highlight the need for cross-cohort validation, calibration assessment, and assay-consistent biomarker generation prior to clinical implementation.
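
As a brief illustration of why negative predictive value can fall even when discrimination is preserved, the sketch below applies Bayes' rule to a classifier with fixed sensitivity and specificity and varies only the prevalence of amyloid positivity. It also computes the net benefit quantity used in decision curve analysis. All numeric inputs are hypothetical values chosen for illustration, not results from this study, and the function names are assumptions introduced here. Prevalence is only one possible contributor; the calibration drift and biomarker distribution shift described above are not modeled.

```python
def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value from Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)    # P(test-, disease-)
    false_neg = (1.0 - sensitivity) * prevalence   # P(test-, disease+)
    return true_neg / (true_neg + false_neg)


def net_benefit(sensitivity: float, specificity: float,
                prevalence: float, threshold: float) -> float:
    """Net benefit of acting on positive predictions at a given
    probability threshold, as used in decision curve analysis."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos - false_pos * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical operating point, held constant across both settings.
    sens, spec = 0.85, 0.85
    for prev in (0.30, 0.60):  # hypothetical prevalence values
        print(f"prevalence={prev:.2f}  NPV={npv(sens, spec, prev):.3f}  "
              f"net benefit at 0.5={net_benefit(sens, spec, prev, 0.5):.3f}")
```

With sensitivity and specificity unchanged, the higher-prevalence setting yields a lower negative predictive value, which is why rule-out performance needs to be checked in each target population rather than inferred from discrimination alone.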

Keywords

Alzheimer’s disease, Plasma biomarkers, Amyloid PET, Machine learning, Pairwise external validation, Calibration, Negative predictive value
