ESTRO 2026 - Abstract Book PART II

S2299

Physics - Machine learning and AI algorithms


Digital Poster 2962

and combined 3D-image and clinical features as input (Figure 1). Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and the expected calibration error (ECE).
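As an illustration of the two metrics named above, the following minimal Python sketch computes AUC from pairwise rankings and ECE from equal-width probability bins. The function names, the choice of ten bins, and the toy data are assumptions made for this example; the abstract does not describe how either metric was implemented.

```python
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outranks a random negative one."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties count as half a win (Mann-Whitney U formulation).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def expected_calibration_error(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> float:
    """Binary ECE: sample-weighted mean of |observed rate - mean predicted probability| per bin."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[idx].append((y, p))
    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += len(members) / n * abs(observed - predicted)
    return ece


if __name__ == "__main__":
    # Toy labels and predicted probabilities (hypothetical, not study data).
    labels = [0, 0, 1, 1]
    probs = [0.12, 0.43, 0.35, 0.81]
    print("AUC:", roc_auc(labels, probs))
    print("ECE:", expected_calibration_error(labels, probs))
```

A high AUC indicates good ranking of cases above non-cases, while a low ECE indicates that predicted probabilities match observed event rates; the two are complementary, which is why both are reported.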

Prediction of lung cancer among the National Lung Screening Trial Program participants with Machine Learning methods

Mayur Munshi 1, Gareth Price 2, Andrew Reilly 1, Gordon Cowell 3

1 Clinical Physics and Bio-engineering, NHS Greater Glasgow and Clyde, Glasgow, United Kingdom. 2 School of Medical Sciences, University of Manchester, Manchester, United Kingdom. 3 Department of Clinical Radiology, NHS Greater Glasgow and Clyde, Glasgow, United Kingdom

Purpose/Objective: Lung cancer is a major healthcare challenge worldwide, with high mortality rates. Early diagnosis is key to improving the survival rate of lung cancer patients. Lung screening trial programs are designed for early intervention but struggle with a large volume of data, a shortage of experts, and high false-positive rates. The primary aim of this study was to determine whether the future incidence of lung cancer could be predicted from low-dose CT (LDCT) screening scans with no visible signs of cancer nodules. The secondary aim was to reduce false positives and false negatives using machine learning methods that utilised LDCT scans and clinical risk features.

Material/Methods: The dataset comprised anonymised LDCT scans and fourteen independent clinical risk features from National Lung Screening Trial (NLST) participants (n=1258). The dataset was divided into three independent cohorts: training (68%), validation (16%), and testing (16%), with an equal distribution of cancer and non-cancer participants, clinically confirmed across eight years of follow-up under NLST conditions. Three types of machine learning (ML) models were developed. The image-based ML model was a hybrid modified convolutional neural network with bidirectional long short-term memory (CNN-BiLSTM), using a sequence of LDCT scans taken two years apart, without any cancer nodule annotations. The clinical risk-based ML model was a voting classifier (VC) that utilised the fourteen clinical risk features, statistically identified as significant contributors to lung cancer prediction. Finally, the weighted average ensemble ML model combined the predictive strengths of both these models by applying an optimal weight to each model's predictions. Each model was evaluated using the area under the receiver operating characteristic curve, and the models were compared against each other on accuracy, precision, F1 score, and ability to predict true cases versus false cases.
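The weighted average ensemble described above can be sketched as follows. The abstract does not say how the optimal weights were determined, so this example assumes a simple grid search over a single weight, scored by the area under the ROC curve on a validation cohort. All function names and the toy probabilities are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_average(
    p_image: Sequence[float], p_clinical: Sequence[float], w_image: float
) -> List[float]:
    """Blend two models' probabilities; the clinical model gets weight 1 - w_image."""
    return [w_image * a + (1.0 - w_image) * b for a, b in zip(p_image, p_clinical)]


def best_weight(
    y_val: Sequence[int],
    p_image_val: Sequence[float],
    p_clinical_val: Sequence[float],
    steps: int = 20,
) -> Tuple[float, float]:
    """Assumed selection rule: grid-search the image weight that maximises validation AUC."""
    best_w, best_auc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        auc = roc_auc(y_val, weighted_average(p_image_val, p_clinical_val, w))
        if auc > best_auc:  # keep the first weight that reaches the best score
            best_w, best_auc = w, auc
    return best_w, best_auc


if __name__ == "__main__":
    # Hypothetical validation labels and per-model probabilities (not study data).
    y_val = [0, 0, 1, 1]
    p_image = [0.2, 0.6, 0.5, 0.9]     # stands in for the CNN-BiLSTM output
    p_clinical = [0.4, 0.1, 0.3, 0.8]  # stands in for the voting classifier output
    w, auc = best_weight(y_val, p_image, p_clinical)
    print("image weight:", w, "validation AUC:", auc)
```

Choosing the weight on the validation cohort and reporting performance on the separate test cohort keeps the weight selection from inflating the reported test metrics.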

Results A total of 652 patients with mid-treatment scans were available, from which 200 were randomly selected as an independent test set. Both models achieved good classification (high AUC) and calibration (low ECE) on the independent test set (Figure 2). The pre-treatment model had an AUC of 0.78 (95% CI [0.74-0.81]) and an ECE of 0.04. The mid-treatment model achieved an improved AUC of 0.82 (95% CI [0.79-0.85]) and a similarly low ECE of 0.05.
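The abstract does not state how the 95% confidence intervals were obtained. One common option, shown here purely as an illustrative sketch, is a percentile bootstrap over the test set. The function names, the number of resamples, and the seed are assumptions for this example.

```python
import random
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point estimate, lower bound, upper bound) using a percentile bootstrap."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(y_true)
    point = roc_auc(y_true, y_prob)
    samples = []
    while len(samples) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample contains only one class
        samples.append(roc_auc(ys, [y_prob[i] for i in idx]))
    samples.sort()
    lower = samples[int((alpha / 2) * (n_boot - 1))]
    upper = samples[int((1 - alpha / 2) * (n_boot - 1))]
    return point, lower, upper


if __name__ == "__main__":
    # Hypothetical labels and probabilities (not study data).
    labels = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]
    probs = [0.1, 0.3, 0.45, 0.4, 0.2, 0.7, 0.9, 0.6, 0.55, 0.8]
    print(bootstrap_auc_ci(labels, probs, n_boot=500))
```

With a test set of 200 patients, intervals of the width reported above are plausible under such a procedure, but other methods (for example DeLong's) would give slightly different bounds.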

Conclusion These findings show the added benefit of incorporating mid-treatment data for improving the predictive performance of NTCP models for late xerostomia in HNC patients. They demonstrate the potential of mid-treatment NTCP quantification, which could further strengthen the role of NTCP modelling in personalized treatment and adaptive radiotherapy.
