ESTRO 2026 - Abstract Book PART II

S1560

Physics - Autosegmentation


Proffered Paper 2054
An LLM-driven multimodal delineation system for esophageal cancer
Long Yang, Hongcheng Zhu, Jiazhou Wang, Weigang Hu
Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai, China

Purpose/Objective: To develop a large-language-model (LLM)-driven multimodal delineation system for esophageal cancer radiotherapy that integrates textual clinical reasoning with imaging data to achieve accurate and individualized target volume delineation across multiple centers and countries.

Material/Methods: Planning CTs and corresponding multimodal pretreatment assessments from 1,740 patients across five centers in China were retrospectively collected. Clinical texts from pretreatment assessments (including barium esophagram, magnetic resonance imaging (MRI), and endoscopy) were encoded using an instruction-tuned LLM and fused with imaging representations through a bidirectional cross-attention mechanism. The proposed model was trained and validated on an internal cohort (N = 867) to jointly reason over textual and visual cues for target delineation, and externally validated (N = 871) across four independent centers with diverse scanners and treatment strategies. Its performance was compared with four state-of-the-art vision-only and multimodal models (3D ResUNet, Swin UNETR, Radformer, and LLM-Seg) using Dice similarity coefficient (Dice), 95% Hausdorff distance (HD95), and average surface distance (ASD). To assess clinical utility, experienced radiation oncologists from five countries participated in a reader study, in which manual and AI-assisted delineation times were quantitatively recorded and subjective clinical satisfaction was qualitatively evaluated.

Results: Our model consistently achieved the highest performance across all test centers, demonstrating strong cross-institutional generalization. The internal cohort achieved a Dice of 0.798 (0.781–0.814), with HD95 = 7.37 mm and ASD = 1.60 mm.
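For readers unfamiliar with the three metrics reported here, the following is a minimal sketch of how Dice, HD95, and ASD can be computed from two binary masks. It is not the authors' evaluation code. The function names, the representation of a mask as a set of (x, y, z) voxel indices, the 6-connected surface definition, and the HD95 convention (maximum of the two directed 95th percentiles) are all assumptions made for illustration; the abstract does not state which conventions were used.

```python
from math import dist

# 6-connected neighbourhood used to decide whether a voxel lies on the surface.
OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dice(a, b):
    """Dice similarity coefficient of two voxel sets: 2|A and B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def surface(mask):
    """Voxels of `mask` with at least one 6-connected neighbour outside it."""
    return {
        v for v in mask
        if any((v[0] + dx, v[1] + dy, v[2] + dz) not in mask
               for dx, dy, dz in OFFSETS)
    }


def directed_distances(src, dst, spacing):
    """Distance in mm from each voxel in `src` to its nearest voxel in `dst`."""
    def to_mm(v):
        return tuple(c * s for c, s in zip(v, spacing))
    dst_mm = [to_mm(v) for v in dst]
    return [min(dist(to_mm(v), d) for d in dst_mm) for v in src]


def percentile(values, q):
    """q-th percentile with linear interpolation between sorted samples."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def hd95_and_asd(a, b, spacing=(1.0, 1.0, 1.0)):
    """Return (HD95, ASD) in mm between the surfaces of two non-empty masks."""
    if not a or not b:
        raise ValueError("both masks must be non-empty")
    sa, sb = surface(a), surface(b)
    d_ab = directed_distances(sa, sb, spacing)
    d_ba = directed_distances(sb, sa, spacing)
    # Assumed convention: larger of the two directed 95th percentiles.
    hd95 = max(percentile(d_ab, 95), percentile(d_ba, 95))
    # Assumed convention: mean over both directions pooled together.
    asd = (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))
    return hd95, asd


# Toy example: a 4x4x4 cube and the same cube shifted by one voxel along x.
cube_a = {(x, y, z) for x in range(4) for y in range(4) for z in range(4)}
cube_b = {(x + 1, y, z) for (x, y, z) in cube_a}

print("Dice:", dice(cube_a, cube_b))  # 0.75
print("HD95, ASD (mm):", hd95_and_asd(cube_a, cube_b))
```

The brute-force nearest-neighbour search is quadratic in the number of surface voxels, so this form suits only small illustrative masks; a practical evaluation would typically use distance transforms on the full CT grid with the scanner's true voxel spacing.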
The mean Dice across external centers was 0.703, exceeding the best vision-only baseline by 9.1%. HD95 and ASD were reduced by 39% and 47%, respectively. In the clinical evaluation, the median delineation time was reduced from 24.3 min (manual) to 6.5 min (AI-assisted) (p < 0.001), with a major modification ratio below 15%. Performance trends were consistent across countries, supporting our model’s robustness and practical deployability.

Conclusion: By incorporating textual clinical reasoning into the target delineation workflow, our model bridges human decision logic with AI precision, substantially improving delineation efficiency, consistency, and clinical usability in real-world radiotherapy practice.

Keywords: Target volume, Large language model, Esophageal

Digital Poster 2116
Continuous Monitoring of AI-Based OAR Segmentation in Head-and-Neck Radiotherapy: Geometric and Dosimetric Validation
Anne Ivalu Sander Holm¹, Maiken Mondrup Hjelt², Emil Rønn¹, Tine Bisballe Nyeng¹, Lise Bech Jellesmark Thorsen¹, Jesper Folsted Kallehauge³, Hanne Primdahl¹, Christian Nicolaj Andreassen¹, Kasper Toustrup¹, Line Meinertz Hybel Schack¹, Jesper Grau Eriksen¹
¹Department of Oncology, Aarhus University Hospital, Aarhus, Denmark. ²Experimental Clinical Oncology, Aarhus University Hospital, Aarhus, Denmark. ³Danish Center for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark

Purpose/Objective: Artificial intelligence (AI)-based segmentation of organs of interest is increasingly adopted in radiation therapy to enhance efficiency and consistency. Ensuring clinical reliability requires continuous performance monitoring across varying patient anatomies and imaging conditions. Both geometric and dosimetric evaluations are critical, as even small contour deviations can cause significant dose discrepancies. This study presents a structured approach to continuous AI performance monitoring for robust quality assurance in clinical practice.

Material/Methods: Since January 2024, AI-assisted organ-at-risk (OAR) segmentation has been implemented for all head-and-neck (HN) radiation therapy patients at a single institution, totaling 288 cases. OAR delineations were generated using a combination of CE-marked and in-house developed AI models. In routine workflows, AI contours were reviewed and corrected by expert radiation oncologists before approval. Both AI-generated and final clinician-adjusted contours were archived and compared using two geometric metrics: Surface Dice at 2 mm (SD2mm) and mean Hausdorff distance (mHD). After one year of deployment, minor post-processing adjustments were introduced based on ongoing monitoring and user feedback. During the subsequent six months (118 cases), geometric and dosimetric assessments were jointly performed. Dosimetric evaluation included key dose–volume histogram (DVH) parameters, such as near-maximum dose for the
