ESTRO 2026 - Abstract Book PART II

S1563

Physics - Autosegmentation

Material/Methods: Planning CTs from a single institute (861 for training [1], 100 for temporal validation) were used to train six 3D models to segment the right and left breast CTVs simultaneously: UNet, SegResNetDS, DynUNet (MONAI v1.3, optimized with Optuna), nnU-Net (TotalSegmentator) and MedSAM2 (MS-A: CT-specific weights; MS-B: general medical-image weights). Unlike the other models, MedSAM2 required an a priori bounding-box prompt. Predictions were compared with the ground truth using standard metrics: Dice Similarity Coefficient (DSC), Average Surface Distance (ASD) and Hausdorff Distance (HD). A Friedman test followed by post hoc pairwise Conover comparisons was conducted on the temporal validation set. The resulting best models were used to build model-based probability maps and to quantify the differences between the higher (100%) and lower (25%) concordance iso-probabilities with respect to the clinician's CTV.

Results: All models achieved satisfactory overall performance, in line with the inter-observer variability (IOV) derived in [2] (DSC = 0.90). Among them, UNet, DynUNet, nnU-Net and MedSAM2-A (MS-A) showed the highest, mutually equivalent accuracy, with average ASD = 1.5 mm and HD95 = 3.8 mm. The Conover analysis revealed lower performance for two of the six models (SegResNetDS and MS-B), which were excluded from probability map construction (see Fig. 1, where the different metrics are compared). Analysis of the 100% versus 25% iso-probability volumes on the temporal validation dataset (mean difference = 123 ± 81 cm³) highlighted greater uncertainty at the lateral and cranio-caudal CTV borders. An example breast cancer (BC) patient is shown in Fig. 2. The ECDF (empirical cumulative distribution function) computed on the residuals of the clinical contours (averaged over the different clinicians) lying outside the 25-100% iso-probability band reached a cumulative frequency of 0.9 at a residual value of 14.2 ± 2.5% on average (residual sum = under-segmentation + over-segmentation residual).
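The residual analysis described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the residuals in detail, so the definitions used here (under-segmentation = voxels of the 100% volume missing from the clinical contour, over-segmentation = clinical voxels outside the 25% volume, both normalised by the clinical volume) are assumptions, as are the set-of-voxel-indices representation and all function names.

```python
from bisect import bisect_right


def residual_fraction(clinician, iso100, iso25):
    """Fraction of a clinician contour lying outside the 25-100% band.

    All arguments are sets of voxel indices (hypothetical representation).
    Assumed definitions, not taken from the abstract:
      under = voxels of the 100% volume that the clinician left out
      over  = clinician voxels outside the 25% volume
    The sum is normalised by the clinician volume.
    """
    if not clinician:
        raise ValueError("clinician contour is empty")
    under = len(iso100 - clinician)
    over = len(clinician - iso25)
    return (under + over) / len(clinician)


def ecdf(samples):
    """Return F, where F(x) is the fraction of samples <= x."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("need at least one sample")
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def residual_at_frequency(samples, q):
    """Smallest observed residual r whose ECDF value is at least q."""
    ordered = sorted(samples)
    cdf = ecdf(ordered)
    for r in ordered:
        if cdf(r) >= q:
            return r
    raise ValueError("q must be in (0, 1]")
```

Under these assumptions, calling residual_at_frequency(per_patient_residuals, 0.9) would return the residual value at which the cumulative frequency reaches 0.9, which is the kind of quantity reported in the Results.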

with corrections more often, which is reflected in the increasing precision.

Conclusion: Smoothing uncertainty maps followed by thresholding increases spatial coherency and correction overlap, while reducing the number of uncertain areas, thus easing expert contour adaptation. This technique leads to a more interpretable and efficient visualization of predictive uncertainty for clinical use.

References:
1. Gal, Y. et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. Proceedings of The 33rd International Conference on Machine Learning (2016)
2. Kamnitsas, K. et al. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical Image Analysis 36 (2017)
3. Maruccio, F.C. et al. Leveraging network uncertainty to identify regions in rectal cancer clinical target volume auto-segmentations likely requiring manual edits. Physics and Imaging in Radiation Oncology 34 (2025)
4. Zou, K. et al. A review of uncertainty estimation and its application in medical imaging. Meta-Radiology 1(1) (2023)

Keywords: Uncertainty, Adaptive Radiotherapy

Digital Poster

2143

Optimizing deep ensemble models for probabilistic CTV breast segmentation

Cecilia Riani 1,2, Maria Giulia Ubeira Gabellini 1, Gabriele Palazzo 1, Giuseppe Ricciardi 1, Antonella del Vecchio 1, Alessandra Palma 3, Anna Balsamo 4, Angela Coniglio 4, Claudio Fiorino 1

1 Medical Physics, IRCCS San Raffaele Scientific Institute, Milan, Italy. 2 Physics Department, Radiation Biophysics and Radiobiology Laboratory, Pavia, Italy. 3 Centro Nazionale Intelligenza Artificiale, HTA e Tecno-assistenza, Istituto Superiore di Sanità, Rome, Italy. 4 Department of Human Health, Animal Health and Ecosystem (One Health) and International Relations (DOHRI), Ministry of Health, Rome, Italy

Purpose/Objective: Effective segmentation of organs-at-risk (OARs) and clinical target volume (CTV) is essential.
Supervised deep learning (DL) models can achieve high segmentation accuracy through careful hyperparameter tuning. However, their reliability also hinges on addressing uncertainties from variability in clinical contouring practices. This study systematically evaluates six advanced 3D DL models for automatic CTV segmentation in whole-breast radiotherapy. It further leverages the top-performing models to construct a probability map, aiming to improve consistency and mitigate bias in clinical predictions.
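As an illustration of what a model-based probability map could look like, the following Python sketch computes, for each voxel, the fraction of models whose binary CTV mask contains it, and then extracts iso-probability volumes. The abstract does not specify how the map is built, so this voxel-wise agreement rule is an assumption, as are the set-based mask representation, the voxel spacing argument and every function name.

```python
from collections import Counter


def probability_map(masks):
    """Voxel-wise agreement: fraction of model masks containing each voxel.

    masks: list of sets of (z, y, x) voxel indices, one set per model
    (hypothetical representation chosen for this sketch).
    """
    if not masks:
        raise ValueError("need at least one mask")
    counts = Counter(voxel for mask in masks for voxel in mask)
    n = len(masks)
    return {voxel: c / n for voxel, c in counts.items()}


def iso_volume(prob_map, level):
    """Voxels whose agreement is at least `level` (e.g. 1.0 or 0.25)."""
    eps = 1e-9  # tolerance for floating-point comparison
    return {v for v, p in prob_map.items() if p >= level - eps}


def volume_cm3(voxels, spacing_mm):
    """Volume of a voxel set in cm^3, given (dz, dy, dx) spacing in mm."""
    dz, dy, dx = spacing_mm
    return len(voxels) * dz * dy * dx / 1000.0


# Toy example with four models and four voxels.
a, b, c, d = (0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0)
masks = [{a, b, c}, {a, b}, {a, b, d}, {a}]
pmap = probability_map(masks)
spacing = (3.0, 1.0, 1.0)
difference = (volume_cm3(iso_volume(pmap, 0.25), spacing)
              - volume_cm3(iso_volume(pmap, 1.0), spacing))
```

With four models, the 100% level keeps only voxels on which all models agree, while the 25% level keeps every voxel selected by at least one model; the difference between the two volumes gives a simple measure of how far the models disagree.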
