ESTRO 2026 - Abstract Book PART I

S1466

Interdisciplinary - Other


Multi-modal machine learning (ML) algorithms are gaining traction in predictive patient tasks because they align with clinical practice, where all available patient data is analyzed. Over 50% [1] of cancer patients receive some form of radiotherapy (RT), yet this data is not reflected in multi-modal clinical frameworks due to data silos, difficulties in querying bulk data, and expertise requirements. Existing RT data tools are typically not optimized for bulk patient analysis, are not open source, or are optimized for workflows more familiar to medical physicists (such as CERR using MATLAB or SlicerRT with its own interface). Meanwhile, most existing clinical ML pipelines are implemented in Python [2], which leaves a gap in tools that can easily integrate RT data into existing ML workflows processing imaging, text, and other tabular data.

Material/Methods: We developed an open source Python package (available at [github.com/anonymized/rt4ml]) to streamline the integration of dose-volume histogram (DVH) data into multi-modal ML pipelines. The package can process individual patients or, importantly, bulk and/or aggregated patient cohorts. rt4ml enables clinical ML pipelines to calculate and include core dose statistics, including custom Dx/Vx, and other planning metrics such as BED and EQD2. Novel contributions of the package include cleaning methods that address common DVH data issues, including resolving and selecting among multiple structures with similar names. The package includes both pre-defined and customizable dictionaries of organs-at-risk (OARs) based on disease sites, which can be used to standardize and select variables of interest across patient cohorts.

Results: A pilot analysis using rt4ml was conducted on a CNS patient cohort to iterate on and test the functionality of the package. Using DVH data exported from a mirror set up to ANONYMIZED’s treatment planning system, rt4ml was able to process bulk DVH data for 4090 radiation courses.
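The dose statistics named in Material/Methods can be illustrated with a minimal sketch. It is not rt4ml's API: the function names, argument names, and toy DVH below are hypothetical. It assumes a cumulative DVH sampled at increasing dose with strictly decreasing percent volume, and it uses the standard linear-quadratic definitions BED = n·d·(1 + d/(α/β)) and EQD2 = BED/(1 + 2/(α/β)), where n is the number of fractions and d the dose per fraction.

```python
import numpy as np

# Hypothetical helper names for illustration only; this is not rt4ml's API.


def dose_at_volume(dose_gy, volume_pct, x_pct):
    """Dx: dose (Gy) received by at least x% of the structure volume."""
    dose = np.asarray(dose_gy, dtype=float)
    vol = np.asarray(volume_pct, dtype=float)
    # A cumulative DVH falls as dose rises, while np.interp expects
    # increasing x-coordinates, so both arrays are reversed.
    return float(np.interp(x_pct, vol[::-1], dose[::-1]))


def volume_at_dose(dose_gy, volume_pct, x_gy):
    """Vx: percent of the structure volume receiving at least x Gy."""
    dose = np.asarray(dose_gy, dtype=float)
    vol = np.asarray(volume_pct, dtype=float)
    return float(np.interp(x_gy, dose, vol))


def bed(total_dose_gy, n_fractions, alpha_beta_gy):
    """Biologically effective dose under the linear-quadratic model."""
    dose_per_fraction = total_dose_gy / n_fractions
    return total_dose_gy * (1.0 + dose_per_fraction / alpha_beta_gy)


def eqd2(total_dose_gy, n_fractions, alpha_beta_gy):
    """Equivalent dose in 2 Gy fractions, derived from BED."""
    return bed(total_dose_gy, n_fractions, alpha_beta_gy) / (1.0 + 2.0 / alpha_beta_gy)


# Toy cumulative DVH (made-up values): dose bins in Gy, percent volume at or above each bin.
toy_dose = [0, 10, 20, 30, 40, 50, 60]
toy_volume = [100, 95, 80, 50, 20, 5, 0]

d50 = dose_at_volume(toy_dose, toy_volume, 50)   # dose covering 50% of the volume
v25 = volume_at_dose(toy_dose, toy_volume, 25)   # percent volume receiving >= 25 Gy
eqd2_hypo = eqd2(40, 15, alpha_beta_gy=10)       # 40 Gy in 15 fractions, alpha/beta = 10 Gy
```

Linear interpolation between DVH bins is a simplifying choice here; DVHs with flat segments (repeated volume values) would need tie handling before Dx is interpolated.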
The package facilitated course filtering and cohort selection, grouping and transformation of row-level dose information into patient-level dose statistics (e.g., D5-D100 in 5 Gy increments), resolution of multiple structures and OARs with similar names (e.g., PTV1 and PTV2), outlier identification, and integration of the results into a clinical dataset for ML prediction of re-irradiation likelihood.

Conclusion: RT data remains underutilized in clinical ML algorithms, indicating an urgent need for tools that integrate DVH data into existing multi-modal ML frameworks. We develop, test, and present rt4ml, an open source Python package that enables bulk pre-processing and integration of DVH data into existing clinical ML pipelines. Future work will support
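The Results steps of resolving similarly named structures and reshaping row-level DVH samples into one feature row per course could look roughly like the sketch below. Everything in it is an assumption made for illustration rather than rt4ml's implementation: the column names, the regex alias dictionary, the rule for choosing between PTV1 and PTV2 (keep the structure with the highest dose at non-zero volume), and the toy data.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical alias dictionary: canonical name -> regex of accepted raw labels.
CNS_PATTERNS = {
    "brainstem": r"^brain[\s_-]?stem$",
    "ptv": r"^ptv[\s_-]?\d*$",
}


def canonical_name(raw_name):
    """Map a raw structure label to a canonical name, or None if unmatched."""
    cleaned = raw_name.strip().lower()
    for canonical, pattern in CNS_PATTERNS.items():
        if re.match(pattern, cleaned):
            return canonical
    return None


def dose_at_volume(dose_gy, volume_pct, x_pct):
    """Dx by linear interpolation on a cumulative DVH (assumes no repeated volumes)."""
    dose = np.asarray(dose_gy, dtype=float)
    vol = np.asarray(volume_pct, dtype=float)
    order = np.argsort(vol)  # np.interp needs increasing x-coordinates
    return float(np.interp(x_pct, vol[order], dose[order]))


def course_level_features(dvh_rows, x_values=(5, 50, 95)):
    """Reshape row-level DVH samples into one row of Dx features per course."""
    rows = dvh_rows.copy()
    rows["canonical"] = rows["structure"].map(canonical_name)
    rows = rows.dropna(subset=["canonical"])  # drop structures not in the dictionary
    records = []
    keys = ["patient_id", "course_id", "canonical"]
    for (patient, course, canon), grp in rows.groupby(keys):
        # Assumed tie-break when several raw names share one canonical name:
        # keep the structure with the highest dose at non-zero volume.
        covered = grp[grp["volume_pct"] > 0]
        chosen_name = covered.groupby("structure")["dose_gy"].max().idxmax()
        chosen = grp[grp["structure"] == chosen_name]
        for x in x_values:
            records.append({
                "patient_id": patient,
                "course_id": course,
                "feature": f"{canon}_D{x}",
                "value": dose_at_volume(chosen["dose_gy"], chosen["volume_pct"], x),
            })
    long_table = pd.DataFrame.from_records(records)
    wide = long_table.pivot(index=["patient_id", "course_id"],
                            columns="feature", values="value")
    return wide.reset_index()


# Toy export (made-up values): four structures sampled on the same dose grid.
toy = pd.DataFrame({
    "patient_id": ["P1"] * 16,
    "course_id": ["C1"] * 16,
    "structure": ["Brainstem"] * 4 + ["PTV1"] * 4 + ["PTV2"] * 4 + ["Couch"] * 4,
    "dose_gy": [0, 20, 40, 60] * 4,
    "volume_pct": [100, 60, 10, 0, 100, 99, 90, 0, 100, 98, 50, 10, 100, 5, 1, 0],
})
features = course_level_features(toy, x_values=(50,))
```

The resulting wide table has one row per patient and course, so it can be joined to other tabular clinical data on those keys. A real selection rule for duplicate structures would likely need clinical input rather than a fixed heuristic.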

The GAN successfully reconstructed canonical tumor-growth functions (Fig. 1, left) and recovered missing segments with smooth temporal continuity (Fig. 1, right). On synthetic data, mean R² = 0.91, F1 = 0.88 ± 0.03, and ROC-AUC = 0.93 at 30% missingness. Hit rate exceeded 80% for ≤ 20% missingness and 65% at 50% missingness. Applied to PSA trajectories, the model restored biologically plausible rebound dynamics and accurately imputed left- and right-censored intervals within ±1 week of the mechanistic model predictions (Fig. 2, left and right). A complementary GAN–GAF–Diffusion module integrating denoising-diffusion (DDIM) refinement has now completed its first training round, achieving ≈ 3% improvement in structural similarity and demonstrating strong convergence, supporting the hybrid framework’s scalability.

Conclusion: The conditional GAN–GAF model achieves high-fidelity reconstruction of incomplete oncologic time series across synthetic and real datasets. Early diffusion-based refinements further enhance temporal smoothness and robustness, advancing toward a clinically translatable digital-twin platform for precision radiotherapy modeling.

References:
1. Brady-Nicholls R Nat Commun 2020; 11:1750.
2. Wen P Y J Clin Oncol 2017; 35:2439.
3. Roach M Int J Radiat Oncol Biol Phys 2006; 65:965.
4. Walker S A Trends Cancer 2021; 7:3.
5. Eisenhauer E A Eur J Cancer 2009; 45:228.
6. Lopez Alfonso J C JCO Clin Cancer Inform 2019; 3:1.
7. De Wilde D Med Image Anal 2025; 101:103454.
8. Skandarani Y J Imaging 2023; 9:69.
9. Ibrahim M Comput Biol Med 2025; 189:109834.

Keywords: Imputation, missing data, generative models

Digital Poster

5040

rt4ml: An open source python package to support multi-modal integration of radiotherapy data into clinical machine learning pipelines

Shreya Chappidi 1,2, Andra V Krauze 1

1 Radiation Oncology Branch, National Cancer Institute, Bethesda, USA. 2 Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom

Purpose/Objective:
