ESTRO 2026 - Abstract Book PART II

S2191

Physics - Inter-fraction motion management and daily adaptive radiotherapy


for primary kidney cancer (TROG 15.03 FASTRACK II): a non-randomised phase 2 trial”, Lancet Oncol. 2024; 25:308-316.
2. Videtic et al. “NRG Oncology RTOG 0915 (NCCTG N0927): A randomized phase II study comparing 2 stereotactic body radiation therapy (SBRT) schedules for medically inoperable patients with Stage I peripheral non-small cell lung cancer”, Int J Radiat Oncol Biol Phys. 2015 Jul 17;93(4):757–764.
3. Timmerman R. “A story of hypofractionation and the table on the wall,” Int J Radiat Oncol Biol Phys. 2022 Jan 1;112(1):4–21.

Keywords: SABR, kidney, CBCT-based

Digital Poster Highlight 4641

A deep reinforcement learning approach for real-time adaptive treatment plan optimization in proton therapy applied on a static lung tumor case

Mélanie Ghislain 1, Estelle Loÿen 1, Ana Maria Barragan Montero 2, Benoit Macq 1
1 ICTEAM, UCLouvain, Louvain-la-Neuve, Belgium. 2 IREC, UCLouvain, Bruxelles, Belgium

Purpose/Objective: Proton therapy offers sharp dose profiles, enabling highly conformal dose distributions but also increasing the sensitivity of treatments to anatomical changes. Online adaptive planning can compensate for daily changes but demands fast optimization algorithms. Despite many efforts, current approaches remain slow and typically require several minutes [1]. To adapt a plan on the order of seconds, we propose a patient-specific real-time decision-making framework based on deep Reinforcement Learning (RL). RL agents dynamically control spot-based dose delivery using pencil beam scanning and adapt to anatomy changes.

Material/Methods: We train patient-specific Deep Q-Learning agents to sequentially adjust the position, energy and weight of each spot. At each step, the agent receives 2D inputs (delivered spot positions and weights, projected target mask, and current spot position) and chooses among five actions: one of the four lateral beam displacements, or delivery of the current spot with a weight that is updated in increments of 0.25 MU (Figure 1). The reward function is based on DVH-related metrics that encourage clinically acceptable dose distributions. The number of trained agents depends on the number of energy layers, N. All agents are executed sequentially, and each agent has access to the list of spots delivered by the preceding agents.

Results: We illustrate the proposed algorithm with one lung cancer patient (prescription of 60 Gy/30 fx). Training was done on the planning MidP CT (T1) and evaluation on a repeated MidP CT (T2). A conventional plan was also optimized on T1 and evaluated on T2 (non-adapted plan). The resulting DVHs are shown in Figure 2. When applied on T2, our approach achieves D95,PTV = 52.77 Gy and D5,PTV = 65.0 Gy (solid lines), compared to D95,PTV = 34.33 Gy and D5,PTV = 64.77 Gy for the non-adapted plan (dashed lines). This RL-based adapted plan was computed in 54 seconds.

Conclusion: Preliminary results indicate that the RL agent was able to adapt the spot sequence and achieve better target coverage than the static baseline for the same dose objectives under inter-fraction variations. This proof of concept shows the potential of a spot-level RL policy intended for online adaptive proton therapy to compensate for inter-fractional changes. Future work to improve plan quality includes integrating simulated inter-fraction variations into the training. We also aim to integrate intra-fraction motion to test the method for real-time adaptive proton treatments in dynamic scenarios for several patients.

References:
[1] Bobić M, et al. “Multi-institutional experimental validation of online adaptive proton therapy workflows.” Physics in Medicine & Biology 69, 165021
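To make the action space described in Material/Methods concrete, the following is a minimal toy sketch of one energy layer, not the authors' implementation. The class name `SpotGrid`, the grid size, the starting position, and the interpretation that each delivery action adds 0.25 MU to the spot at the current position are all assumptions made for illustration. The Q-network, the 2D state images, and the DVH-based reward are omitted.

```python
from enum import Enum


class Action(Enum):
    """The five actions named in the abstract: four lateral moves plus delivery."""
    LEFT = (-1, 0)
    RIGHT = (1, 0)
    UP = (0, 1)
    DOWN = (0, -1)
    DELIVER = None


WEIGHT_STEP_MU = 0.25  # weight increment stated in the abstract


class SpotGrid:
    """Hypothetical single-energy-layer environment, for illustration only."""

    def __init__(self, size=5, start=(2, 2)):
        self.size = size          # assumed square grid of candidate spot positions
        self.position = start     # current beam position (x, y)
        self.delivered = {}       # (x, y) -> accumulated weight in MU

    def step(self, action):
        if action is Action.DELIVER:
            # Assumption: each delivery adds one 0.25 MU increment at this position.
            current = self.delivered.get(self.position, 0.0)
            self.delivered[self.position] = current + WEIGHT_STEP_MU
        else:
            # Lateral displacement, clamped so the beam stays on the grid.
            dx, dy = action.value
            x = min(max(self.position[0] + dx, 0), self.size - 1)
            y = min(max(self.position[1] + dy, 0), self.size - 1)
            self.position = (x, y)
        return self.position, dict(self.delivered)
```

In a sequential multi-agent setup like the one described, the `delivered` dictionary of one layer would be part of what the next agent observes. How the authors encode that list is not specified in the abstract.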

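The reward in Material/Methods is based on DVH-related metrics, and the Results are reported as D95 and D5 of the PTV. As a minimal sketch of how such metrics can be read from a list of voxel doses, the function below returns the dose received by at least a given percentage of the volume. The function name `dose_at_volume` and the assumption of equal-volume voxels are illustrative choices, and the abstract does not state how the authors compute these values.

```python
import math


def dose_at_volume(doses, percent):
    """Return D_percent: the minimum dose received by the hottest `percent` % of voxels.

    Assumes every voxel has the same volume (an illustrative simplification).
    """
    if not doses:
        raise ValueError("doses must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(doses, reverse=True)          # hottest voxel first
    k = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[k - 1]                          # dose of the k-th hottest voxel


# Toy example: 100 voxels with doses 1, 2, ..., 100 Gy.
toy_doses = list(range(1, 101))
d95 = dose_at_volume(toy_doses, 95)  # 95 of 100 voxels receive at least this dose
d5 = dose_at_volume(toy_doses, 5)    # only the 5 hottest voxels reach this dose
```

A low D95 relative to the prescription indicates under-coverage of the target, while D5 characterizes the high-dose end of the distribution.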