Machine learning for sustainable chemistry
Jonathan Hirst
University of Nottingham, UK
Challenges for the application of machine learning to sustainable chemistry include data acquisition, building predictive models, making best use of data, utilising the knowledge of expert chemists, navigating the complexity of potentially conflicting green metrics, and the interpretability of models. I will present our recent work in a couple of these areas.

Reaction yield is an important consideration in sustainable chemistry, and the use of machine learning methods to predict reaction yield is an emerging area [1]. I will discuss two of our recent studies. The first involves the application of machine learning to sensor data in synthetic chemistry [2], based on a dataset collected using the DeepMatter DigitalGlassware cloud platform. In the second study, we investigated the applicability of support vector regression (SVR) for predicting reaction yields [3], using a combinatorial set of Buchwald-Hartwig amination reactions. Structure-based SVR models outperformed the quantum chemical SVR models along the dimension of each reaction component.

When data are expensive to generate, for example in drug discovery, active search strategies can help minimize the number of compounds that have to be synthesized and assayed, improving the sustainability of such discovery processes. In the context of a drug design problem, we have investigated the application of a data-driven adaptive Markov chain approach, where the acceptance probability is given by a probabilistic surrogate of the target property, modelled with a maximum entropy conditional model. We have applied the approach to a lead development search for an antagonist of an alpha-v integrin, using a molecular docking score as the optimisation function. We have discovered compounds with greater predicted activity than compounds found in our previous work [4] by employing two strategies: we have substantially increased the search space explored, to the order of 10^20 possible compounds, and we have considered receptor flexibility by docking to an ensemble of snapshots from molecular dynamics simulations.

Acknowledgements: This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) [grant number EP/S035990/1] and JDH is supported by the Royal Academy of Engineering under the Chairs in Emerging Technologies scheme.

References
1. A.L. Haywood, J. Redshaw, T. Gärtner, A. Taylor, A.M. Mason & J.D. Hirst. Machine Learning for Chemical Synthesis. In Machine Learning in Chemistry: The Impact of Artificial Intelligence, Ed. Cartwright, H. RSC, London, 169–194 (2020).
2. J. Davies, D. Pattison & J.D. Hirst. Machine learning for yield prediction for chemical reactions using in situ sensors. J. Mol. Graph. Model., 118, 108356 (2023). https://doi.org/10.1016/j.jmgm.2022.108356
3. A.L. Haywood, J. Redshaw, M.W.D. Hanson-Heine, A. Taylor, A. Brown, A.M. Mason, T. Gärtner & J.D. Hirst. Kernel methods for predicting yields of chemical reactions. J. Chem. Inf. Model., 62, 2077–2091 (2022). http://dx.doi.org/10.1021/acs.jcim.1c00699
4. D. Oglic, S.A. Oatley, S.J.F. Macdonald, T. McInally, R. Garnett, J.D. Hirst & T. Gärtner. Active search for computer-aided drug design. Mol. Inf., 37, 1700130 (2018). http://dx.doi.org/10.1002/minf.201700130
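
Illustrative sketch: the active search idea described in the abstract, in which a probabilistic surrogate decides which proposed candidates are worth an expensive evaluation, can be illustrated with a toy accept/reject chain. This is a minimal sketch under stated assumptions and not the method of [4]: the logistic surrogate is only a simple stand-in for a maximum entropy conditional model, and the feature vectors, the local proposal, the learning rate and the function toy_oracle (which plays the role of a docking calculation) are all hypothetical.

```python
import math
import random


def surrogate_prob(weights, features):
    """Logistic estimate of the probability that a candidate is a hit."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def update(weights, features, label, lr=0.5):
    """One gradient step on the logistic log-likelihood for one observation."""
    p = surrogate_prob(weights, features)
    return [w + lr * (label - p) * f for w, f in zip(weights, features)]


def toy_oracle(features):
    """Hypothetical stand-in for an expensive evaluation such as docking."""
    return 1 if features[0] + features[1] > 1.0 else 0


def propose(rng, current, step=0.3):
    """Local move from the current state; the last entry is a fixed bias term."""
    moved = [min(1.0, max(0.0, x + rng.uniform(-step, step))) for x in current[:-1]]
    return moved + [1.0]


def active_search(n_steps=200, seed=0):
    rng = random.Random(seed)
    current = [0.5, 0.5, 1.0]      # two toy descriptors plus a bias term
    weights = [0.0, 0.0, 0.0]      # surrogate starts uninformed (p = 0.5)
    evaluated = []
    for _ in range(n_steps):
        candidate = propose(rng, current)
        # Accept the move with the probability given by the surrogate.
        if rng.random() < surrogate_prob(weights, candidate):
            label = toy_oracle(candidate)          # expensive step, only on accept
            evaluated.append((candidate, label))
            weights = update(weights, candidate, label)  # adapt the surrogate
            current = candidate
    return weights, evaluated


if __name__ == "__main__":
    weights, evaluated = active_search()
    hits = sum(label for _, label in evaluated)
    print(f"evaluated {len(evaluated)} candidates, {hits} hits")
```

The point of the design is that the oracle is called only for accepted proposals, so the surrogate, as it adapts, steers the limited evaluation budget towards regions it currently believes to be promising.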
PC04
© The Author(s), 2023