Feature identification in molybdenum carbides: graph neural networks vs. human empirical search – who’s the winner?

Eduardo Aguilar Bejarano 1,2,3, Luis Arrieta Araya 4, Mauricio Gutierrez 5, Grazziela Figueredo 3, Ender Özcan 3, Simon Woodward 1,2, Ignacio Borge-Durán 6

1 The GlaxoSmithKline Carbon Neutral Laboratories for Sustainable Chemistry, University of Nottingham, UK
2 School of Chemistry, University of Nottingham, UK
3 School of Computer Science, University of Nottingham, UK
4 School of Chemical Engineering, University of Costa Rica, Costa Rica
5 School of Chemistry, University of Costa Rica, Costa Rica
6 School of Chemistry, Bar-Ilan University, Israel

Transition metal carbides (TMCs) are materials with a wide range of applications. Their properties are correlated with crystalline order–disorder phase transitions that occur when the TMC is heated or cooled, so understanding the energetics of this process allows TMC properties to be tuned for a given application. Traditionally, DFT calculations have been used to model TMC structures and phase transitions by describing the distribution of the carbon (carbide) atoms within the overall metal lattice. Unfortunately, DFT is computationally expensive for solid-state TMCs, and for large lattices DFT models are presently intractable. A simple empirical model [1] is currently used as a DFT replacement for MoC0.5 systems; it relies on three variables related to the distribution of carbide atoms within the molybdenum lattice. The phase transition predictions of this simple model are now very accurate. However, identifying the best features to represent the cell, and thereby enable energetic predictions, required months of trial-and-error screening by the (human) investigators. Herein, we evaluate the performance of an automated feature generation approach that uses a Graph Neural Network (GNN) to predict the internal energy of MoC0.5.
We collected a total of 1065 molybdenum carbide structures with their corresponding DFT-calculated internal energies. The structures were represented as graphs in which atoms are nodes and atom-to-atom interactions are edges. Each node carries two features: the atom identity (carbon or molybdenum) and its encoded 3D coordinates. We applied 5-fold cross-validation to evaluate the generalisability and robustness of the model. [2] The statistics of our approach over the 5 folds (mean ± std) are: R² 0.923 ± 0.005, MAE 0.133 ± 0.003, RMSE 0.172 ± 0.002. For the empirical model, the metrics obtained for the 1065 structures are: R² 0.928 ± 0.014, MAE 0.129 ± 0.007, RMSE 0.165 ± 0.013. A t-test showed no statistically significant difference between the two models. GNNs can therefore learn and predict on this system with an accuracy comparable to that of the human-derived model, while requiring only 10 minutes of training per model.

References
1. Ignacio Borge-Durán, Denial Aias, and Ilya Grinberg. Modelling of high-temperature order–disorder phase transitions of non-stoichiometric Mo2C and Ti2C from first principles. Physical Chemistry Chemical Physics, 23(39):22305–22312, 2021.
2. Eduardo Aguilar, Luis Arrieta, Mauricio Gutierrez, Grazziela Figueredo, Ender Özcan, Simon Woodward, Ignacio Borge, manuscript in preparation.
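The graph representation and 5-fold evaluation described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The function names, the distance `cutoff` used to decide which atom pairs share an edge, the one-hot identity encoding, the raw-coordinate encoding, and the omission of periodic images are all assumptions made for illustration; the abstract does not specify how edges or coordinate encodings were built. The GNN itself is omitted.

```python
import math
import random
import statistics


def build_graph(species, coords, cutoff=3.0):
    """Turn one structure into (node_features, edges).

    species: list of "Mo" / "C" labels; coords: list of (x, y, z) tuples.
    ASSUMPTION: an edge joins atoms closer than `cutoff` (hypothetical value);
    periodic images of the cell are ignored for brevity.
    """
    one_hot = {"Mo": [1.0, 0.0], "C": [0.0, 1.0]}
    # Node feature = atom identity (one-hot) followed by its 3D coordinates.
    nodes = [one_hot[s] + [float(v) for v in xyz] for s, xyz in zip(species, coords)]
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                edges += [(i, j), (j, i)]  # undirected edge stored both ways
    return nodes, edges


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE and RMSE for paired true/predicted energies."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"r2": 1.0 - ss_res / ss_tot, "mae": mae, "rmse": math.sqrt(ss_res / n)}


def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def summarise(per_fold):
    """Mean and sample standard deviation of each metric across folds."""
    return {
        name: (
            statistics.mean([fold[name] for fold in per_fold]),
            statistics.stdev([fold[name] for fold in per_fold]),
        )
        for name in per_fold[0]
    }
```

In use, a model would be trained on each `train` split, `regression_metrics` would be computed on the matching `test` split, and `summarise` would report each metric as mean ± std, which is the form of the figures quoted in the abstract.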
P03
© The Author(s), 2023