The method is trained on the data that were available, but it is meant to be re-trainable as soon as new data are published. It would be great to be really sure that even someone else will be able to do it. In case we receive any feedback, we would be really happy to improve our Github repository so as to make the reproduction easier!
We do care about reproducibility. In case we receive any feedback, we would be really happy to improve our Github repository and/or submitted manuscript so as to make the reproduction easier!
Systematically improvable machine learning potentials could have a significant impact on the range of properties that can be modelled, but the toolchain associated with using them presents a barrier to entry for new users. Attempting to reproduce some of our results will help us improve the accessibility of the approach.
Popular descriptors for machine learning potentials such as the Behler-Parinello atom centred symmetry functions (ACSF) or the Smooth Overlap of Interatomic Potentials (SOAP) are widely used but so far not much attention has been paid to optimising how many descriptor components need to be included to give good results.
In theory, reproducing this paper should only require a clone of a public Git repository, and the execution of a Makefile (detailed in the README of the paper repository at https://github.com/psychoinformatics-de/paper-remodnav). We've set up our paper to be dynamically generated, retrieving and installing the relevant data and software automatically, and we've even created a tutorial about it, so that others can reuse the same setup for their work. Nevertheless, we've for example never tried it out across different operating systems - who knows whether it works on Windows? We'd love to share the tips and tricks we found to work, and even more love feedback on how to improve this further.