Popular descriptors for machine learning potentials such as the Behler-Parinello atom centred symmetry functions (ACSF) or the Smooth Overlap of Interatomic Potentials (SOAP) are widely used but so far not much attention has been paid to optimising how many descriptor components need to be included to give good results.
Even though the approach in the paper focuses on a specific measurement (clumped isotopes) and how to optimize which and how many standards we use, I hope that the problem is general enough that insight can translate to any kind of measurement that relies on machine calibration. I've committed to writing a literate program (plain text interspersed with code chunks) to explain what is going on and to make the simulations one step at a time. I really hope that this is understandable to future collaborators and scientists in my field, but I have not had any code review internally and I also didn't receive any feedback on it from the reviewers. I would love to see if what in my mind represents "reproducible code" is actually reproducible, and to learn what I can improve for future projects!
The current code is written in Torch, which is no longer actively maintained. Since deep learning in nanophotonics is an area of active interest (e.g. for the design of new metamaterials), it is important to update the code to use a more modern deep learning library such as tensorflow/keras
I guess it could be a cool learning experience. The paper is written with knitr, uses a seed, is part of the R package it describes, was openly written using version control (SVN, R-Forge) and is available in an open access journal (@up_jors).