The code and data are both on GitHub. The paper has been published in Wellcome Open Research and has been replicated by multiple other authors.
Popular descriptors for machine learning potentials, such as the Behler-Parrinello atom-centred symmetry functions (ACSF) and the Smooth Overlap of Atomic Positions (SOAP), are widely used, but so far little attention has been paid to optimising how many descriptor components need to be included to give good results.
Metadata annotation is key to reproducibility in sequencing experiments. Reproducing this research using the scripts provided will also show the level of annotation in the years since 2015, when the paper was published.