Popular descriptors for machine learning potentials, such as the Behler-Parrinello atom-centred symmetry functions (ACSF) or the Smooth Overlap of Atomic Positions (SOAP), are widely used, but so far little attention has been paid to optimising how many descriptor components need to be included to give good results.
It uses the drake R package, which should make R projects much easier to reproduce (just run make.R and you're done). However, it does depend on very specific package versions, which are provided by the accompanying Docker image.
This was my third attempt at making a paper fully reproducible. To date, it's the most reproducible one that I have published. I'm interested to know what stumbling blocks exist that I'm not aware of (aside from needing software like ArcGIS to rerun the complete analysis).