We invested a lot of work to make the analyses from the paper reproducible, and we are very curious how the documentation could be improved and whether people run into any problems.
This paper presents a fine example of a high-throughput computational materials screening study, focusing mainly on carbon nanoclusters of different sizes. In the paper, a diverse set of empirical and machine-learned interatomic potentials, which are commonly used to simulate carbonaceous materials, is benchmarked against higher-level density functional theory (DFT) data, using a range of structural features as the comparison criteria. Trying to reproduce the data presented here (even if you only consider a subset of the interaction potentials) will help you develop an understanding of how you could approach a high-throughput structure prediction problem. Even though we concentrate here on isolated/finite nanoclusters, AIRSS (and other similar approaches like USPEX, CALYPSO, GMIN, etc.) can also be used to predict crystal structures of different classes of materials with applications in energy storage, catalysis, hydrogen storage, and so on.
The negative surface enthalpies in figure 5 are surprising. At least one group has attempted to reproduce these using a different code and obtained positive enthalpies. This was attributed to the inability of that code to independently relax the three simulation cell vectors, resulting in an unphysical water density. This demonstrates how sensitive these results can be to the particular implementation of simulation algorithms in different codes. Similarly, the force field used is now very popular. Its functional form and full set of parameters can be found in the literature. However, differences in how simulation codes implement truncation, electrostatics, etc. can lead to significant differences in results such as these. It would be a valuable exercise to establish whether exactly the same force field as that used here can be reproduced from only its specification in the literature. The interfacial energies of interest should be reproducible with simulations on modest numbers of processors (a few dozen), with run times of 1-2 days each. Each surface is an independent calculation, so these can be run concurrently during the ReproHack.
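To make the point about truncation concrete, here is a minimal sketch of two common cutoff conventions applied to the same pair potential. It uses a generic 12-6 Lennard-Jones potential in reduced units purely as an illustration; this is an assumption and not the force field used in the paper, and the function names and the cutoff value are hypothetical.

```python
def lj(r, epsilon=1.0, sigma=1.0):
    """Untruncated 12-6 Lennard-Jones pair energy (illustrative potential only)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_truncated(r, r_cut, epsilon=1.0, sigma=1.0):
    """Plain truncation: the energy is simply set to zero beyond the cutoff."""
    return lj(r, epsilon, sigma) if r < r_cut else 0.0


def lj_truncated_shifted(r, r_cut, epsilon=1.0, sigma=1.0):
    """Truncated and shifted: the energy is offset so it reaches zero at the cutoff."""
    if r >= r_cut:
        return 0.0
    return lj(r, epsilon, sigma) - lj(r_cut, epsilon, sigma)


if __name__ == "__main__":
    r_cut = 2.5  # hypothetical cutoff in units of sigma
    for r in (1.0, 1.5, 2.0):
        plain = lj_truncated(r, r_cut)
        shifted = lj_truncated_shifted(r, r_cut)
        print(f"r={r:.1f}  truncated={plain:+.5f}  shifted={shifted:+.5f}  "
              f"difference={plain - shifted:+.5f}")
```

Inside the cutoff, the two conventions differ by a constant energy per interacting pair. Summed over all pairs in a simulation cell, such an offset could plausibly shift computed energies even when the published functional form and parameters are identical, which is why the truncation scheme needs to be matched as well.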
In theory, reproducing this paper should only require a clone of a public Git repository and the execution of a Makefile (detailed in the README of the paper repository at https://github.com/psychoinformatics-de/paper-remodnav). We've set up our paper to be dynamically generated, retrieving and installing the relevant data and software automatically, and we've even created a tutorial about it so that others can reuse the same setup for their work. Nevertheless, we have, for example, never tried it out across different operating systems: who knows whether it works on Windows? We'd love to share the tips and tricks we found to work, and we would love even more to get feedback on how to improve this further.
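As a rough sketch of what such an attempt might look like, the script below lists the two steps (clone, then run `make`) and checks whether the required tools are available on the current operating system before anything is run. The helper names, the target directory name, and the use of the default `make` target are assumptions for illustration; the README in the repository is the authoritative source for the exact commands and prerequisites.

```python
import platform
import shutil

REPO_URL = "https://github.com/psychoinformatics-de/paper-remodnav"


def planned_commands(repo_url=REPO_URL, target_dir="paper-remodnav"):
    """Return the commands a reproduction attempt would run, in order.

    Assumption: a plain `make` (default target) builds the paper; check the README.
    """
    return [
        ["git", "clone", repo_url, target_dir],
        ["make", "-C", target_dir],
    ]


def missing_tools(commands):
    """Return the names of executables that cannot be found on PATH."""
    return sorted({cmd[0] for cmd in commands if shutil.which(cmd[0]) is None})


if __name__ == "__main__":
    commands = planned_commands()
    print("Operating system:", platform.system())
    missing = missing_tools(commands)
    if missing:
        print("Missing tools:", ", ".join(missing))
    # Only print the plan; nothing is cloned or built by this sketch.
    for cmd in commands:
        print("Would run:", " ".join(cmd))
```

Recording the reported operating system and any missing tools alongside the outcome of the build would give the kind of cross-platform feedback asked for here.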
Even though the approach in the paper focuses on a specific measurement (clumped isotopes) and on how to optimize which and how many standards we use, I hope that the problem is general enough that the insights can translate to any kind of measurement that relies on machine calibration. I've committed to writing a literate program (plain text interspersed with code chunks) to explain what is going on and to build the simulations one step at a time. I really hope that this is understandable to future collaborators and scientists in my field, but I have not had any internal code review, and I also didn't receive any feedback on it from the reviewers. I would love to see whether what in my mind represents "reproducible code" is actually reproducible, and to learn what I can improve for future projects!