The method was trained on the data available at the time, but it is designed to be retrained as soon as new data are published. We would like to be sure that someone other than us can do this. If we receive any feedback, we will gladly improve our GitHub repository to make reproduction easier.
This paper presents a fine example of a high-throughput computational materials screening study, focusing mainly on carbon nanoclusters of different sizes. In the paper, a diverse set of empirical and machine-learned interatomic potentials commonly used to simulate carbonaceous materials is benchmarked against higher-level density functional theory (DFT) data, using a range of structural features as the comparison criteria. Trying to reproduce the data presented here (even for only a subset of the interatomic potentials) will help you develop an understanding of how to approach a high-throughput structure prediction problem. Even though we concentrate here on isolated, finite nanoclusters, AIRSS (ab initio random structure searching) and similar approaches such as USPEX, CALYPSO, and GMIN can also be used to predict crystal structures of different classes of materials, with applications in energy storage, catalysis, hydrogen storage, and so on.
I have tried hard to make this paper as reproducible as possible, but as techniques and dependencies become more complex, it is hard to make every step completely clear. Any form of feedback is more than welcome.