Papers




Browse ReproHack papers

  • Neurodesk: an accessible, flexible and portable data analysis environment for reproducible neuroimaging

    Authors: Angela I. Renton, Thuy T. Dao, Tom Johnstone, Oren Civier, Ryan P. Sullivan, David J. White, Paris Lyons, Benjamin M. Slade, David F. Abbott, Toluwani J. Amos, Saskia Bollmann, Andy Botting, Megan E. J. Campbell, Jeryn Chang, Thomas G. Close, Monika Dörig, Korbinian Eckstein, Gary F. Egan, Stefanie Evas, Guillaume Flandin, Kelly G. Garner, Marta I. Garrido, Satrajit S. Ghosh, Martin Grignard, Yaroslav O. Halchenko, Anthony J. Hannan, Anibal S. Heinsfeld, Laurentius Huber, Matthew E. Hughes, Jakub R. Kaczmarzyk, Lars Kasper, Levin Kuhlmann, Kexin Lou, Yorguin-Jose Mantilla-Ramos, Jason B. Mattingley, Michael L. Meier, Jo Morris, Akshaiy Narayanan, Franco Pestilli, Aina Puce, Fernanda L. Ribeiro, Nigel C. Rogasch, Chris Rorden, Mark M. Schira, Thomas B. Shaw, Paul F. Sowman, Gershon Spitz, Ashley W. Stewart, Xincheng Ye, Judy D. Zhu, Aswin Narayanan & Steffen Bollmann
    DOI: https://doi.org/10.1038/s41592-023-02145-x
    Submitted by sbollmann    
      Mean reproducibility score:   2.5/10   |   Number of reviews:   2
    Why should we attempt to reproduce this paper?

    We invested a lot of work in making the analyses from the paper reproducible, and we are very curious how the documentation could be improved and whether people run into any problems.

  • A multi-level analysis of data quality for formal software citation

    Authors: David Schindler, Tazin Hossain, Sascha Spors, Frank Krüger
    DOI: https://doi.org/10.48550/arXiv.2306.17535
    Submitted by frank.krueger    
      Mean reproducibility score:   9.0/10   |   Number of reviews:   2
    Why should we attempt to reproduce this paper?

    We spent a lot of time making our analyses reproducible. A review would allow us to collect some information on whether we have succeeded.

  • Living HTA: Automating Health Technology Assessment with R

    Authors: Robert A. Smith, Paul P. Schneider, Wael Mohammed
    DOI: 10.12688/wellcomeopenres.17933.1
    Submitted by rasmith3    

    Why should we attempt to reproduce this paper?

    We think this is an interesting paper for anyone who wants to learn to build an API with the R package plumber. This is a novel method in health economics, and we believe it will help improve the transparency of modelling methods in our field.

  • Droplet impact onto a spring-supported plate: analysis and simulations

    Authors: Michael J. Negus, Matthew R. Moore, James M. Oliver, Radu Cimpeanu
    DOI: https://doi.org/10.1007/s10665-021-10107-5
    Submitted by MNegus      
      Mean reproducibility score:   8.0/10   |   Number of reviews:   1
    Why should we attempt to reproduce this paper?

    The direct numerical simulations (DNS) for this paper were conducted using Basilisk (http://basilisk.fr/). As Basilisk is a free software program written in C, it can be readily installed on any Linux machine, and it should then be straightforward to run the driver code to reproduce the DNS from this paper. That said, the numerical solutions presented in this paper are the result of many high-fidelity simulations, each of which took approximately 24 CPU hours running on between 4 and 8 cores. Hence the main difficulty in reproducing the results should be the amount of computational resources needed, so HPC resources will be required. The DNS in this paper were used to validate the presented analytical solutions, as well as to extend the results to a longer timescale. Reproducing these numerical results will build confidence in them, ensuring that they are independent of the system architecture they were produced on.

  • Accelerating the prediction of large carbon clusters via structure search: Evaluation of machine-learning and classical potentials

    Authors: Bora Karasulu, Jean-Marc Leyssale, Patrick Rowe, Cedric Weber, Carla de Tomas
    DOI: 10.1016/j.carbon.2022.01.031
    Submitted by bkarasulu    
    Number of reviews:   1
    Why should we attempt to reproduce this paper?

    This paper presents a fine example of a high-throughput computational materials screening study, focusing mainly on carbon nanoclusters of different sizes. In the paper, a set of diverse empirical and machine-learned interatomic potentials, which are commonly used to simulate carbonaceous materials, is benchmarked against higher-level density functional theory (DFT) data, using a range of diverse structural features as the comparison criteria. Trying to reproduce the data presented here (even if you only consider a subset of the interaction potentials) will help you develop an understanding of how you could approach a high-throughput structure prediction problem. Even though we concentrate here on isolated/finite nanoclusters, AIRSS (and other similar approaches like USPEX, CALYPSO, GMIN, etc.) can also be used to predict crystal structures of different classes of materials with applications in energy storage, catalysis, hydrogen storage, and so on.

  • Automatic learning of hydrogen-bond fixes in an AMBER RNA force field

    Authors: Thorben Fröhlking, Vojtěch Mlýnský, Michal Janeček, Petra Kührová, Miroslav Krepl, Pavel Banáš, Jiří Šponer, Giovanni Bussi
    Submitted by giovannibussi      

    Why should we attempt to reproduce this paper?

    We do care about reproducibility. If we receive any feedback, we would be really happy to improve our GitHub repository and/or submitted manuscript so as to make the reproduction easier!

  • Molecular Dynamics of Solids at Constant Pressure and Stress Using Anisotropic Stochastic Cell Rescaling

    Authors: Vittorio Del Tatto, Paolo Raiteri, Mattia Bernetti, Giovanni Bussi
    DOI: 10.3390/app12031139
    Submitted by giovannibussi      

    Why should we attempt to reproduce this paper?

    We do care about reproducibility. If we receive any feedback, we would be really happy to improve our GitHub repository so as to make the reproduction easier!

  • Synergistic coupling in ab initio-machine learning simulations of dislocations

    Authors: Petr Grigorev, Alexandra M. Goryaeva, Mihai-Cosmin Marinica, James R. Kermode, Thomas D. Swinburne
    DOI: https://arxiv.org/abs/2111.11262
    Submitted by jameskermode      

    Why should we attempt to reproduce this paper?

    Systematically improvable machine learning potentials could have a significant impact on the range of properties that can be modelled, but the toolchain associated with using them presents a barrier to entry for new users. Attempting to reproduce some of our results will help us improve the accessibility of the approach.

  • Sensitivity and dimensionality of atomic environment representations used for machine learning interatomic potentials

    Authors: Berk Onat, Christoph Ortner and James Kermode
    DOI: 10.1063/5.0016005
    Submitted by jameskermode      

    Why should we attempt to reproduce this paper?

    Popular descriptors for machine learning potentials, such as the Behler-Parrinello atom-centred symmetry functions (ACSF) or the Smooth Overlap of Atomic Positions (SOAP), are widely used, but so far not much attention has been paid to optimising how many descriptor components need to be included to give good results.

  • Encapsulated Nanowires: Boosting Electronic Transport in Carbon Nanotubes

    Authors: Andrij Vasylenko, Jamie Wynn, Paulo Medeiros, Andrew J Morris, Jeremy Sloan, David Quigley
    DOI: 10.1103/PhysRevB.95.121408
    Submitted by dquigley      
      Mean reproducibility score:   5.0/10   |   Number of reviews:   2
    Why should we attempt to reproduce this paper?

    DFT calculations are in principle reproducible between different codes, but differences can arise due to poor choice of convergence tolerances, inappropriate use of pseudopotentials and other numerical considerations. An independent validation of the key quantities needed to compute electrical conductivity would be valuable. In this case we have published our input files for calculating the four quantities needed to parametrise the transport simulations from which we compute the electrical conductivity. These are, specifically, the electronic band structure, phonon dispersions, electron-phonon coupling constants and third derivatives of the force constants. Each in turn is more sensitive to convergence tolerances than the last, and it is the final quantity on which the conclusions of the paper critically depend. Reference output data is provided for comparison at the data URL below. We note that the pristine CNT results (dark red line) in figure 3 are an independent reproduction of earlier work, and so we are confident the Boltzmann transport simulations are reproducible. The calculated inputs to these from DFT (in the case of Be encapsulation) have not been independently reproduced to our knowledge.

  • New Insight into the Stability of CaCO3 Surfaces and Nanoparticles via Molecular Simulation

    Authors: A. Matthew Bano, P. Mark Rodger, and David Quigley
    DOI: 10.1021/la501409j
    Submitted by dquigley      

    Why should we attempt to reproduce this paper?

    The negative surface enthalpies in figure 5 are surprising. At least one group has attempted to reproduce these using a different code and obtained positive enthalpies. This was attributed to the inability of that code to independently relax the three simulation cell vectors, resulting in an unphysical water density. This demonstrates how sensitive these results can be to the particular implementation of simulation algorithms in different codes. Similarly, the force field used is now very popular. Its functional form and full set of parameters can be found in the literature. However, differences in how different simulation codes implement truncation, electrostatics, etc. can lead to significant differences in results such as these. It would be a valuable exercise to establish whether exactly the same force field as that used here can be reproduced from only its specification in the literature. The interfacial energies of interest should be reproducible with simulations on modest numbers of processors (a few dozen), with run times for each being 1-2 days. Each surface is an independent calculation, and so these can be run concurrently during the ReproHack.

  • Thermodynamics of stacking disorder in ice nuclei

    Authors: David Quigley
    DOI: 10.1063/1.4896376
    Submitted by dquigley      
      Mean reproducibility score:   3.0/10   |   Number of reviews:   1
    Why should we attempt to reproduce this paper?

    The results of this paper have been used in multiple subsequent studies as a benchmark against which other methods of performing the same calculation have been tested. Other groups have challenged the results as suffering from finite size effects, in particular the calculations on mixtures of cubic and hexagonal ice. Should there be time during the event, participants could check this by performing calculations on larger unit cells. Each individual calculation should converge adequately within 96 hours, making it amenable to an HPC ReproHack. Given modern HPC hardware, many such calculations could be run concurrently on a single HPC node.

  • The viewing angle in AGN SED models, a data-driven analysis

    Authors: Andrés Felipe Ramos Padilla, Lingyu Wang, Katarzyna Małek, Andreas Efstathiou, Guang Yang
    Submitted by aframosp    
      Mean reproducibility score:   9.0/10   |   Number of reviews:   1
    Why should we attempt to reproduce this paper?

    Most of the material is available through Jupyter notebooks on GitHub, and it should be easy to reproduce with the help of Binder. With the notebooks, you could experiment with parameters different from those analyzed in the paper. The repository also contains a large dataset of physical parameters of the galaxies analysed in this work. We expect this work to be easily reproducible by following the steps described in the repository.

  • Finding Efficient Trade-offs in Multi-Fidelity Response Surface Modeling

    Authors: Sander van Rijn, Sebastian Schmitt, Matthijs van Leeuwen, Thomas Bäck
    Submitted by sjvrijn    
      Mean reproducibility score:   9.0/10   |   Number of reviews:   1
    Why should we attempt to reproduce this paper?

    Because:
    - Two fellow PhDs working on different topics have been able to reproduce some figures by following the README instructions and I hope this extends to other people
    - I've tried to incorporate as many of the best practices as possible to make my code and data open and accessible
    - I've tried to make sure that my data is exactly reproducible with the specified random seed strategy
    - The paper suggests a method that should be useful to other researchers in my field, which is not useful unless my results are reproducible

  • Algorithm configuration data mining for CMA evolution strategies

    Authors: Sander van Rijn, Hao Wang, Bas van Stein, Thomas Bäck
    DOI: 10.1145/3071178.3071205
    Submitted by sjvrijn    
      Mean reproducibility score:   10.0/10   |   Number of reviews:   1
    Why should we attempt to reproduce this paper?

    The original data took quite a while to produce for a previous paper, but for this paper, all tables and figures should be exactly reproducible by simply running the Jupyter notebook.
