In theory, reproducing this paper should only require cloning a public Git repository and executing a Makefile (detailed in the README of the paper repository at https://github.com/psychoinformatics-de/paper-remodnav). We've set up our paper to be dynamically generated, retrieving and installing the relevant data and software automatically, and we've even created a tutorial about it, so that others can reuse the same setup for their work. Nevertheless, we have, for example, never tried it out across different operating systems. Who knows whether it works on Windows? We'd love to share the tips and tricks we found to work, and we'd love even more to get feedback on how to improve this further.
Even though the approach in the paper focuses on a specific measurement (clumped isotopes) and how to optimize which and how many standards we use, I hope that the problem is general enough that the insights translate to any kind of measurement that relies on machine calibration. I've committed to writing a literate program (plain text interspersed with code chunks) to explain what is going on and to build up the simulations one step at a time. I really hope that this is understandable to future collaborators and scientists in my field, but the code has not been reviewed internally and I also didn't receive any feedback on it from the reviewers. I would love to see if what in my mind represents "reproducible code" is actually reproducible, and to learn what I can improve for future projects!
Most of the material is available as Jupyter notebooks on GitHub, and it should be easy to reproduce with the help of Binder. With the notebooks, you can experiment with parameters different from those analysed in the paper. The material also contains a large dataset of physical parameters of the galaxies analysed in this work. We expect this work to be easily reproducible by following the steps described in the repository.
Because: - Two fellow PhDs working on different topics have been able to reproduce some figures by following the README instructions, and I hope this extends to other people. - I've tried to incorporate as many of the best practices as possible to make my code and data open and accessible. - I've tried to make sure that my data is exactly reproducible with the specified random seed strategy. - The paper suggests a method that should be useful to other researchers in my field, but it is only useful if my results are reproducible.
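As a minimal sketch of what a random seed strategy can look like (not the code from the paper): each simulation gets its own generator built from an explicit seed, so the same seed always yields the same draws. The function name `simulate` and the value of `SEED` are hypothetical and chosen only for illustration.

```python
import random


def simulate(seed, n=5):
    """Return n pseudo-random draws from a generator seeded with `seed`."""
    # A local generator keeps the result independent of any global random state.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


SEED = 12345  # hypothetical seed, for illustration only

first = simulate(SEED)
second = simulate(SEED)         # same seed: identical draws
different = simulate(SEED + 1)  # different seed: different draws

print(first == second)
print(first == different)
```

Passing the seed as an argument, rather than seeding a global generator once, makes it possible to rerun any single simulation in isolation and get the same numbers.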
The analyses are basic and easy to understand and reproduce. The paper also uses multiple imputation, which can be interesting. ALL materials are available.
It was a null-findings paper that disappointed many people. Could I have made a mistake in the coding? I'm also interested in using it as an example of reproducible research and in learning from ReproHack. It's nerve-racking to submit work for inspection by others, so I also want to overcome that fear and be able to lead my students by example. I'll be available via the Slack group or other forms of communication as suggested by the organisers. Please note that only the gene expression and related data are available on ArrayExpress.
Metadata annotation is key to reproducibility in sequencing experiments. Reproducing this research using the scripts provided will also show the current level of annotation in the years since 2015, when the paper was published.
This is perhaps an interesting 'meta' example for ReproHack, as in this study we attempted to reproduce analyses reported in 25 published articles. So it seems even more important that our own analyses are reproducible! We tried our best to adhere to best practices in this regard, so we would be very keen to know if anyone has problems reproducing our analyses and to learn how we can make the process easier. A couple of things to note: 1. In addition to the links to the data and analysis scripts provided above, we also have a Code Ocean container for this article (https://doi.org/10.24433/CO.1796004.v3), which should theoretically allow you to reproduce the analyses with the click of a single button (we hope!). 2. In addition to the main research analyses (for which I've provided links above), we also have data, scripts, and Code Ocean containers for each of the reproducibility attempts for the 25 articles we looked at. I don't know if you will also want to look at this level of the analyses, but if you do, take a look at Supplementary Information section E here: https://royalsocietypublishing.org/doi/suppl/10.1098/rsos.201494 For each reproducibility attempt, there is a short 'vignette' describing the outcome, and a link to data/scripts on the OSF and a Code Ocean container.
1. Because it contains customized numerical methods to implement analytical solutions for an engineering problem relevant to cryogenic storage. This will become increasingly relevant in the future as the use of liquid hydrogen and LNG as fuels increases. 2. The storage tank is implemented as a class, so there is an opportunity to understand the object-oriented programming mindset of the authors. 3. The provided Jupyter Notebook includes thermodynamic data for nitrogen and methane, which lets users get started quickly. 4. To reproduce some of the figures and results, the storage tanks need to be modified with inputs available in the paper.
Some may argue that the field of machine learning is in a reproducibility crisis. It will be interesting to know how difficult it is for others to reproduce the results of a paper that proposed a fairly complex methodology.
The current code is written in Torch, which is no longer actively maintained. Since deep learning in nanophotonics is an area of active interest (e.g. for the design of new metamaterials), it is important to update the code to use a more modern deep learning library such as TensorFlow/Keras.
I suggested a few papers last year. I’m hoping that we’ve improved our reproducibility with this one this year. We’ve done our best to package it up both in Docker and as an R package. I’d be curious to know which way of reproducing it people find best: working through the vignettes or spinning up a Docker instance. Which is the preferred method?
The code is fairly easy to reproduce. It reads the data, runs a few descriptive statistical analyses, and plots figures using ggplot2.
Cleaning the databases used for this study was one of the most challenging aspects of it, so making it public is the best way to make the most of it. We made an effort to document all analyses and data wrangling steps. We are interested to know if it is truly reproducible so that we can follow this same scheme for further projects, or adjust accordingly.
To use data from a manufacturing process: RTM for carbon composite production. To see if you can handle large amounts of data: the 36 k injection runs contain a total of 5 million frames. Maybe it is possible for you to reach our performance on smaller parts of the data, which would be great.
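One way to work on a smaller part of the data while keeping the experiment repeatable is to draw a fixed, seeded sample of runs. The sketch below is only an illustration: it assumes the runs can be addressed by integer IDs, and `sample_run_ids`, the subset size, and the seed are hypothetical names and values, not part of the published code.

```python
import random


def sample_run_ids(run_ids, n_runs, seed):
    """Pick n_runs distinct run IDs reproducibly and return them sorted."""
    rng = random.Random(seed)  # seeded generator: same seed, same subset
    return sorted(rng.sample(run_ids, n_runs))


# Hypothetical: 36,000 runs identified by the integers 0..35999.
all_runs = list(range(36_000))

subset = sample_run_ids(all_runs, n_runs=360, seed=0)
print(len(subset))  # 360
```

Reporting the seed and subset size alongside any result makes it possible for others to train and evaluate on exactly the same runs.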
The paper describes pyKNEEr, a Python package for open and reproducible research on femoral knee cartilage using Jupyter notebooks as a user interface. I created this paper with the specific intent to make both the workflows it describes and the paper itself open and reproducible, following guidelines from authorities in the field. Therefore, two things in the paper can be reproduced: 1) workflow results: Table 2 contains links to all the Jupyter notebooks used to calculate the results. Computations are long and might require a server, so if you want to run them locally, I recommend using only 2 or 3 images as inputs for the computations. Also, the paper should be sufficient, but if you need further introductory info, there is a documentation website: https://sbonaretti.github.io/pyKNEEr/ and a "how to" video: https://youtu.be/7WPf5KFtYi8 2) paper graphs: In the captions of figures 1, 4, and 5 you can find links to the data repository, the code (a Jupyter notebook), and the computational environment (Binder) to fully reproduce the graph. These computations can be easily run locally and take a few seconds. All Jupyter notebooks automatically download data from Zenodo and provide dependencies, which should make reproducibility easier.
The paper, code, and data were published 4 years ago. Will they still work? I always try to release the data and code needed to reproduce my papers, but I seldom receive feedback. It would be useful to have comments from a team of reproducers, in order to improve sharing for future research (I have already switched from MATLAB to Python).
This paper provides a novel approach to identifying oncogenes based on RNA overexpression in subsets of tumors relative to adjacent normal tissue. Showing that this study can be reproduced would aid other researchers who are attempting to identify oncogenes in other cancer types using the same methodology.
It would be a great help to have the scientific record I've published checked independently, so that errors, if there are any, can be corrected. Also, if you could give me feedback, I will learn how to share the data in a way that is more accessible to others.
I tried hard to make this paper as reproducible as possible, but as techniques and dependencies become more complex, it is hard to make it 100% clear. Any form of feedback is more than welcome.