Review of "The viewing angle in AGN SED models, a data-driven analysis"

Submitted by japhir  

Nov. 18, 2021, 3:52 p.m.

Lead reviewer

japhir

Review team members

sheebasamuel, igordub, harrychown

Review Body

Reproducibility

Did you manage to reproduce it?
Partially Reproducible
Reproducibility rating
How much of the paper did you manage to reproduce?
9 / 10
Briefly describe the procedure followed/tools used to reproduce it

First we ran the code via the included Binder. Then we deleted the raw data, re-downloaded it, and tried to re-create the figures from scratch. We also tried to install the package locally.

Briefly describe your familiarity with the procedure/tools used by the paper.

Different for each reviewer

  • Ilja: not familiar with Python or Jupyter, but has transferable skills from R, org-mode, and GNU/Linux.
  • Harry: familiar with Python and a little with Jupyter, but zero experience with the physics modules.
  • Sheeba: familiar with Python, Jupyter, and Binder, but zero experience with the physics modules.
  • Igor: familiar with Python and Jupyter notebooks and has used physics Python packages, but not familiar with Binder.
Which type of operating system were you working in?
Linux/FreeBSD or other Open Source Operating system
What additional software did you need to install?

To run locally, we needed to install:

  • git
  • jupyterlab
  • anaconda
  • the environment packages, installed with conda env create -f environment.yml
What software did you use?

See above.

What were the main challenges you ran into (if any)?

Notebook 5 is intentionally missing, but this was unclear from the Binder. There is no master script linking the notebook code together. We were also unable to reproduce the data without the raw files: we deleted the original data and tried to re-create it from the download step, without success.

What were the positive features of this approach?

Great use of the Binder framework for sharing the code and documentation; we loved the pre-rendered HTML files. Setting it up as a Binder that doesn't require the user to install anything locally was a great way to make the analysis quickly reproducible for everyone! The fact that the files were neatly split into separate analysis chunks was useful, and the step-by-step file provided a great overview linking everything together.

Any other comments/suggestions on the reproducibility approach?

We didn't find the Binder button immediately, so it was not obvious that we could inspect everything in an online pre-configured environment. There were also no guidelines on how to run the analysis locally: which commands do you need to run to install the dependencies via the environment.yml file, and how do you start JupyterLab after installing the dependencies locally?


Documentation

Documentation rating
How well was the material documented?
8 / 10
How could the documentation be improved?

The hardest part was figuring out which notebook does what, and why. The README nicely describes all the tools that were used, but did not include a big-picture description of what problem the project is solving. The step-by-step file was nice, but lacked a more general introduction as well.

  • It would be nice to 0-pad the notebook filenames (e.g. 01_ instead of 1_) so that they are sorted correctly by default.
  • Adding a master file that runs the different notebooks would be nice (see the sketch after this list).
  • It could perhaps be merged with the step-by-step file.
  • Furthermore, this would make it clearer that the code for step 5 is not in this repo. It would also be possible to add that code to the repo, but set all its code chunks to not evaluate, or to provide a warning that running this part of the analysis requires HPC.
  • Each notebook could provide more details on its goal and background.
  • It was not immediately clear how to run the environment (we found the Binder button only after some time). The hyperlink in the text points to Binder's homepage rather than to the project's own Binder environment.
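
To make the master-file suggestion concrete, here is a minimal sketch of what such a script could look like, assuming the notebooks sit in the repository root and their (0-padded) filenames sort in execution order. The run_all.py name, the glob pattern, and the skip rule for the HPC-only step 5 are our assumptions, not the authors' code:

    # run_all.py -- hypothetical master script (a sketch, not the authors' code)
    import subprocess
    from pathlib import Path

    for nb in sorted(Path(".").glob("*.ipynb")):
        if nb.name.startswith("5"):
            # Step 5 requires HPC resources; warn and skip instead of failing.
            print(f"Skipping {nb.name}: this step requires an HPC cluster.")
            continue
        # Re-execute the notebook in place; check=True aborts on the first error.
        subprocess.run(
            ["jupyter", "nbconvert", "--to", "notebook",
             "--execute", "--inplace", str(nb)],
            check=True,
        )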
What do you like about the documentation?

The documentation was good: the authors have shown how each script is pieced together, with a brief description of each step.

  • We enjoyed the step-by-step document and the way the analysis is split into separate notebooks, each with a single task. The use of autopep8 to format the code in the notebooks is good.
  • We like the inclusion of rendered html files and intermediate output, which allowed us to quickly inspect everything without needing to run the chunks.
After attempting to reproduce, how familiar do you feel with the code and methods used in the paper?
7 / 10
Any suggestions on how the analysis could be made more transparent?

Very transparent paper and code; no corners have been cut or hidden from public view.


Reusability

Reusability rating
Rate the project on reusability of the material
2 / 10
Permissive Data license included:  
Permissive Code license included:  

Any suggestions on how the project could be more reusable?
  • Subdirectories inside the Data/{Raw,Interim,Final,Complementary} directories, e.g. Data/Interim/CIGALEOutputs, should be created while the scripts are running, in case the user deletes them (see the sketch after this list).
  • Hard-coded filenames containing dates (Notebook 2_Clean_Sample.ipynb, cell 3).
  • Some galaxies are empty, causing ValueErrors; if-statements should guard against this (Notebook 2_Clean_Sample.ipynb, cell 15).
  • Switch to semantic versioning
  • Add a Changelog
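
As a sketch of the first and third points, in Python: ensure_dirs and keep_nonempty are hypothetical names of ours, the directory list is taken from the comment above, and the real fix would live inside the notebooks themselves.

    from pathlib import Path

    def ensure_dirs():
        # Recreate the Data subdirectories the notebooks expect, so a user
        # who has deleted them can still run everything; parents=True and
        # exist_ok=True make the call safe to repeat.
        for sub in ("Raw", "Interim/CIGALEOutputs", "Final", "Complementary"):
            Path("Data", sub).mkdir(parents=True, exist_ok=True)

    def keep_nonempty(samples):
        # Drop empty galaxy tables before further processing, so the
        # ValueError in Notebook 2, cell 15 is never triggered.
        return {name: table for name, table in samples.items() if len(table) > 0}

Calling something like ensure_dirs() at the top of each notebook would make the intermediate directories self-healing.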


Any final comments

We'd like to thank the authors for putting so much effort into making their workflow reproducible! They went above and beyond, and the comments above are mostly nitpicks about the few things that keep the workflow from being fully reproducible from scratch.