Review of "Supercurrent-induced Majorana bound states in a planar geometry"

Submitted by l.sieben  

June 25, 2024, 2:44 p.m.

Lead reviewer

l.sieben

Review team members

FourGreenFields

Review Body

Reproducibility

Did you manage to reproduce it?
Partially Reproducible
Reproducibility rating
How much of the paper did you manage to reproduce?
7 / 10
Briefly describe the procedure followed/tools used to reproduce it

For the plots:

  1. Set up and activated the environment as described in the README: conda env create -f environment.yml && conda activate phasemajoranas
  2. Started the Jupyter server: jupyter notebook
  3. Ran all cells of paper-figures.ipynb

For generating the data:

  1. Loaded the environment as above
  2. Deleted files in data/
  3. Tried running paper-figures.ipynb, which failed because of missing dependencies. Installed hpc05 and pinned ipyparallel to 6.2.4, because newer versions produced errors and 6.2.4 was the latest version at the time of the release: conda install ipyparallel=6.2.4 hpc05
  4. Modified the notebook (cell 2) to match the parameters of our cluster (different working directory, SLURM as the scheduler); see the sketch below.
  5. Running on the cluster failed for unknown reasons. Locally (without SLURM), the computations ran with the above modifications, but we stopped them at cell 14 because of the long runtime.
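For reference, a minimal sketch of the kind of change we made in cell 2, assuming the original notebook obtains its ipyparallel client through hpc05 (the profile name and engine count below are placeholders for our cluster, not part of the authors' code):

```python
# Engines were launched under SLURM beforehand, e.g. with:
#   ipcluster start --profile=slurm -n 100
import ipyparallel

# Connect a plain ipyparallel client instead of the hpc05 client,
# using a hypothetical "slurm" profile configured for our cluster
client = ipyparallel.Client(profile="slurm")
lview = client.load_balanced_view()
print(f"Connected to {len(client.ids)} engines")
```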
Briefly describe your familiarity with the procedure/tools used by the paper.

One of us has worked extensively with Python, conda, and Jupyter notebooks, as well as ipyparallel, for research in physics (complex systems). None of us has expertise with SLURM or with running software on the university cluster, and none of us had used kwant or adaptive before.

Which type of operating system were you working in?
Linux/FreeBSD or other Open Source Operating system
What additional software did you need to install?
  • Python
  • Miniconda/Anaconda
  • Conda packages not listed in the environment.yml: hpc05, ipyparallel=6.2.4
What software did you use?
  • Jupyter Notebooks in Visual Studio Code/Firefox
  • Miniconda/Anaconda
What were the main challenges you ran into (if any)?
  • Installing Python and conda
  • Versions of subdependencies were not pinned in the environment.yml
  • We encountered a MatplotlibDeprecationWarning, which pinning the subdependency versions might have prevented
  • Bugfixing for generating the data:
      • Installing the additional hpc05 package, which was not included in the environment.yml
      • Finding the correct ipyparallel version that is compatible with hpc05
  • Running the code on the cluster:
      • Missing instructions on how to run the code on a cluster (or how to generate the data in general)
      • Integrating the cluster code with the SLURM scheduler (see the configuration sketch after this list)
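For anyone attempting the same, a sketch of the ipyparallel profile configuration we experimented with for the SLURM launchers. Treat the file location and option values as assumptions specific to our cluster, not as the authors' setup:

```python
# ~/.ipython/profile_slurm/ipcluster_config.py, created with:
#   ipython profile create --parallel --profile=slurm
c = get_config()

# Use the SLURM batch launchers shipped with ipyparallel
c.IPClusterStart.controller_launcher_class = "SlurmControllerLauncher"
c.IPClusterEngines.engine_launcher_class = "SlurmEngineSetLauncher"

# Cluster-specific placeholders
c.SlurmLauncher.queue = "compute"
c.SlurmLauncher.timelimit = "24:00:00"
```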
What were the positive features of this approach?
  • Using conda provides an easy-to-set-up environment.
  • The repository was well structured with folders.
  • The Jupyter notebook structures the code and allows running only parts of it, which makes it easier to debug or to reproduce individual results.
  • A focus on FOSS software helps with reproducibility.
  • Providing the data output of the simulation allowed us to reproduce and understand the plots without needing access to a large computing cluster.
  • The hpc05 library seems like an ergonomic, easy-to-understand way to run simulations from a Jupyter notebook on a cluster.
Any other comments/suggestions on the reproducibility approach?
  • It would have been useful to provide information on how to run the code on the cluster.
  • Parameters for running on the cluster would have been useful: expected runtime, required/recommended hardware.
  • The complete environment should have been exported into the environment.yml using conda env export.
  • The Jupyter notebooks could have included provenance information (e.g. errors, warnings, runtimes).
  • .p as a file extension for Python pickle files is ambiguous; .pkl or .pickle would have been clearer.
  • Pickle files raise security concerns because they enable arbitrary code execution:

The pickle module is not secure. Only unpickle data you trust. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never unpickle data that could have come from an untrusted source, or that could have been tampered with.

From: https://docs.python.org/3/library/pickle.html
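A minimal, self-contained illustration of the concern; the command here is harmless, but it could be anything:

```python
import os
import pickle

class Malicious:
    # pickle calls __reduce__ to decide how to reconstruct the object;
    # returning (os.system, (cmd,)) makes unpickling execute cmd
    def __reduce__(self):
        return (os.system, ('echo "arbitrary code executed on unpickle"',))

payload = pickle.dumps(Malicious())
pickle.loads(payload)  # the shell command runs during unpickling
```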


Documentation

Documentation rating
How well was the material documented?
5 / 10
How could the documentation be improved?
  • Document how the computations can be run on the cluster, what needs to be changed in the code, how long they would take, etc.
  • Explain in the Jupyter notebook what each of the calculations does. Formulas for the calculations could also be provided, or at least links to the specific sections of the paper.
  • Document what each file in the data folder contains, how it is structured, and how it could be reused (see the sketch after this list).
  • Link the paper in the documentation, not just in the metadata.
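As a starting point, even a short inspection snippet in the README would help readers orient themselves. A sketch of what we used to look around (the file name is a hypothetical example, not a documented interface):

```python
import pickle

# Any of the .p files in data/ can be inspected the same way;
# only unpickle files from sources you trust
with open("data/example.p", "rb") as f:
    obj = pickle.load(f)

print(type(obj))
if isinstance(obj, dict):
    for key, value in obj.items():
        print(key, type(value), getattr(value, "shape", ""))
```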
What do you like about the documentation?
  • It is concise and sufficient to reproduce the plots from the paper
  • It explains what each of the scripts does, how the code is structured, and what results can be expected, which gives an immediate overview
After attempting to reproduce, how familiar do you feel with the code and methods used in the paper?
7 / 10
Any suggestions on how the analysis could be made more transparent?
  • None, other than our comments on the documentation above

Reusability

Reusability rating
Rate the project on reusability of the material
6 / 10
Permissive Data license included:  
Permissive Code license included:  

Any suggestions on how the project could be more reusable?
  • An explicit statement about the license of the data generated by the code.
  • Data and code are probably not meant to be stored together, as Zenodo appears to allow only one resource type per record (such as "dataset" or "software").
  • Define a data schema to aid in understanding the data and, potentially, in writing other programs that use it (see the sketch after this list).
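Such a schema could be as lightweight as a typed description shipped with the repository. A hypothetical sketch (the field names are invented for illustration and do not come from the actual data files):

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class PhaseDiagramResult:
    """Hypothetical schema for one simulation output file."""
    mu: float            # chemical potential
    b_field: np.ndarray  # swept magnetic field values
    phase: np.ndarray    # swept superconducting phase differences
    gap: np.ndarray      # computed gap on the (b_field, phase) grid
```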


Any final comments

Thanks for providing your paper to the ReproHack Project! It was a fun, if not easy, read. Overall, you did a good job of making the paper more reproducible. Generating the plots was very easy, but the test on the cluster showed that you didn’t test the code you submitted to Zenodo thoroughly enough before publishing. That little bit of extra work could be worth it, though, to allow others to profit even more from your research. In addition, you could’ve provided more metadata on your Zenodo entry, in particular a longer description and some keywords; this would allow others to find your code without needing to find your paper first. Finally, to show readers that they can easily recreate your results on their own machines, you should place a reference to your code and data more prominently in the paper, e.g. in a separate section at the end or in the appendix. On first read, we missed the reference completely.