Review of "A multiscale Bayesian inference approach to analyzing subdiffusion in particle trajectories"

Submitted by sjvrijn  

Nov. 18, 2021, 11:38 p.m.

Lead reviewer

sjvrijn

Review Body

Reproducibility

Did you manage to reproduce it?
Not Reproducible
Reproducibility rating
How much of the paper did you manage to reproduce?
3 / 10
Briefly describe the procedure followed/tools used to reproduce it

Installation:

$ wget https://zenodo.org/record/162171/files/bayesian_inference_fbm.ap
$ python -m venv venv
$ source venv/bin/activate      # Ubuntu
$ source venv/Scripts/activate  # Windows
(venv)$ pip install numpy matplotlib h5py tempdir activepapers.py
[ output omitted ]
(venv)$ aptool ls
code/generate_report
code/inference_convergence
code/lipid_analysis_l=10
code/lipid_analysis_l=100
code/lipid_analysis_l=200
code/lipid_analysis_l=50
code/python-packages/fbm
...

After this, I effectively got stuck.

The extraction worked, but resulted in a number of skips:

$ aptool checkout
Skipping /code/python-packages/literate_python: data type reference not extractable
Skipping /code/python-packages/unit_tests: data type reference not extractable
Skipping /data/long_time_trajectory: data type reference not extractable
Skipping /data/parameters/alpha_grid: data type data not extractable
Skipping /data/parameters/alpha_in: data type data not extractable
Skipping /data/parameters/n_traj_convergence: data type data not extractable
Skipping /data/parameters/n_traj_ml_estimate: data type data not extractable
Skipping /data/parameters/trajectory_lengths: data type data not extractable
Skipping /data/short_time_trajectory: data type reference not extractable

On Ubuntu 20.04 (in WSL), I was able to update some parameters in code/set_parameters.py and have aptool recompute the results that depend on them:

$ aptool checkout code
$ vim code/set_parameters.py  # make changes
$ aptool checkin code
$ aptool update -v
Dataset /data/parameters/alpha_in is stale or dummy, running /code/set_parameters
Dataset /documentation/inference_convergence/convergence_fbm_l=100.pdf is stale or dummy, running /code/inference_convergence
Dataset /documentation/short_time_modification/subsampling_convergence_l=100.txt is stale or dummy, running /code/short_time_modification

I don't know exactly what this changed or updated, and I was unable to successfully perform any other aptool runs.
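
For readers who, like me, had never seen ActivePapers before: parameter scripts such as /code/set_parameters are ordinary Python "calclets" that write datasets through the activepapers.contents API. Below is a rough sketch of what such a script could look like; it is not the authors' actual code, the values are made up, and only the dataset names are taken from the checkout listing above.

import numpy as np
from activepapers.contents import data

# Sketch only: illustrative values; dataset names copied from the `aptool checkout` output.
data['parameters/alpha_in'] = 0.7                          # subdiffusion exponent used for the synthetic trajectories
data['parameters/alpha_grid'] = np.linspace(0.1, 1.0, 91)  # grid of alpha values over which the posterior is evaluated
data['parameters/n_traj_convergence'] = 100                # number of trajectories for the convergence study
data['parameters/n_traj_ml_estimate'] = 100                # number of trajectories for the ML estimate
data['parameters/trajectory_lengths'] = np.array([10, 50, 100, 200])  # matches lipid_analysis_l={10,50,100,200}

Changing any of these values and checking the file back in is what makes the corresponding datasets "stale", so that aptool update reruns the scripts that depend on them.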

Briefly describe your familiarity with the procedure/tools used by the paper.

I am very familiar with Python in general, but had never heard of 'ActivePapers.py' before.

Which type of operating system were you working in?
Windows Operating System
What additional software did you need to install?

I only had to pip install ActivePapers.py and its requirements; no further software was needed (apart from the Windows Subsystem for Linux).

What software did you use

Python 3.9

What were the main challenges you ran into (if any)?

On Windows, all attempts at aptool run <script> failed with import errors. The ActivePapers tool seems to run into problems importing the local module files from the code/python-packages folder. This cost me a lot of time trying workarounds before I finally switched to Ubuntu (WSL).

The literate_python and unit_tests files in code/python-packages/ seem to be missing. As noted earlier, they are skipped during checkout and only show up as empty files. This caused an error when trying to aptool run any of the lipid_analysis_l={10,50,100,200} files:

$ aptool run lipid_analysis_l=10
Traceback (most recent call last):
  File "<bayesian_inference_fbm.ap>:/code/lipid_analysis_l=10", line 5, in <module>
...
  File "<bayesian_inference_fbm.ap>:/code/python-packages/gaussian_processes", line 12, in <module>
ModuleNotFoundError: No module named 'unit_tests'
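
The only workaround I could think of, but did not manage to verify, would be to drop in a stub module after aptool checkout code, e.g. a file code/python-packages/unit_tests.py with roughly the content below, followed by aptool checkin code before retrying the run. Since the skip messages above say these entries are stored as references (presumably to another ActivePaper), I am not sure a locally checked-in stub would even be picked up; the proper fix is for the authors to repair the archive.

# Hypothetical stub for the missing unit_tests module -- NOT the original.
# It only lets `import unit_tests` succeed; if gaussian_processes actually
# calls anything from this module, those names would have to be stubbed too.
def run(*args, **kwargs):
    pass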

What were the positive features of this approach?

I really like the idea of having this kind of all-in-one file with logs, reproducibility and dependency checks built-in!

The aptool commands are not difficult and are quick to learn.

Any other comments/suggestions on the reproducibility approach?

My main suggestion would be better documentation: at the very least a README alongside the .ap file on Zenodo with basic instructions (a rough sketch of what I have in mind follows after this list). For example:

  • Installation of ActivePapers
  • Which scripts to run for which figures
  • Basic aptool commands on
    • how to (re)run everything
    • how/where to change parameters and run relevant scripts
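
To make that concrete, even a README as minimal as the sketch below would have saved me most of the trial and error. It is assembled purely from the commands I ended up using in this review, so it may well be incomplete or wrong in places; only the authors can provide the mapping from scripts to the figures in the paper.

  Reproducing this paper (sketch)

  pip install numpy matplotlib h5py tempdir activepapers.py
  wget https://zenodo.org/record/162171/files/bayesian_inference_fbm.ap
  aptool ls                        # list the code and data inside the .ap file
  aptool run lipid_analysis_l=10   # likewise for l=50, l=100, l=200
  # To change parameters:
  aptool checkout code             # edit code/set_parameters.py, then:
  aptool checkin code
  aptool update -v                 # recompute everything that depends on the change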

Finally, the documentation of the ActivePapers tool itself should also be improved, but since it seems to be a (partially) abandoned project, I understand that is unlikely to happen.


Documentation

Documentation rating
How well was the material documented?
3 / 10
How could the documentation be improved?

Documentation could be better across the board; see my suggestions in the reproducibility section above: a README alongside the .ap file, instructions on which scripts produce which figures, and basic aptool usage.

What do you like about the documentation?

Separate installation instructions and a basic tutorial using a public example were available; they were pleasantly written and genuinely helpful in getting started.

After attempting to reproduce, how familiar do you feel with the code and methods used in the paper?
3 / 10
Any suggestions on how the analysis could be made more transparent?

Reusability

Reusability rating
Rate the project on reusability of the material
5 / 10
Permissive Data license included:  
Permissive Code license included:  

Any suggestions on how the project could be more reusable?

The previously suggested addition of a README, plus checking the .ap file for the missing code files and restoring them.



Any final comments

I agree with the author that this is an interesting test case in longevity. I feel it got very close to being an easily reproducible paper, despite the limited documentation and the age of the ActivePapers tool used.