Review of
"Measuring the impact of COVID-19 vaccine misinformation on vaccination intent in the UK and USA"

Submitted by iguellil  

Nov. 23, 2023, 4:04 p.m.

Lead reviewer

juillermo

Review team members

abulhasan iguellil

Review Body

Reproducibility

Did you manage to reproduce it?
Partially Reproducible
Reproducibility rating
How much of the paper did you manage to reproduce?
8 / 10
Briefly describe the procedure followed/tools used to reproduce it

We re-ran the Jupyter notebooks in a virtual environment.

Briefly describe your familiarity with the procedure/tools used by the paper.

Familiar with python, jupyter notebook, and pystan.

Which type of operating system were you working in?
Linux/FreeBSD or other Open Source Operating system
What additional software did you need to install?

All the python packages.

What software did you use

python, conda, and jupyter notebook

What were the main challenges you ran into (if any)?
  • We did not manage to install pystan on Windows, so we could not reproduce the analysis on that platform
  • pystan conflicted with newer versions of Python, the error message was not very clear, and the repository had no information about package versions (e.g. no 'requirements.txt' file)
  • We could not install 'pyreadstat' on either Windows or Linux
What were the positive features of this approach?

The Jupyter notebooks had a lot of descriptions, and the code was well structured (in files, folders, etc).

Any other comments/suggestions on the reproducibility approach?

Extra notes:

-> The repository gives no information about packages (no 'requirements.txt', environment file, etc.), not even the Python version. We created a conda environment and installed packages one by one as 'ModuleNotFoundError' errors requested them.
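
Instead of discovering dependencies one failed import at a time, the imports can be listed up front. Below is a minimal sketch using only the standard library. The function name `third_party_imports` is hypothetical and not part of the repository, and the result contains import names, which can differ from the names of the packages to install.

```python
import ast
import sys
from pathlib import Path


def third_party_imports(root):
    """Return the sorted top-level module names imported under `root`,
    excluding the standard library and modules defined under `root` itself."""
    root = Path(root)
    files = list(root.rglob("*.py"))
    # Names of local modules and packages, so they are not reported as dependencies.
    local = {p.stem for p in files} | {p.name for p in root.rglob("*") if p.is_dir()}
    # sys.stdlib_module_names exists from Python 3.10; on older versions
    # standard-library modules are not filtered out.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    found = set()
    for path in files:
        tree = ast.parse(path.read_text(encoding="utf-8"))
        # ast.walk also visits imports nested inside functions.
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                found.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return sorted(found - set(stdlib) - local)
```

Calling `third_party_imports("src")` from the repository root would list what the `src` folder imports. Notebooks are not covered by this sketch, since it only parses `.py` files.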

When running 'tables_figures.ipynb' -> paper.causal_effects(tables), paper.determinants(), we get the following warnings:

/home/guillermo/Git/covid19-misinfo/src/utils.py:597: UserWarning: set_ticklabels() should only be used with a fixed number of ticks, i.e. after set_ticks() or using a FixedLocator.
  ax.yaxis.set_ticklabels(reversed(ticks))
findfont: Generic family 'sans-serif' not found because none of the following families were found: Arial

When installing 'pystan' with conda:

"""
Your python: python=3.11

If python is on the left-most side of the chain, that's the version you've asked for. When python appears to the right, that indicates that the thing on the left is somehow not available for the python version you are constrained to. Note that conda will not change your python version to a different minor version unless you explicitly specify that.

The following specifications were found to be incompatible with your system:

  • feature:/linux-64::__glibc==2.31=0
  • feature:|@/linux-64::__glibc==2.31=0
  • pystan -> libgcc-ng[version='>=7.3.0'] -> __glibc[version='>=2.17']

Your installed version is: 2.31
"""

'pystan' does not support Python 3.11 (only up to 3.10.0a0), so I downgraded Python to 3.9.

tables_figures.ipynb

social = paper.determinants(tables, 'social', subset='Social media usage')
AttributeError: property 'name' of 'Index' object has no deleter
File ~/Git/covid19-misinfo/src/utils.py:911, in subset_df(df, atts, reset, save)
--> 911 del out.index.name

This line does not work and it is hard to understand what the authors wanted to do with it, so we just commented it out.

image = paper.image_impact(tables)
AttributeError: property 'name' of 'Index' object has no deleter
File ~/Git/covid19-misinfo/src/paper.py:118, in image_impact..set_index(df, index)
--> 118 del df.index.name

This line does not work either, for the same reason, so we also commented it out.
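
One possible reading, not confirmed with the authors, is that both 'del ... .index.name' lines were meant to clear the index name, which older pandas versions accepted via 'del'. Under that assumption, here is a minimal sketch of an equivalent that works in current pandas. The frames and names are illustrative and not taken from the repository.

```python
import pandas as pd

df = pd.DataFrame({"value": [1, 2]}, index=pd.Index(["a", "b"], name="group"))

# `del df.index.name` raises AttributeError in recent pandas.
# Assigning None clears the name in place instead.
df.index.name = None

# Alternatively, rename_axis returns a new frame and leaves the original untouched.
df2 = pd.DataFrame({"value": [1, 2]}, index=pd.Index(["a", "b"], name="group"))
cleared = df2.rename_axis(None)
```

Whether the cleared name matters for the output is not something we checked. Commenting the lines out was enough for the notebook to run.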

With these two lines commented out, this whole notebook was successfully reproduced.

statistical_analyses.ipynb

'fit_impact_causal_self = mo.model_impact_causal(df)' -> produces many warnings, although the notebook states: "This model seems to have run without any serious stan warnings about the NUTS sampler misbehaving." The same happens with 'fit_impact_causal_others = mo.model_impact_causal(df, kind='others')' and 'fit_socdem_causal_self = mo.model_socdem(df, dd, kind='self')'.

'ut.plot_causal_flow(fit2stats_impact_causal_self)' -> NameError: name 'fit2stats_impact_causal_self' is not defined. I assume they meant 'fit2stats_impact_causal', so I ran it with that one, but I only got a blank white panel, no figure.

import_data.ipynb

dd = import_datadict() --> FileNotFoundError: [Errno 2] No such file or directory: './dat/orb_datadict.txt'

This file does not exist in the repository, and we did not find any other file likely to contain similar information.

df = import_data() requires 'pyreadstat'. Installing it with 'conda install pyreadstat' fails:

"""
PackagesNotFoundError: The following packages are not available from current channels:
  - pyreadstat
"""


Documentation

Documentation rating
How well was the material documented?
6 / 10
How could the documentation be improved?
  • README is lacking package information
  • jupyter notebooks had good documentation, but functions did not
What do you like about the documentation?

The details in the jupyter notebooks.

After attempting to reproduce, how familiar do you feel with the code and methods used in the paper?
5 / 10
Any suggestions on how the analysis could be made more transparent?

The familiarity rating above is mostly because we did not spend the time reading the details; there seem to be enough descriptions in the notebooks.


Reusability

Reusability rating
Rate the project on reusability of the material
8 / 10
Permissive Data license included:  
Permissive Code license included:  

Any suggestions on how the project could be more reusable?
  • provide requirements of packages and their versions
  • move the import statements to the top of files to make it easy to know the packages required
  • provide the data in a more sustainable format (one that does not require pyreadstat)
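
For the first suggestion, one possible way to record versions from a working environment is sketched below with the standard library's `importlib.metadata`. `pin_requirements` is a hypothetical helper, and the package list in the usage note is only an example.

```python
from importlib import metadata


def pin_requirements(packages):
    """Return 'name==version' lines for the installed distributions in `packages`.
    Packages that are not installed are reported as comment lines."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name}: not installed in this environment")
    return lines
```

For example, `print("\n".join(pin_requirements(["pandas", "pystan", "pyreadstat"])))` would give the body of a 'requirements.txt' file for the environment in which the notebooks last ran. The Python version still needs to be stated separately, for instance in the README.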


Any final comments
  • Very good use of jupyter notebooks
  • Reliance on pystan is problematic (this package is not well maintained)
  • other smaller issues, but generally good