Review of
"PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals"

Submitted by gmontana74  

April 26, 2022, 12:20 p.m.

Lead reviewer

AlexBeesonWarwick

Review team members

direland3 gmontana74 claudia.viaro@warwick.ac.uk MIANCHU

Review Body

Reproducibility

Did you manage to reproduce it?
Fully Reproducible
Reproducibility rating
How much of the paper did you manage to reproduce?
10 / 10
Briefly describe the procedure followed/tools used to reproduce it

Our plan was to (a) reproduce the results locally using a single GPU, (b) reproduce the results locally using multiple GPUs, and (c) reproduce the results on Sulis using both single and multiple GPUs.

We began by installing the necessary software/modules locally, as per the authors’ GitHub. We started with a fresh virtual environment to mimic a new user’s experience. We then ran the training files with the authors’ recommended hyperparameters and reproduced the results of the paper.

As PlanGAN is an ensemble approach, a natural progression is to train the models in parallel on multiple GPUs (as opposed to sequentially on a single GPU). We attempted to modify the vanilla PlanGAN code to use two GPUs and PyTorch’s multiprocessing functionality but were unsuccessful. Our main issue was synchronising training across the two GPU devices (PyTorch requires tensors to be on the same device when performing operations).
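The same-device constraint mentioned above can be illustrated with a minimal sketch. This is not the PlanGAN code and not the modification we attempted: the linear models, batch shapes and the helper `pick_devices` are hypothetical stand-ins. It only shows one common pattern, which is to keep each ensemble member on its own device and move every input to that device before use. It does not cover process-level parallelism with PyTorch’s multiprocessing.

```python
import torch
import torch.nn as nn


def pick_devices(n_models):
    """Assign one device per ensemble member (hypothetical helper).

    Cycles over the available GPUs, or falls back to the CPU.
    """
    if torch.cuda.is_available():
        n_gpus = torch.cuda.device_count()
        return [torch.device(f"cuda:{i % n_gpus}") for i in range(n_models)]
    return [torch.device("cpu")] * n_models


n_models = 2
devices = pick_devices(n_models)

# Toy stand-ins for the ensemble members; each lives on its own device.
models = [nn.Linear(4, 2).to(dev) for dev in devices]
optims = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in models]

# A shared batch created on the CPU.
batch = torch.randn(8, 4)
target = torch.randn(8, 2)

losses = []
for model, optim, dev in zip(models, optims, devices):
    # Move inputs to the model's device so every operation sees
    # tensors on a single device.
    x, y = batch.to(dev), target.to(dev)
    loss = nn.functional.mse_loss(model(x), y)
    optim.zero_grad()
    loss.backward()
    optim.step()
    losses.append(loss.detach())

# Bring the per-model losses back to one device before combining them.
mean_loss = torch.stack([l.cpu() for l in losses]).mean()
```

Combining results from different members (here, the mean loss) is the point at which tensors from different devices meet, so they are moved to a common device first.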

Finally, we attempted to replicate the procedure outlined in (a) on Sulis but were unsuccessful. The main issues here were (i) accessing Sulis and (ii) installing the module MuJoCo (see “main challenges” for more details).

Briefly describe your familiarity with the procedure/tools used by the paper.

Our team had three levels of experience:

(a) Complete experience with the methods in the paper and the exact procedure/tools used

(b) Some experience with the methods in the paper and general experience with the procedure/tools (i.e. use of PyTorch/MuJoCo but not the PlanGAN-specific code)

(c) No experience with the methods in the paper or the procedure/tools used

Which type of operating system were you working in?
Linux/FreeBSD or other Open Source Operating system
What additional software did you need to install?

As per the authors’ GitHub, we required the following:

• Mujoco-py

• PyTorch

• NumPy

• scikit-learn (sklearn)

• Joblib

What software did you use

All of the above.

The programming language was Python (v3.8).

What were the main challenges you ran into (if any)?

When reproducing the results locally we ran into only one minor issue, which we believe stems from updates to PyTorch (see “final comments” for specifics).

When reproducing the results on Sulis, our main challenges were (a) accessing Sulis off campus and (b) installing MuJoCo on Sulis.
All but one member of the team had issues accessing Sulis. We found out later (unfortunately too late) that Sulis cannot be accessed off campus through the Warwick VPN; instead, access requires going through Godzilla with an SCRTP account.

As for MuJoCo, although the module installed successfully, we encountered numerous common errors when trying to import environments/tasks. There are many solutions posted on GitHub, but we only had time to try a handful and none of them worked for us.

What were the positive features of this approach?

Assuming the requirements can be installed and used (in particular MuJoCo), it is very straightforward to clone the GitHub repository and run the code.

Any other comments/suggestions on the reproducibility approach?

Documentation

Documentation rating
How well was the material documented?
7 / 10
How could the documentation be improved?

Adding a .txt file that details the versions of modules used

Adding links to the “Requirements” listed in the README.md

Adding train.py files with hyperparameters already set for tasks (to avoid having to manually enter these).

Adding more comments to the code
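As an illustration of the first suggestion, a versions file could be generated from a working environment with a short script. This is a minimal sketch using only the Python standard library (`importlib.metadata`, available from Python 3.8). The distribution names in `PACKAGES` are our assumption of how the listed requirements are named on PyPI, and `pinned_requirements` is a hypothetical helper, not part of the authors’ repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed PyPI distribution names for the requirements listed in the README.
PACKAGES = ["mujoco-py", "torch", "numpy", "scikit-learn", "joblib"]


def pinned_requirements(packages):
    """Return one 'name==version' line per installed package.

    Packages missing from the current environment are reported as
    comment lines rather than raising an error.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            lines.append(f"# {name}: not installed in this environment")
    return lines


if __name__ == "__main__":
    # Redirect this output to a file such as requirements.txt.
    print("\n".join(pinned_requirements(PACKAGES)))
```

Running the script in the environment that produced the paper’s results and committing its output would let new users recreate the same versions.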

What do you like about the documentation?

File and folder structure on GitHub is logical

Instructions for running code (i.e. training agents) are clear

After attempting to reproduce, how familiar do you feel with the code and methods used in the paper?
10 / 10
Any suggestions on how the analysis could be made more transparent?

Reusability

Reusability rating
Rate the project on reusability of the material
10 / 10
Permissive Data license included:  
Permissive Code license included:  

Any suggestions on how the project could be more reusable?


Any final comments

Following on from “main challenges”, here is the minor bug and our fix.

In the networks.py file, in the TrajGeneratorFC class the line:

states[..., 9:] = states[..., 9:] + state_noise*torch.randn_like(states[..., 9:])

causes an in-place operation error. The fix is to replace it with the following three lines:

avoid_inplace_operation = torch.randn_like(states)
avoid_inplace_operation[..., :9] = 0
states = states + state_noise*avoid_inplace_operation
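The difference between the two versions can be reproduced outside PlanGAN with a toy example. This is a sketch under assumptions: the 12-dimensional tensor, the `tanh` activation and the function names are ours, chosen only because the backward pass of `tanh` reuses its output, so overwriting part of that output in place invalidates the saved value. Whether the original line fails in a given setup depends on the PyTorch version and on what autograd has saved.

```python
import torch


def add_noise_inplace(states, state_noise):
    # Original pattern: writes into a slice of `states` in place.
    states[..., 9:] = states[..., 9:] + state_noise * torch.randn_like(states[..., 9:])
    return states


def add_noise_out_of_place(states, state_noise):
    # Fixed pattern: build a noise tensor that is zero on the first
    # nine dimensions, then create a new tensor instead of overwriting.
    noise = torch.randn_like(states)
    noise[..., :9] = 0
    return states + state_noise * noise


w = torch.ones(12, requires_grad=True)

# In-place version: autograd detects that the saved tanh output changed.
try:
    out = add_noise_inplace(torch.tanh(w), 0.1)
    out.sum().backward()
    inplace_failed = False
except RuntimeError:
    inplace_failed = True

# Out-of-place version: the backward pass completes normally.
out = add_noise_out_of_place(torch.tanh(w), 0.1)
out.sum().backward()
```

In both versions the noise is applied only to dimensions 9 onwards; the fix changes how the result is stored, not which dimensions are perturbed.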