I downloaded and installed RStudio to replicate the figures.
I also set up a project in Renku as an independent check to ensure the difficulties I had were not due to my own system.
Not very familiar. I have used R in the past but not regularly. Fortunately it is easy to understand.
I installed RStudio (version 2024.04.2+764) and used Rscript version 3.4.0.
I also had to install the corrplot package through RStudio.
Then I had to install the seqLogo package using the instructions here: https://bioconductor.org/packages/release/bioc/html/seqLogo.html
RStudio
Renku
Figure 2: the code references just the dm6 matrices and not the hg19 matrices, so when I ran the fig2.R script three of the six plots were incorrect.
Figure 5 in the paper does not correspond with Fig5 in the repository. The correlation plot is the same as the one generated in Fig4, and there are two additional bar plots created that do not appear to be in the paper.
In the clusteringAlgorithms directory, the driver.sh script never completes because the seq2mono.pl calls hang. According to top, they use 0 CPU and 0 memory, and apart from creating an empty .dat output file nothing appears to happen. I left it running for several hours both on my system and on Renku with no result.
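One way to tell a stalled process from a slow one is to run the command with standard input closed and a time limit. The sketch below is an assumption, not something taken from the repository: `run_with_timeout` is a hypothetical helper, and the actual seq2mono.pl arguments would have to be copied from driver.sh. A process that uses no CPU and writes nothing may be blocked waiting for input, and closing stdin could help confirm or rule that out.

```python
import subprocess


def run_with_timeout(cmd, seconds):
    """Run cmd with stdin closed and a time limit.

    Returns ("finished", returncode) or ("timed_out", None).
    Hypothetical helper for diagnosis; not part of the repository.
    """
    try:
        result = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,  # a script reading stdin gets end-of-file at once
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=seconds,           # the child is killed when the limit is reached
        )
    except subprocess.TimeoutExpired:
        return "timed_out", None
    return "finished", result.returncode


# Example call (arguments are placeholders to be taken from driver.sh):
# status, code = run_with_timeout(["perl", "seq2mono.pl"], 60)
```

If the call returns "finished" almost immediately, the script was probably waiting on standard input; if it still returns "timed_out", the cause likely lies elsewhere.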
Generally a well-documented repository, with helpful README files at each level.
Test the scripts provided and check the output corresponds with the paper figures.
If the long clustering scripts can be run in parallel, please indicate this or provide a --threads or --cpus option.
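To illustrate the kind of option meant here, the following is a minimal sketch that assumes the clustering runs are independent of one another, which the repository does not state. It is written in Python rather than the repository's shell and Perl, and `run_jobs` and `parse_args` are hypothetical names.

```python
import argparse
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_jobs(commands, threads):
    """Run each command as a subprocess, at most `threads` at a time.

    Returns the exit codes in the same order as `commands`.
    """
    def run_one(cmd):
        return subprocess.run(cmd).returncode

    # Threads are sufficient here because the real work happens in the
    # child processes; each thread only waits for its child to exit.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(run_one, commands))


def parse_args(argv=None):
    """Parse a --threads option, defaulting to one job at a time."""
    parser = argparse.ArgumentParser(description="Run independent jobs in parallel.")
    parser.add_argument("--threads", type=int, default=1,
                        help="number of jobs to run at the same time")
    return parser.parse_args(argv)
```

A driver would build the list of commands, call `parse_args()`, and pass `args.threads` to `run_jobs`. With the default of 1 the jobs run one after another, so existing behaviour is unchanged unless the user asks for more.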
It is very good in general. The top-level README could include more information on what order to run the various scripts and how the subdirectories tie together.
I liked the modular approach to documentation, where each subfolder had its own README with instructions on how to reproduce just the data in the subfolder.
I don't use R regularly, but I could understand the R scripts. The Perl scripts are harder to understand for someone who doesn't use Perl. Perhaps explain what the Perl commands are doing.
Nice repository structure. Clearly a lot of effort has been put into making the work reproducible. I think it needs a little more testing to iron out the remaining issues.