No console messages xcmsSet (xcms v1.50.1)

Hi,

It is probably something trivial, but since updating to xcms v1.50.1 (where the deprecated nSlaves argument was replaced by the BPPARAM argument from BiocParallel) I no longer get the lovely, reassuring progress messages printed to the R console. The strange thing is that in previous versions of xcms the progress messages appeared soon after the xcmsSet function started, regardless of the number of mzXML files in the directory.
I have tried the progressbar argument of the SnowParam function, but the bar was stuck at 0% for over an hour.
With the ~200 mzXML files I am currently peak-picking, I did not receive any console message for over two hours:
metabForum_20170608.PNG
Then all of a sudden there were messages previously typical of a single-threaded process:
metabForum_20170608_2.PNG
However, I was able to confirm that the parallel processes were running by monitoring the CPU usage.

When the dataset is small, as is the case for faahKO, the progress messages appear much sooner. Here is a reproducible, though perhaps not very useful, example:
Code:
library(faahKO)
library(xcms)
library(BiocParallel)
library(snow)

## The directory with the NetCDF LC/MS files
cdfpath <- file.path(find.package("faahKO"), "cdf")

setwd(cdfpath)

snowparam <- SnowParam(workers = parallel::detectCores(), type = "SOCK")

peakmatrix <- xcmsSet(BPPARAM = snowparam)

Is it now necessary to DIY/jury-rig your own progressCallback function?

Code:
cdffiles <- list.files(cdfpath, recursive = TRUE)
progress <- function(n) cat(paste0(n, ' of ', length(cdffiles),
                                   ' complete (', basename(cdffiles)[n],
                                   ').\n'))
peakmatrix <- xcmsSet(BPPARAM = snowparam, progressCallback = progress)
Although this didn't work as expected either.

Many thanks in advance,

Will

Code:
>sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252 
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                         
[5] LC_TIME=English_United States.1252   

attached base packages:
[1] parallel  stats    graphics  grDevices utils    datasets  methods  base   

other attached packages:
[1] snow_0.4-2          BiocParallel_1.8.1  faahKO_1.14.0      xcms_1.50.1        Biobase_2.34.0   
[6] ProtGenerics_1.6.0  BiocGenerics_0.20.0 mzR_2.8.1          Rcpp_0.12.10     

loaded via a namespace (and not attached):
[1] RANN_2.5              lattice_0.20-34        codetools_0.2-15      MASS_7.3-45         
[5] MassSpecWavelet_1.40.0 grid_3.3.2            plyr_1.8.4            stats4_3.3.2         
[9] S4Vectors_0.12.1      Matrix_1.2-8          splines_3.3.2          RColorBrewer_1.1-2   
[13] tools_3.3.2            survival_2.40-1        multtest_2.30.0     

[attachment deleted by admin]

Re: No console messages xcmsSet (xcms v1.50.1)

Reply #1
Dear Will,

With the switch to BiocParallel, the progress information is no longer printed immediately. This seems to be due to the way BiocParallel handles the sub-processes. I know that is annoying, but there is not much we can do within xcms.

I'll get in contact with the BiocParallel developers to check if we can fix that.

cheers, jo

Re: No console messages xcmsSet (xcms v1.50.1)

Reply #2
I have now got an explanation from Martin Morgan (https://support.bioconductor.org/p/96856/). Basically, you can use the progress bar, but you have to increase the number of tasks so that it is updated more frequently. Note, however, that a) the number of tasks should not be larger than the number of files you're processing and b) performance might decrease with too many tasks.
So, in your case you could do the following:
Code:
library(faahKO)
library(xcms)
library(BiocParallel)
library(snow)

## The directory with the NetCDF LC/MS files
cdfpath <- file.path(find.package("faahKO"), "cdf")

setwd(cdfpath)

## Register the parallel processing setting - will be used by default by all xcms methods
## Set tasks to a reasonable number
register(SnowParam(tasks = 10, progressbar = TRUE))

peakmatrix <- xcmsSet()

Now, for your ~200-file experiment you might want to increase the number of tasks to get more frequent callbacks and updates of the progress bar.
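
To make the trade-off concrete, here is a rough sketch of how one might pick the number of tasks for a larger experiment. It assumes, as I understand the BiocParallel documentation, that the default (tasks = 0) splits the files into one chunk per worker and that the progress bar only moves when a whole chunk finishes. The helper plan_tasks, the 8 workers and the "5 tasks per worker" rule of thumb are all hypothetical choices for illustration, not part of xcms or BiocParallel.

```r
## Hypothetical helper, not part of xcms or BiocParallel.
## Returns a task count capped at the number of files, plus the number of
## files each task would then contain (i.e. how coarse the progress bar is).
plan_tasks <- function(n_files, n_workers, tasks_per_worker = 5L) {
  stopifnot(n_files >= 1, n_workers >= 1, tasks_per_worker >= 1)
  ## Never use more tasks than there are files
  n_tasks <- as.integer(min(n_files, n_workers * tasks_per_worker))
  list(tasks = n_tasks,
       files_per_task = ceiling(n_files / n_tasks))
}

## Assumed setup: 200 files on 8 workers.
## With one chunk per worker, the bar would stay at 0% until about
## this many files have been processed by one worker:
ceiling(200 / 8)   # 25

plan <- plan_tasks(n_files = 200, n_workers = 8)
plan$tasks           # 40
plan$files_per_task  # 5

## Register the setting only if BiocParallel is installed
if (requireNamespace("BiocParallel", quietly = TRUE)) {
  BiocParallel::register(
    BiocParallel::SnowParam(workers = 8, tasks = plan$tasks,
                            progressbar = TRUE)
  )
}
```

With these assumed numbers the bar would update roughly every 5 files instead of every 25. Whether 5 tasks per worker is a good compromise depends on your data, since more tasks also means more communication overhead.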

Hope this helps.

cheers, jo

Re: No console messages xcmsSet (xcms v1.50.1)

Reply #3
Fantastic, thanks for getting back to me so quickly. I also read Martin Morgan's explanation, and it is now clear to me (although probably quite superficially) how BiocParallel works. :))