Centroiding of profile-mode DDA data, MS2-level

Hello developers/maintainers,

I would like to hear your advice on how to treat DDA (Data Dependent Acquisition) experiments that were acquired in profile mode.
It is clear from the very nice vignette (https://github.com/jorainer/metabolomics2018) how to do this at the MS1 level, but my question is how to proceed at the MS2 level. Should the MS2 level also be centroided, and how can this be achieved, including serialisation, via MSnbase/xcms?

Thanks for your help
Kind regards
Tony




Re: Centroiding of profile-mode DDA data, MS2-level

Reply #1
I'm no expert in MS2 data analysis, but I'd say that MS2 data should be centroided as well. Note also that if you do the centroiding with the pickPeaks function from MSnbase, you will by default centroid both the MS1 and MS2 spectra in your object.

Re: Centroiding of profile-mode DDA data, MS2-level

Reply #2
Hello johannes.rainer,

thanks for the advice and help. I tried the following code snippet, first for the pure MS1 runs and then for the DDA runs:
 
  setwd(pathInput)
  # Read the profile-mode data in on-disk mode (all MS levels)
  raw_file <- readMSData(files = file, pdata = NULL, msLevel. = NULL, centroided. = FALSE, smoothed. = FALSE, mode = "onDisk", verbose = TRUE)
  # Smooth each spectrum, then centroid with m/z refinement
  cent_file <- pickPeaks(smooth(raw_file, method = "SavitzkyGolay", halfWindowSize = 3),
                         refineMz = "descendPeak", signalPercentage = 33)
  setwd(pathOutput)
  # Export the processed data as mzML
  writeMSData(object = cent_file, outformat = "mzml", file = file, copy = TRUE, merge = FALSE, verbose = TRUE)
 
 
While this code snippet works well for all MS1-runs with no errors, all the DDA-runs stop with the following error:

Writing 1 mzml file.
Saving file xyzDDA.mzML...Error: fun(object@intensity, halfWindowSize = halfWindowSize, ...) : ‘halfWindowSize’ is too large!
In addition: There were 50 or more warnings (use warnings() to see the first 50)
1: In smooth_Spectrum(x, method = match.arg(method), halfWindowSize = halfWindowSize,  ... :
  Negative intensities generated. Replaced by zeros.
 
Peak widths are on average 6 seconds. When playing around with other halfWindowSize values (2, 3, 4, 5, 6), the value is reported as either too large or too small. Reading only the MS1 level via readMSData(msLevel. = 1, ...) would lose the MS2-level data completely after serialisation ...

Would there be a way for DDA runs to centroid only the MS1 level and disregard the MS2 level? Could this be a workaround?

thank you
kind regards
Tony

 

Re: Centroiding of profile-mode DDA data, MS2-level

Reply #3
Hi Tony,

so far there is no possibility to do the peak picking (centroiding) separately for each MS level (or to do that specifically on a single MS level). I've added an issue in MSnbase (https://github.com/lgatto/MSnbase/issues/478) and will work on that.

Meanwhile, could you please check if you can do the peak picking at all in MS2 (i.e. read only MS2 data, or use filterMsLevel to restrict the data to MS level 2 only and call pickPeaks on that data)?

cheers, jo

Re: Centroiding of profile-mode DDA data, MS2-level

Reply #4
Hi Johannes,

many thanks for your support.

I adapted the code as you suggested:

setwd(pathInput)
raw_file <- readMSData(files = file, pdata = NULL, msLevel. = NULL, centroided. = FALSE, smoothed. = FALSE, mode = "onDisk", verbose = TRUE)

Reading 7205 spectra from file xyz.mzML
object.size(raw_file)
2683024 bytes

raw_file <-  filterMsLevel(object = raw_file, msLevel. = 2)
object.size(raw_file)
1110312 bytes

cent_file <- pickPeaks(smooth(raw_file, method = "SavitzkyGolay", halfWindowSize = 2),    # halfWindowSize lowered to 2
                  refineMz = "descendPeak", signalPercentage = 33)
object.size(cent_file)
1115880 bytes   # (why?)

setwd(pathOutput)
writeMSData(object = cent_file, outformat = "mzml", file = file, copy = TRUE, merge = FALSE, verbose = TRUE)

Writing 1 mzml file.
Saving file xz.mzML...Error: fun(object@intensity, halfWindowSize = halfWindowSize, ...) : ‘halfWindowSize’ is too large!


If I lower it even further, to halfWindowSize = 1, I get:
Writing 1 mzml file.
Saving file xyz.mzML...Error in solve.default(t(X) %*% X) :
system is computationally singular: reciprocal condition number = 1.19379e-18
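
One possible reading of the solve.default error, sketched in base R under stated assumptions: a Savitzky-Golay filter fits a polynomial inside each window, and with a cubic (polynomial order 3, assumed here to be the default) a window of 2 * 1 + 1 = 3 points cannot determine the four coefficients, so t(X) %*% X is rank-deficient. The helper sg_design is hypothetical and not part of MSnbase.

```r
# Hypothetical helper: design matrix of a Savitzky-Golay fit, one row per
# window position k and one column per power k^0 ... k^polynomialOrder.
sg_design <- function(halfWindowSize, polynomialOrder = 3L) {
  k <- seq(-halfWindowSize, halfWindowSize)
  outer(k, 0:polynomialOrder, `^`)
}

# halfWindowSize = 1: 3 points cannot determine 4 coefficients.
X1 <- sg_design(1L)
rank1 <- qr(crossprod(X1))$rank   # crossprod(X) is t(X) %*% X

# halfWindowSize = 2: 5 points are enough for a cubic.
X2 <- sg_design(2L)
rank2 <- qr(crossprod(X2))$rank

# Inverting the rank-deficient matrix fails with a "singular" error.
inv1 <- tryCatch(solve(crossprod(X1)), error = function(e) e)
failed <- inherits(inv1, "error")
```

Under these assumptions, halfWindowSize = 2 is the smallest window for which the fit is well defined, which would match the two different errors reported for halfWindowSize = 1 and halfWindowSize >= 2.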

I'm not an expert, but might it make sense to do the smoothing only if the number of scans/data points in m/z is sufficiently large relative to halfWindowSize? Where there are not enough points, simply skip the centroiding (leave the spectrum as it is) and work only with those MS2 spectra that are really amenable to centroiding?
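
The guard proposed here can be sketched in base R. This is a minimal illustration, not MSnbase code: it assumes the filter needs at least 2 * halfWindowSize + 1 data points per spectrum, and both can_smooth and the point counts are made up for the example.

```r
# Assumed minimum number of points for a window of the given half size.
min_points <- function(halfWindowSize) 2L * halfWindowSize + 1L

# Hypothetical helper: TRUE for spectra with enough points to be smoothed.
can_smooth <- function(n_points, halfWindowSize) {
  n_points >= min_points(halfWindowSize)
}

# Made-up number of data points per spectrum (sparse MS2 scans included).
n_points <- c(1200L, 4L, 7L, 0L, 350L)

# Spectra flagged FALSE would be left as they are.
ok <- can_smooth(n_points, halfWindowSize = 3L)
```

With real data, the per-spectrum counts might be taken from the object itself (for example via peaksCount(), if that is available for the object class) and used to subset it before smoothing.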

thanks
Tony

Re: Centroiding of profile-mode DDA data, MS2-level

Reply #5
Thanks for the feedback - indeed we might have to check the code again. I've never centroided MS2 data so far.

The good news is that we've added the msLevel. parameter to the pickPeaks method. This means you could call pickPeaks on your object with msLevel. = 1L to perform the centroiding only on MS1 and keep the MS2 spectra as they are.

To use the new functionality you would however have to switch to the current Bioconductor development version:

library(BiocManager)
BiocManager::install(version = "3.10")
BiocManager::install()

devtools::install_github("lgatto/MSnbase")

cheers, jo

Re: Centroiding of profile-mode DDA data, MS2-level

Reply #6
Hi Johannes,

thumbs up. That is indeed a good solution so far.

thanks
Tony

Re: Centroiding of profile-mode DDA data, MS2-level

Reply #7
Hello johannes.rainer/developers,

I was now able to switch to the recent BioC devel (3.10), installed all dev-packages and retested the centroiding of the DDA samples again, with:

>   packageVersion("BiocManager")
[1] ‘1.30.4’
>   packageVersion("MSnbase")
[1] ‘2.11.6’
>   packageVersion("xcms")
[1] ‘3.7.2’

R-Version: 3.6.1, 64 bit, Windows

and used the new parameter "msLevel." with

cent_file <- pickPeaks(smooth(raw_file, method = "SavitzkyGolay", halfWindowSize = 3),
                       refineMz = "descendPeak", signalPercentage = 33, msLevel. = 1)
                 
but still getting:

Writing 1 mzml file.
Saving file xyz.mzML...Error: fun(object@intensity, halfWindowSize = halfWindowSize, ...) : ‘halfWindowSize’ is too large!
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
1: In smooth_Spectrum(x, method = match.arg(method), halfWindowSize = halfWindowSize,  ... :
  Negative intensities generated. Replaced by zeros.
2: In smooth_Spectrum(x, method = match.arg(method), halfWindowSize = halfWindowSize,  ... :
  Negative intensities generated. Replaced by zeros.
3: In smooth_Spectrum(x, method = match.arg(method), halfWindowSize = halfWindowSize,  ... :

It seems that restricting to msLevel. = 1 is unfortunately not (yet) a solution for centroiding the MS1 level of the DDA runs while leaving MS2 untouched. The instrument is an ABSciex Triple ToF 5600.

Thanks again for the work done on this so far, and for any suggestions/workarounds on how to cope with these DDA runs in the future.
                 
kind regards
Tony

Re: Centroiding of profile-mode DDA data, MS2-level

Reply #8
Hm, that's interesting. Could be that the error comes from smooth, not from pickPeaks. Could you try your code again with only pickPeaks (i.e. skip the smooth step)?

If so, we'll have to add the msLevel. parameter also to the smooth method -

cheers, jo
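
A toy illustration of why the error could surface only at "Saving file ..." and why restricting pickPeaks to MS1 would not help if smooth still touches every spectrum. It assumes that in on-disk mode the processing steps are queued and applied only when the data is actually read or written; all functions below are hypothetical base R stand-ins, not MSnbase code.

```r
# Toy spectra: one MS1 scan with many points, one sparse MS2 scan.
spectra <- list(
  list(msLevel = 1L, intensity = c(0, 2, 5, 9, 5, 2, 0, 1, 4, 1, 0)),
  list(msLevel = 2L, intensity = c(3, 7, 3))
)

# Smoothing stand-in: applied to every spectrum, regardless of MS level.
smooth_step <- function(sp, halfWindowSize = 3L) {
  if (2L * halfWindowSize + 1L > length(sp$intensity))
    stop("'halfWindowSize' is too large!")
  sp  # the actual smoothing is omitted in this sketch
}

# Picking stand-in restricted to one MS level; other levels pass through.
pick_step <- function(sp, msLevel = 1L) {
  if (sp$msLevel != msLevel) return(sp)
  y <- sp$intensity
  n <- length(y)
  mid <- y[2:(n - 1)]
  is_max <- c(FALSE, mid > y[1:(n - 2)] & mid > y[3:n], FALSE)
  sp$intensity <- y[is_max]  # keep local maxima only
  sp
}

# Apply the queued steps in order to each spectrum (e.g. on export).
realize <- function(spectra, queue) {
  lapply(spectra, function(sp) {
    Reduce(function(acc, step) step(acc), queue, sp)
  })
}

# Queuing both steps raises nothing; the error appears on realization,
# triggered by the sparse MS2 scan in the smoothing step.
res <- tryCatch(realize(spectra, list(smooth_step, pick_step)),
                error = function(e) conditionMessage(e))

# Without the smoothing step, MS1 is picked and MS2 is left untouched.
picked <- realize(spectra, list(pick_step))
```

If the real objects behave like this, the msLevel. restriction would have to apply to the smoothing step as well, or smoothing would have to be skipped, for the export to succeed.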

 

Re: Centroiding of profile-mode DDA data, MS2-level

Reply #9
Hi,

sure, I tried:
cent_file <- pickPeaks(x = raw_file, refineMz = "descendPeak", signalPercentage = 33, msLevel. = 1, verbose = TRUE)

I receive:
Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function ‘pickPeaks’ for signature ‘"missing"’

the raw file is a DDA-sample.
MSn experiment data ("OnDiskMSnExp")
Object size in memory: 2.67 Mb
- - - Spectra data - - -
 MS level(s): 1 2
 Number of spectra: 7205
 MSn retention times: 0:0 - 21:60 minutes
- - - Processing information - - -
Data loaded [Thu Sep 19 13:36:33 2019]
 MSnbase version: 2.11.6
- - - Meta data  - - -
phenoData
  rowNames:   xyz.mzML
  varLabels: sampleNames
  varMetadata: labelDescription
Loaded from:
  xyz.mzML
protocolData: none
featureData
  featureNames: F1.S0001 F1.S0002 ... F1.S7205 (7205 total)
  fvarLabels: fileIdx spIdx ... spectrum (35 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'

kind regards
Tony
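
A note on the pickPeaks error in Reply #9: the signature "missing" suggests that the argument used for method dispatch was never supplied. A likely cause, assuming the pickPeaks generic dispatches on an argument named object rather than x, is that x = raw_file was absorbed by "..." instead. The base R sketch below reproduces that behaviour with a made-up generic (pickDemo) and class (DemoExp); neither exists in MSnbase.

```r
library(methods)

# Toy class and generic; the generic dispatches on `object`, mirroring the
# assumption made about pickPeaks above.
setClass("DemoExp", representation(n = "integer"))
setGeneric("pickDemo", function(object, ...) standardGeneric("pickDemo"))
setMethod("pickDemo", "DemoExp", function(object, ...) "picked")

d <- new("DemoExp", n = 1L)

# Supplying `object` (by name or position) finds the DemoExp method.
ok <- pickDemo(object = d)

# `x = d` falls into `...`, so `object` is missing and dispatch fails.
res <- tryCatch(pickDemo(x = d), error = function(e) e)
dispatch_failed <- inherits(res, "error")
```

If this reading is right, passing the data positionally, as in pickPeaks(raw_file, ...), should get past the dispatch error.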