Hey there,
A couple of weeks ago I already had some issues with getEIC() (see http://www.metabolomics-forum.com/viewtopic.php?f=8&t=384), but since my new problem has nothing to do with that, I opened a new topic.
I have an xcmsSet (centWave peak-picked, obiwarp RT-corrected, grouped, and peaks filled) in which two peak groups differ in m/z by only about 0.0441 and have the same peak shapes but different abundances and apices:
> groups( xset )[98:99, ]
        mzmed    mzmin    mzmax rtmed rtmin rtmax npeaks B_Cal D_Cal
[1,] 115.0867 115.0867 115.0868 64.23 63.65 64.75      6     3     3
[2,] 115.1308 115.1293 115.1309 64.62 64.61 64.81      6     3     3
> pks <- peaks( xset )[ unlist( xset@groupidx[ 98:99 ] ), c(1:6,9) ]
> pks
            mz    mzmin    mzmax    rt rtmin rtmax       maxo
 [1,] 115.0868 115.0864 115.0870 63.66 52.62 75.72 631219.562
 [2,] 115.0867 115.0861 115.0870 64.75 52.65 75.22 505789.812
 [3,] 115.0867 115.0864 115.0868 64.62 52.62 75.66 433946.500
 [4,] 115.0868 115.0864 115.0869 63.65 48.54 78.77 513818.281
 [5,] 115.0868 115.0864 115.0869 63.84 53.44 75.68 453958.312
 [6,] 115.0867 115.0866 115.0868 64.62 52.56 75.66 397969.844
 [7,] 115.1308 115.1306 115.1310 64.62 59.64 68.64  19231.410
 [8,] 115.1309 115.1303 115.1311 64.75 54.72 71.49  14101.018
 [9,] 115.1293 115.1292 115.1294 64.62 59.64 69.66   9610.525
[10,] 115.1309 115.1308 115.1310 64.61 60.59 68.64  15007.605
[11,] 115.1308 115.1307 115.1309 64.81 59.08 69.64  12008.001
[12,] 115.1308 115.1306 115.1309 64.62 59.64 69.66  10699.456
However, getEIC apparently does not resolve those peaks:
plot( getEIC( xset, group = 98 ) )
plot( getEIC( xset, group = 99 ) )
[attachment: eics.png]
Even if I specify the mzrange explicitly, the mass traces are not resolved:
plot( getEIC( xset, mzrange= pks[ 1:6, 2:3 ] , rtrange = pks[ 1:6, 5:6 ] ) )
plot( getEIC( xset, mzrange= pks[ 7:12, 2:3 ] , rtrange = pks[ 1:6, 5:6 ] ) )
[attachment: eic2.png]
Am I doing something wrong, or is getEIC simply supposed to work like this?
Many thanks in advance,
Isam
Well, what kind of data is this, i.e. which instrument did you acquire it on? And what does it look like if you view the same data in the vendor software?
meow,
good point: the data were acquired on an HPLC-qTOF instrument (scan-to-scan accuracy around 30 ppm). In the vendor software (Agilent MassHunter) we can clearly see that these are two distinct peaks, the first having around 20-30 times higher intensity (as xcms correctly calculates). Most probably, the second peak is a result of instrumental detector ringing. I have attached the raw chromatographic data, extracted manually from an xcmsRaw:
[attachment: raw.png]
However, getEIC( xcmsSet, group ) as well as getEIC( xcmsSet, mzrange ) collapse the two (distinctly detected) mass traces into one. My question is how this can be avoided.
Many thanks,
Isam
As you noticed in your other posting, the step size ("step"), i.e. the bin size of the profile matrix, influences the result.
Which "step" size did you use here?
It should be small enough so that these features won't fall into the same bin.
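To illustrate why the step size matters here, the following is a minimal base-R sketch. It assumes a simplified binning model in which every m/z value is assigned to the nearest multiple of "step"; the helpers bin_index() and same_bin() are made up for this illustration and are not xcms functions, and the real profile-matrix code may differ in detail.

```r
# Simplified model of profile-matrix binning (an assumption, not xcms code):
# each m/z value is assigned to the nearest multiple of `step`.
bin_index <- function(mz, step) round(mz / step)

# Do all given m/z values end up in one and the same bin?
same_bin <- function(mz, step) {
  length(unique(bin_index(mz, step))) == 1
}

# mzmed of the two peak groups from the first post
mz <- c(group98 = 115.0867, group99 = 115.1308)

coarse <- same_bin(mz, step = 0.1)   # bin width larger than the 0.0441 gap
fine   <- same_bin(mz, step = 0.01)  # bin width smaller than the gap

print(c(step_0.1 = coarse, step_0.01 = fine))
```

Under this model both groups share a bin at a step of 0.1 and are separated at 0.01, which would be consistent with the collapsed EICs described above.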
An alternative would be to implement a "rawEIC" method for xcmsSet, i.e. not to use the profile matrix but the full raw data (rawEIC for xcmsRaw) for EIC generation.
I think this might be nice to have anyway, especially in combination with centWave, so I'll add this to my personal to-do list.
Ralf
Sorry Ralf,
obviously I misunderstood it again. Since I applied centWave, I never specified a step size and did not expect one to be set and used internally. Looking at the implementation of getEIC( xcmsSet ), I now see what happens.
My concern is also how this influences the subsequent workflow in CAMERA. I could imagine that groupCorr makes heavy use of getEIC.
I have already implemented such a rawEIC method halfway. Should I submit it somewhere after polishing?
Cheers,
Isam
As a follow-up:
For the implementation of the "rawEIC" function for xcmsSet I am using the m/z ranges given in the peak table. However, I just realized that mzmin and mzmax do not specify the borders of my peaks exactly. findmzROI finds the m/z range perfectly, but later in the implementation of findPeaks.centWave the m/z range is narrowed depending on the scale that was found. Maybe someone could explain why this is done?
The problem is that many EICs look like badly disrupted zigzag curves when signals within the peak have an m/z outside of this range. So how can I determine (or at least approximate) the true m/z range of the peaks from the peak table?
Many thanks
Isam
A ROI can contain multiple features. centWave makes an effort to assign both retention time and m/z range precisely for each individual feature found within one ROI. The m/z center of a feature is calculated using the m/z values of the centroids within this range, so it is trying to get that m/z range as narrow as possible.
It is possible that there is a problem with numeric precision (http://www.parashift.com/c++-faq/floating-point-arith.html) when you try to extract the EICs using the exact boundary values. You might want to try adding a little bit of slack, like +/- 0.0001. But please let me know if you find an example where some centroids that you think belong to a feature are located outside the given m/z range. (By the way, the equivalent to rawEIC is rawMZ for the m/z domain, e.g. mz <- rawMZ(object,mzrange=mzrange,scanrange=scrange) )
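As a rough illustration of the slack idea, here is a small base-R sketch on made-up centroid data. The helper raw_eic() is hypothetical (it is not the xcms rawEIC method) and the toy m/z and intensity values are invented; only the mzmin/mzmax window is taken from a peak table in this thread. It assumes one centroid per scan and takes the per-scan maximum intensity inside the window.

```r
# Toy centroid data (invented for illustration): one centroid per scan.
scan      <- 1:7
mz        <- c(162.0760, 162.0763, 162.0766, 162.0768, 162.0771, 162.0764, 162.0761)
intensity <- c(2000, 9000, 25000, 29000, 18000, 8000, 1500)

# Hypothetical helper, not part of xcms: per-scan maximum intensity of all
# centroids inside [mzrange[1] - slack, mzrange[2] + slack]; 0 if none match.
raw_eic <- function(scan, mz, intensity, mzrange, slack = 0) {
  keep <- mz >= mzrange[1] - slack & mz <= mzrange[2] + slack
  vapply(sort(unique(scan)), function(s) {
    hit <- keep & scan == s
    if (any(hit)) max(intensity[hit]) else 0
  }, numeric(1))
}

mzrange   <- c(162.0762, 162.0769)  # mzmin/mzmax as reported in a peak table
eic_exact <- raw_eic(scan, mz, intensity, mzrange)
eic_slack <- raw_eic(scan, mz, intensity, mzrange, slack = 3e-4)

print(rbind(eic_exact, eic_slack))
```

With the exact window, scans whose centroid lies just outside the range drop to zero, which gives the zigzag appearance; with the widened window those scans are recovered.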
Hey Ralf,
thanks for your reply. In most cases adding +/- 0.0001 is sufficient to recover the peak. But there are still cases where it is necessary to make the m/z window wider to extract the centroids belonging to the peak. Here are two examples:
[attachment: pk2040.png] +/- 3E-4 needed with
      mz    mzmin    mzmax    rt rtmin  rtmax    into     intb     maxo sn egauss mu sigma  h    f dppm scale scpos scmin scmax lmin lmax
162.0765 162.0762 162.0769 76.68 44.58 107.07 1002001 970706.5 29823.78 55     NA NA    NA NA 3961    1    10    75    65    85   43  105
[attachment: pk2089.png] more than +/- 25E-4 needed with
      mz    mzmin    mzmax     rt  rtmin  rtmax     into     intb     maxo sn egauss mu sigma  h    f dppm scale scpos scmin scmax lmin lmax
854.1456 854.1447 854.1467 146.85 133.55 171.06 494611.6 475216.1 19721.88 39     NA NA    NA NA 4135    1     6   140   134   146  127  171
As long as I need to extract the ion chromatograms just for plotting, that is fine and I know how to deal with it. But my biggest concern is whether any of the observations made in this thread negatively affect the generation of pseudospectra in CAMERA.
Thanks, Isam
These peaks look weird. At this relatively high intensity they should have a better shape. Is this HILIC or RP data? What kind of mass spec is this?
And what are your centWave parameters (ppm)?
Only the core region of your features seems to be within that 30 (?) ppm window; it looks pretty wild outside of that core region.
You don't want these m/z values to be included in the m/z calculation, but for a smooth-looking EIC you might want to open up the m/z window.
CAMERA uses getEIC for EIC extraction, so the results will depend on the profstep parameter of your xcmsSet.
Hey Ralf,
this is HILIC under HPLC conditions. The MS data are recorded by an Agilent qTOF/MS 6540. Currently, I call centWave with
ppm = 35, peakwidth = c(12, 300), snthresh = 10, prefilter = c(0,0), mzCenterFun = "wMean", integrate = 2, mzdiff = 0.001, fitgauss = FALSE, noise = 1000
I know that those peak shapes do not look really nice, but I'd estimate that 80% of my peaks look like that or even worse. However, the peaks themselves are extracted pretty well.
The core region of the first peak has an m/z width of 4 ppm; the entire peak, after widening by +/- 0.0003 Da, has a width of 8 ppm. The core region of the second peak has a width of only 2 ppm, but even after manually widening by +/- 0.0025 Da the width is still only 8 ppm. Plus, I plotted exactly the RT range indicated in the peak table. So even if the peaks are not really nice looking, why wouldn't I want to have the entire peak extracted and integrated? Please let me know what else I should look for or provide to better understand what is going wrong.
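For anyone who wants to check these ppm figures, here is a short base-R sketch. The helper ppm_width() is a made-up convenience function (not from xcms) that expresses the window width relative to the window centre; the mzmin/mzmax values and the widening amounts are the ones quoted in this thread.

```r
# Relative width of an m/z window in ppm, measured against the window centre.
# Hypothetical helper for illustration, not an xcms function.
ppm_width <- function(mzmin, mzmax) {
  (mzmax - mzmin) / ((mzmin + mzmax) / 2) * 1e6
}

# mzmin/mzmax of the two example peaks as given in the peak table
core <- rbind(pk2040 = c(mzmin = 162.0762, mzmax = 162.0769),
              pk2089 = c(mzmin = 854.1447, mzmax = 854.1467))

# Manual widening (in Da) that was needed on each side of the window
slack <- c(pk2040 = 3e-4, pk2089 = 25e-4)

core_ppm <- ppm_width(core[, "mzmin"], core[, "mzmax"])
wide_ppm <- ppm_width(core[, "mzmin"] - slack, core[, "mzmax"] + slack)

print(round(cbind(core_ppm, wide_ppm), 1))
```

Rounded to whole ppm, this gives about 4 and 2 ppm for the core regions and about 8 ppm for both widened windows.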
Ok! Which brings me back to the starting point of this and the other thread: I do not specify a profstep (which means the default of 0.1 is applied). Which value should I set the profstep to, then? And how reliable is the peak correlation in CAMERA, given what we already discussed in the other thread?