Topic: unreliable peak intensity in report table

unreliable peak intensity in report table

Hi there,
I'm facing an unresolved problem. We are using a Q Exactive Plus in dual polarity mode (+/-) at a mass resolution of 30,000. After conversion to mzXML, we found that the intensities of some features are reported erratically in the diffreport table (.tsv), whereas both the XCMS EIC and the raw EIC look excellent. See the figures below for an illustration.
I would really appreciate help, since we have tested many different scripts without any substantial improvement.
Thank you

raw xcalibur EIC
[attachment: xcalibur raw EIC.JPG]

XCMS EIC
[attachment: xcms EIX.JPG]

diffreport xcms Table values (.tsv)
[attachment: diffreport table EIC values.JPG]

Here is the script we tested:
library(xcms)
library(Biobase)
library(multtest)
xset<-xcmsSet(method="centWave", peakwidth=c(3,15), snthresh=2.5, mzdiff=-0.005, ppm=2, prefilter=c(3,1500000), noise=10000, integrate=1)
xset
xset1<-retcor(xset, method="obiwarp", profStep=0.1, plottype="deviation")
xset1
xset2<-group(xset1, method="density", bw=10, mzwid=0.01, minfrac=0.75, minsamp=1)
xset2
xset3<-retcor(xset2, method="obiwarp", profStep=0.1, plottype="deviation")
xset3
xset4<-group(xset3, method="density", bw=2, mzwid=0.01, minfrac=0.75, minsamp=1)
xset4
xset5<-fillPeaks(xset4)
xset5
# Get the XCMS results report
reporttab<-diffreport(xset5, "ctrl", "pool", "StdMixSwitchPosNeg_20150414", 5000, metlin=0, h=480, w=640)
library(CAMERA)
diffreport <- annotateDiffreport(xset5)
write.csv(diffreport, file="StdMixSwitchPosNeg_20150414_diffreport.csv")

[attachment deleted by admin]

Re: unreliable peak intensity in report table

Reply #1
Hi. The first thing I would check is whether the peak is detected in all samples prior to gap-filling. I never use diffreport, but I think you can just run it on the grouped object from before fillPeaks (xset4 in your script) and see if some of the samples have zero intensity.
Blog: stanstrup.github.io
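
The check suggested above can also be done without diffreport. The following is a minimal sketch, not part of the original reply: it assumes a features x samples matrix of peak areas in which undetected peaks are NA, which is what groupval() in xcms is expected to return for a grouped xcmsSet that has not been gap-filled. The helper name count_missing and the toy values are made up for illustration.

```r
# Count, for each feature (row), the samples (columns) with no detected peak.
count_missing <- function(into) rowSums(is.na(into))

# Toy features x samples matrix (made-up areas). With real data it could come
# from groupval(xs, value = "into"), xs being the grouped xcmsSet before fillPeaks().
into <- matrix(
  c(1.0e6, 1.1e6,    NA,
    2.0e5, 2.1e5, 1.9e5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("feature1", "feature2"), c("s1", "s2", "s3"))
)

missing_per_feature <- count_missing(into)
print(missing_per_feature)
```

Features with a non-zero count are the ones whose reported intensity depends on gap-filling in at least one sample.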

Re: unreliable peak intensity in report table

Reply #2
Hi Jan,
Yes, I checked that the peak is detected in all samples. In fact, I spiked plasma samples with 33 different standards to see whether xcms correctly extracted the corresponding ions. The peak picking step worked quite well, since it detected 32 out of 33. The problem appears when I compare the peaks extracted in Xcalibur with the peak areas in the report table, which are often erratic and do not match the Xcalibur values: half of the spiked molecules have a stable peak area from one sample to the next in the xcms report table, and the other half are simply unreliable, even though the xcms EIC pictures indicate a good extraction (see my previous message above for one example).
I should add that about half of the features in my report table are duplicated, sometimes six times, which seems unusual compared with what we used to obtain with our QTOF (almost no peak duplication).
I suspect a problem during integration, although I also compared with the peak maximum intensity (the maxo value), which did not improve our results either. Here is the last script I used:

Thanks again to anybody with a suggestion!

library(xcms)
library(Biobase)
library(multtest)
nslaves<- 8
xset<-xcmsSet(method="centWave", peakwidth=c(3,20), snthresh=3, mzdiff=-0.00005, ppm=2, prefilter=c(3,1500000), noise=10000, integrate=1,fitgauss=TRUE, nSlaves=nslaves)
xset
xset1<-retcor(xset, method="obiwarp", profStep=0.1, plottype="none")
xset1
xset2<-group(xset1, method="density", bw=3, mzwid=0.015, minfrac=0.75, minsamp=3)
xset2
xset3<-fillPeaks(xset2)
xset3
# Get the XCMS results report
reporttab<-diffreport(xset3,"ctrl","pool","StdMixSwitchPosNeg_20150414",5000,metlin=0,h=480,w=640)
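
The duplication described in this post can be quantified directly from the report table. This is a rough sketch under stated assumptions: the function name flag_near_duplicates is hypothetical, the tolerances are placeholders rather than recommendations, and the toy values are made up. The column names mzmed and rtmed follow the diffreport output.

```r
# Flag features that have at least one neighbour within both tolerances.
# mz_tol (Da) and rt_tol (s) are placeholder values, not recommendations.
flag_near_duplicates <- function(mz, rt, mz_tol = 0.02, rt_tol = 5) {
  vapply(seq_along(mz), function(i) {
    close <- abs(mz - mz[i]) <= mz_tol & abs(rt - rt[i]) <= rt_tol
    sum(close) > 1  # the feature always matches itself
  }, logical(1))
}

# Toy feature list (made-up values); a real one could use the mzmed and
# rtmed columns of the diffreport table.
features <- data.frame(
  mzmed = c(180.0634, 180.0751, 255.2330),
  rtmed = c(312.4, 313.1, 640.8)
)
features$near_duplicate <- flag_near_duplicates(features$mzmed, features$rtmed)
print(features)
```

Here the first two toy features are flagged because they differ by about 0.012 Da and less than one second, while the third has no close neighbour.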

Re: unreliable peak intensity in report table

Reply #3
OK I think you have several issues here.


  • The duplicated peaks. I believe that this is due to orbitrap shoulder/satellite peaks. If you look in the mass spectra of large peaks and zoom in heavily you should see small mass peaks on both sides of the real mass peak. This is characteristic of orbitrap and is an artifact of the Fourier transformation as far as I have been able to figure out. This means that these small peaks are picked too.
    To get useful results you just need to filter out these phantom peaks. I have written a function to do that, xcmsRaw.orbifilter. You can find it in my package here: https://github.com/stanstrup/chemhelper.
    Here is an example of how to use it to create a set of "fixed" files in parallel that you can use with xcms.
    Code:
    library(xcms)
    library(chemhelper)
    library(parallel)

    # collect the positive- and negative-mode input files
    files <- c(
                list.files("mzXML_POS_FS",pattern=".mzXML",full.names = TRUE),
                list.files("mzXML_NEG_FS",pattern=".mzXML",full.names = TRUE)
    )

    # build the output paths: .mzData extension, "_FS_fixed" folder names
    outnames <- sub(".mzXML", ".mzData", files, fixed = TRUE)
    outnames <- sub("_FS", "_FS_fixed", outnames, fixed = TRUE)

    # pair each input file with its output path, one list element per file
    input <- cbind(files,outnames)
    input <- split(input, 1:NROW(input))

    # read one file, filter out the shoulder peaks and write the result as mzData
    xraw_orbifix <- function(file,outname){
        require(xcms)
        require(chemhelper)

        xraw <- xcmsRaw(file,profstep = 0)

        xraw_out <- xcmsRaw.orbifilter(xraw,
                          windows_width=0.2,
                          max_rel_int = 0.2,
                          keep_isotopes=TRUE,
                          max_charge=5,
                          isotope_mz_tol = 0.005)

        write.mzdata(xraw_out,filename=outname)
    }

    # now make a cluster and convert in parallel
    cl <- makeCluster(detectCores())
    clusterExport(cl,c("input","xraw_orbifix"))
    parLapply(cl,input,function(x) xraw_orbifix(x[1],x[2]))
    stopCluster(cl)

  • If the above alone doesn't fix your problem, I see a couple of things that could be the cause. It is probably either the peak picking or the grouping.
    I had some samples the other day where I could not get centWave to give reasonable results, so you might also want to try matchedFilter instead. ppm=2 also seems low, even for an orbitrap (remember that the tolerance has to hold even at the edges of your peak).
    It could also be an issue with group. I find my function analyze.xcms.group from the above package useful for understanding such problems (example of the output here).
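
To make the remark about ppm=2 concrete, here is a small worked conversion from a ppm tolerance to an absolute m/z window. The helper name ppm_to_da and the example m/z values are chosen only for illustration.

```r
# Convert a relative tolerance in ppm to an absolute m/z tolerance in Da.
ppm_to_da <- function(mz, ppm) mz * ppm / 1e6

# Allowed deviation for ppm = 2 at three example m/z values
tolerances <- ppm_to_da(c(150, 400, 800), ppm = 2)
print(tolerances)
# 0.0003 0.0008 0.0016
```

At m/z 400 the centroids of consecutive scans would have to agree within 0.0008 Da, which is a tight window for the low-intensity scans at the start and end of a chromatographic peak.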
     


If you need more help you probably need to provide some sample data that shows the issue.
Blog: stanstrup.github.io
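
The idea of removing shoulder peaks, described in the first bullet above, can be illustrated on a single centroided spectrum. This is only a sketch of the general idea and not the implementation of xcmsRaw.orbifilter: the function name filter_satellites, the interpretation of the window as plus or minus half its width, and the toy spectrum are all assumptions, and isotope handling is left out entirely.

```r
# Drop small peaks that sit close to a much larger peak in one spectrum.
# Assumption: a peak is a satellite if it lies within +/- window_width / 2
# of a kept peak and is weaker than max_rel_int times that peak.
filter_satellites <- function(mz, intensity, window_width = 0.2, max_rel_int = 0.2) {
  keep <- rep(TRUE, length(mz))
  # visit peaks from most to least intense so large peaks claim their window first
  for (i in order(intensity, decreasing = TRUE)) {
    if (!keep[i]) next
    near <- abs(mz - mz[i]) <= window_width / 2 & seq_along(mz) != i
    keep[near & intensity < max_rel_int * intensity[i]] <- FALSE
  }
  data.frame(mz = mz[keep], intensity = intensity[keep])
}

# Toy spectrum (made-up values): one large peak with two shoulders, plus a
# second peak about 1 Da away that lies outside the window.
spectrum_mz  <- c(300.0950, 300.1000, 300.1050, 301.1034)
spectrum_int <- c(2e4, 1e6, 3e4, 1.5e5)

filtered <- filter_satellites(spectrum_mz, spectrum_int)
print(filtered)
```

In this toy case the two shoulders at 300.0950 and 300.1050 are removed and the peaks at 300.1000 and 301.1034 are kept.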

Re: unreliable peak intensity in report table

Reply #4
Many thanks, it will help. I already tested matchedFilter and it worked pretty well, with a good peak intensity extraction, although peak duplication is still the same, but I will try to fix this with your script. I will post the results soon.
jc

Re: unreliable peak intensity in report table

Reply #5
Hi Jan,
Sorry, but I'm unable to install the packages (metshot and chemhelper), although I started from the beginning (https://github.com/stanstrup/chemhelper). I'm using R 3.1.2. Something goes wrong with Rcpp, even though I tried to install it manually from a downloaded zip file. Is there something I did wrong? Thanks, jc

Re: unreliable peak intensity in report table

Reply #6
OK, I fixed the installation of the other packages, but chemhelper still will not install. Is there any other link where I can find it? (I did not find any.) Thanks again
jc

Re: unreliable peak intensity in report table

Reply #7
Sorry for the many dependencies. I really need to split that package.

For the moment the easiest way to get it working is just to copy the functions into R. Only xcms should be needed. You can find the functions here: https://github.com/stanstrup/chemhelper ... i.filter.R
Blog: stanstrup.github.io

Re: unreliable peak intensity in report table

Reply #8
Thanks Jan,
just to be sure how it works: first I run orbifilter on the mzXML files, then I process the orbifilter-corrected files with xcms in the usual way. Is that right? Sorry to be so basic.
Thanks
jc

Re: unreliable peak intensity in report table

Reply #9
Yep that is correct. Hope it works out for you.
Blog: stanstrup.github.io

Re: unreliable peak intensity in report table

Reply #10
Hi Jan,
if you are not out enjoying the summer sun yet, I can let you enjoy my orbitrap problems :). I started running orbifilter on selected mzXML files and obtained fixed files in mzData format, which looked fine when viewed with mzmine, for instance. The problem is that processing the files with orbifilter takes a very long time (> 7 hrs per file!). The input mzXML files are roughly 163 MB each, reduced to 46 MB after fixing and conversion to mzData. My server is no Formula 1 car, but even on a more powerful machine I suspect the processing time would still be excessive.
What has your experience been in your own trials?
Again, many thanks
jc

Re: unreliable peak intensity in report table

Reply #11
Hi,

I think the easiest would be if you send me one of those files so I can try to understand why it is so slow (and whether the same is true on my machine). It sounds strange. The time-consuming part is simply reading and writing the file, and that should not take that long! You are not trying to do it over a network or something like that, which might indeed be very slow?


Jan
Blog: stanstrup.github.io
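
One way to see where the time goes is to time each step separately on a single local file. A minimal sketch, assuming each step is wrapped as a zero-argument function: the helper time_steps is hypothetical, and the stand-in bodies below are placeholders to be replaced with the real read, filter and write calls from the conversion script.

```r
# Time each step of a pipeline separately and return elapsed seconds per step.
time_steps <- function(steps) {
  vapply(steps, function(step) system.time(step())[["elapsed"]], numeric(1))
}

# Stand-ins only: swap in the real read, filter and write calls for one file.
steps <- list(
  read   = function() Sys.sleep(0.01),
  filter = function() sum(sqrt(seq_len(1e5))),
  write  = function() Sys.sleep(0.01)
)

timings <- time_steps(steps)
print(timings)
```

If the filter step dominates rather than reading and writing, the cause is more likely the data itself than the disk or the network.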

Re: unreliable peak intensity in report table

Reply #12
Thanks,
I'm going to provide you with some of the mzXML files I used. I also noticed that the last step crashed everything:
cl <- makeCluster(detectCores())
clusterExport(cl,c("input","xraw_orbifix"))
parLapply(cl,input,function(x) xraw_orbifix(x[1],x[2]))
stopCluster(cl)

Instead we used:
for (i in seq_along(files)){
xraw_orbifix(files[i],outnames[i])
}

but it's slow, zzzzz...

I just sent you sample files containing 4 runs in mzXML format via an FTP service; I sent the link to stanstrup@gmail.com, I hope that address is correct.
many thanks

jc

Re: unreliable peak intensity in report table

Reply #13
...and, to fully answer your question, the files I processed were stored locally on the same machine that runs the orbifilter script.

Re: unreliable peak intensity in report table

Reply #14
OK, the problem is that your files are in profile mode. That is the reason it is so slow (it runs through every m/z in each scan). The function is only meant for centroided data. Also, xcms (centWave in particular) is meant for centroided data, so that is probably the source of your troubles in the first place...
Blog: stanstrup.github.io
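
A quick way to tell the two modes apart is to look at the m/z spacing within a single scan: profile data contain dense runs of closely spaced points across each peak, while centroided data contain one m/z value per peak. The following is a rough heuristic sketch, not an xcms function. The name looks_like_profile, the thresholds and the toy m/z values are assumptions for illustration. Centroiding itself is usually done when converting the raw files, for example with the peakPicking filter of ProteoWizard msconvert.

```r
# Heuristic: call a scan "profile" if most neighbouring m/z values are very close.
# max_gap (Da) and min_fraction are assumed thresholds, not validated settings.
looks_like_profile <- function(mz, max_gap = 0.01, min_fraction = 0.5) {
  gaps <- diff(sort(mz))
  if (length(gaps) == 0) return(FALSE)
  mean(gaps < max_gap) >= min_fraction
}

# Toy scans (made-up values)
profile_mz  <- seq(300.000, 300.020, by = 0.001)       # dense points across one peak
centroid_mz <- c(150.0589, 300.0101, 301.0135, 455.2903) # one value per peak

is_profile  <- looks_like_profile(profile_mz)
is_centroid <- !looks_like_profile(centroid_mz)
print(c(profile_scan = is_profile, centroid_scan = is_centroid))
```

With real data, the m/z vector of one scan could be taken from the raw file and passed to the function before deciding whether the files need to be converted again.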