Metabolomics Society Forum

Software => R => XCMS => Topic started by: jcmartin on June 04, 2015, 03:13:46 AM

Title: unreliable peak intensity in report table
Post by: jcmartin on June 04, 2015, 03:13:46 AM
Hi there,
I'm facing an unresolved problem. We are using a Q Exactive Plus in dual polarity mode (+/-) at 30,000 mass resolution. After conversion to mzXML, we discovered that the intensities of some features are reported erratically in the diffreport table (.tsv), whereas both the XCMS EIC picture and the raw EIC look excellent. See the figures below for an illustration.
I would really appreciate help, since we have tested many different scripts without any substantial improvement.
Thank you

raw xcalibur EIC
[attachment: xcalibur raw EIC.JPG]

XCMS EIC
[attachment: xcms EIX.JPG]

diffreport xcms Table values (.tsv)
[attachment: diffreport table EIC values.JPG]

Hereafter is the script we tested:
library(xcms)
library(Biobase)
library(multtest)
xset<-xcmsSet(method="centWave", peakwidth=c(3,15), snthresh=2.5, mzdiff=-0.005, ppm=2, prefilter=c(3,1500000), noise=10000, integrate=1)
xset
xset1<-retcor(xset, method="obiwarp", profStep=0.1, plottype="deviation")
xset1
xset2<-group(xset1, method="density", bw=10, mzwid=0.01, minfrac=0.75, minsamp=1)
xset2
xset3<-retcor(xset2, method="obiwarp", profStep=0.1, plottype="deviation")
xset3
xset4<-group(xset3, method="density", bw=2, mzwid=0.01, minfrac=0.75, minsamp=1)
xset4
xset5<-fillPeaks(xset4)
xset5
#To obtain the XCMS results report#
reporttab<-diffreport(xset5, "ctrl", "pool", "StdMixSwitchPosNeg_20150414", 5000, metlin=0, h=480, w=640)
library(CAMERA)
diffreport <- annotateDiffreport(xset5)
write.csv(diffreport, file="StdMixSwitchPosNeg_20150414_diffreport.csv")

[attachment deleted by admin]
Title: Re: unreliable peak intensity in report table
Post by: Jan Stanstrup on June 05, 2015, 06:05:01 AM
Hi. The first thing I would check is whether the peak is detected in all samples prior to gap-filling. I never use diffreport, but I think you can just use it with the grouped object from before fillPeaks (xset4 in your script) and see if some of the samples have zero intensity.
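
A minimal sketch of that check, under assumptions: as far as I know, groupval() in xcms returns a feature-by-sample intensity matrix in which peaks that were not detected stay NA until fillPeaks is run. The helper name count_missing and the toy matrix are made up for illustration; with real data the matrix would come from something like groupval(xset4, value = "into"), xset4 being the grouped object before fillPeaks in the script above.

```r
# Hypothetical helper: for each feature (row), count the samples (columns)
# in which no peak was detected (NA) or the intensity is zero.
count_missing <- function(into) {
  stopifnot(is.matrix(into))
  rowSums(is.na(into) | into == 0)
}

# Toy matrix standing in for groupval(xset4, value = "into");
# the numbers are invented, not taken from the thread.
toy <- matrix(c(1200,  NA, 900,
                 500, 480, 510),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("F1", "F2"), c("s1", "s2", "s3")))

missing_per_feature <- count_missing(toy)
print(missing_per_feature)
```

Features with a non-zero count are the ones whose reported values depend on gap-filling rather than on a detected peak.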
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on June 17, 2015, 07:04:21 AM
Hi Jan,
Yes, I checked that the peak is detected in all samples. In fact, I spiked plasma samples with 33 different standards to see whether xcms correctly extracted the corresponding ions. The peak picking step worked quite well, since it detected 32 out of 33. The problem occurred when I compared the peaks extracted in Xcalibur with the peak areas in the xcms report table, which are often erratic and do not match what Xcalibur shows: half of the spiked molecules had a stable peak area from one sample to the next in the xcms report table, and the other half is simply unreliable, even though the xcms EIC pictures indicate a good extraction (see my previous message above for one example).
I should add that about half of the features in my report table are duplicated, sometimes 6 times, which seems unusual compared with what we used to obtain with our QTOF (almost no peak duplication).
I suspect a problem during integration, although I also compared with the peak intensity (using maxo), which did not improve our results either. Here is the last script I used:

Thanks again to anybody with a suggestion!

library(xcms)
library(Biobase)
library(multtest)
nslaves<- 8
xset<-xcmsSet(method="centWave", peakwidth=c(3,20), snthresh=3, mzdiff=-0.00005, ppm=2, prefilter=c(3,1500000), noise=10000, integrate=1,fitgauss=TRUE, nSlaves=nslaves)
xset
xset1<-retcor(xset, method="obiwarp", profStep=0.1, plottype="none")
xset1
xset2<-group(xset1, method="density", bw=3, mzwid=0.015, minfrac=0.75, minsamp=3)
xset2
xset3<-fillPeaks(xset2)
xset3
#To obtain the XCMS results report#
reporttab<-diffreport(xset3,"ctrl","pool","StdMixSwitchPosNeg_20150414",5000,metlin=0,h=480,w=640)
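
The duplicated features mentioned above could be flagged with a sketch like the following. It is only an illustration in base R: the helper name find_duplicate_pairs, the 5 ppm and 5 s tolerances, and the toy m/z and retention time values are all assumptions, not values from this thread. With a real report table, the mz and rt vectors would be the median m/z and retention time columns.

```r
# Hypothetical helper: list pairs of features whose m/z agree within `ppm`
# and whose retention times agree within `rt_tol` seconds.
find_duplicate_pairs <- function(mz, rt, ppm = 5, rt_tol = 5) {
  stopifnot(length(mz) == length(rt))
  # pairwise m/z difference in ppm, relative to the smaller m/z of each pair
  mz_close <- abs(outer(mz, mz, "-")) / outer(mz, mz, pmin) * 1e6 <= ppm
  rt_close <- abs(outer(rt, rt, "-")) <= rt_tol
  # upper triangle only, so each pair is reported once and never with itself
  hits <- which(mz_close & rt_close & upper.tri(mz_close), arr.ind = TRUE)
  data.frame(i = hits[, "row"], j = hits[, "col"])
}

# Toy feature list (invented values): features 1 and 2 are near-duplicates.
mz <- c(180.0634, 180.0635, 255.2330)
rt <- c(120.0, 121.5, 300.0)

dups <- find_duplicate_pairs(mz, rt)
print(dups)
```

The pairwise comparison grows with the square of the number of features, so for very large tables it may be worth sorting by m/z and comparing neighbours only.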
Title: Re: unreliable peak intensity in report table
Post by: Jan Stanstrup on June 18, 2015, 04:50:24 AM
OK I think you have several issues here.




If you need more help you probably need to provide some sample data that shows the issue.
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on June 18, 2015, 09:35:42 AM
Many thanks, it will help. I already tested matchedFilter and it worked pretty well, with a good peak intensity extraction, although peak duplication is still the same, but I will try to fix this with your script. I will post the results soon.
jc
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on June 29, 2015, 11:05:30 AM
Hi Jan,
Sorry, but I'm unable to install the packages (metshot and chemhelper), although I started from the beginning (https://github.com/stanstrup/chemhelper). I'm using R 3.1.2. There is something wrong with Rcpp, although I tried to install it manually from a downloaded zip file. Is there something I did wrong? Thanks, jc
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on June 29, 2015, 11:42:13 AM
OK, I fixed the package installation, except for chemhelper: no way. Is there any other link where I can find it? (I did not find any.) Thanks again
jc
Title: Re: unreliable peak intensity in report table
Post by: Jan Stanstrup on June 30, 2015, 12:03:31 PM
Sorry for the many dependencies. I really need to split that package.

For the moment the easiest way to get it working is just to copy the functions into R. Only xcms should be needed. You can find the functions here: https://github.com/stanstrup/chemhelper/blob/master/R/orbi.filter.R
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on July 01, 2015, 03:06:13 AM
thanks jan,
just to be sure of how it works: first I run orbifilter on the mzXML files, then I process the orbifilter-corrected files with xcms the usual way. Is that right? Sorry to be so basic.
thanks
jc
Title: Re: unreliable peak intensity in report table
Post by: Jan Stanstrup on July 01, 2015, 09:40:53 AM
Yep that is correct. Hope it works out for you.
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on July 30, 2015, 05:06:15 AM
Hi Jan,
If you are not yet enjoying the summer sun, I can help you enjoy my Orbitrap problems :). I started running orbifilter on selected mzXML files and obtained fixed files in mzData format, which looked fine when viewed with MZmine, for instance. The problem is that it takes a very long time to process the files with orbifilter (> 7 hrs per file!). The input mzXML files are roughly 163 MB each, reduced to 46 MB after fixing and conversion to mzData. My server is no Formula 1 car, but even on a more powerful machine I suspect the processing time would still be excessive.
What has your experience been in your own trials?
Again, many thanks
jc
Title: Re: unreliable peak intensity in report table
Post by: Jan Stanstrup on July 30, 2015, 05:27:56 AM
Hi,

I think the easiest would be if you send me one of those files so I can try to understand why it should be so slow (and if the same is true on my machine). It sounds strange. The time-consuming part is simply reading and writing the file. Should not take that long! You are not trying to do it over a network or something like that that might indeed be very slow?


Jan
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on July 30, 2015, 10:09:14 AM
Thanks,
I'm going to provide you with some of the mzXML files I used. I also noticed that the last step crashed everything:
cl <- makeCluster(detectCores())
clusterExport(cl,c("input","xraw_orbifix"))
parLapply(cl,input,function(x) xraw_orbifix(x[1],x[2]))
stopCluster(cl)

instead we used:
for (i in seq_along(files)) {
  # process one file per iteration; assumes outname is a vector parallel to files
  xraw_orbifix(files[i], outname[i])
}

but it's slow 'zzzzz......

I just sent you a sample file containing 4 runs in mzXML format via an FTP service; I sent the link to stanstrup@gmail.com, I hope that's correct.
many thanks

jc
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on July 30, 2015, 10:13:58 AM
...and to fully answer your question: the files I processed are stored locally on the same machine that runs the orbifilter script.
Title: Re: unreliable peak intensity in report table
Post by: Jan Stanstrup on July 31, 2015, 03:10:06 AM
OK, the problem is that your files are in profile mode. That is the reason it is so slow (it runs through every m/z in each scan). The function is only meant for centroided data. centWave in xcms is also only meant for centroided data, so that is probably the source of your troubles in the first place...
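
A quick way to see what a converted file claims to be, sketched under assumptions: mzXML scan elements can carry an optional centroided attribute, and the hypothetical helper below (mzxml_declares_centroid, not part of xcms or chemhelper) only looks for that attribute in the first scan tag. A missing or wrong flag is possible, so this is a first hint rather than proof; looking at one spectrum is still worthwhile.

```r
# Hypothetical helper: TRUE if the first <scan> tag declares centroided data,
# FALSE if it does not, NA if no <scan> tag is found in the first lines.
mzxml_declares_centroid <- function(path, max_lines = 2000L) {
  head_lines <- readLines(path, n = max_lines, warn = FALSE)
  scan_start <- grep("<scan([[:space:]]|$)", head_lines)
  if (length(scan_start) == 0) return(NA)
  # the attributes of one tag may be spread over several lines
  last <- min(length(head_lines), scan_start[1] + 20L)
  chunk <- paste(head_lines[scan_start[1]:last], collapse = " ")
  chunk <- sub("^.*?<scan", "<scan", chunk, perl = TRUE)
  first_tag <- sub(">.*$", "", chunk)
  grepl('centroided="(1|true)"', first_tag)
}

# Tiny invented file, just to exercise the helper.
demo <- tempfile(fileext = ".mzXML")
writeLines(c('<msRun scanCount="1">',
             '  <scan num="1"',
             '        msLevel="1"',
             '        centroided="1"',
             '        peaksCount="3">',
             '  </scan>',
             '</msRun>'), demo)

declared <- mzxml_declares_centroid(demo)
print(declared)
```

Only the first few thousand lines are read, so the check stays fast even on files of several hundred MB.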
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on July 31, 2015, 03:42:58 AM
Mmmmhh, I see. We are using ProteoWizard to convert the raw data into mzXML with the peakPicking filter (I thought it converts profile data into centroided data?), like this:

converter <- "D:/Pwiz/msconvert.exe"
# "\\.raw$" escapes the dot and anchors the extension at the end of the name
FILES <- list.files(recursive=TRUE, full.names=TRUE, pattern="\\.raw$")

# One msconvert call per file and polarity; inner quotes are escaped for R
for (i in seq_along(FILES)) {
  system(paste(converter, "--mzXML --32 --filter \"peakPicking true 1\" --filter \"polarity negative\" -o Negative -v", shQuote(FILES[i])))
}
for (i in seq_along(FILES)) {
  system(paste(converter, "--mzXML --32 --filter \"peakPicking true 1\" --filter \"polarity positive\" -o Positive -v", shQuote(FILES[i])))
}

but obviously it's not correct

what do you think?
many thanks again
jc
Title: Re: unreliable peak intensity in report table
Post by: Jan Stanstrup on July 31, 2015, 04:29:11 AM
I don't know why that doesn't work... I tried re-converting your files and I cannot get it to centroid the data either... You'd probably have to ask the proteowizard people. Sorry.
Title: Re: unreliable peak intensity in report table
Post by: jcmartin on July 31, 2015, 07:23:10 AM
Jan
I just used a different version of ProteoWizard from the one we previously used to generate the files I'm working on, and this time it worked well, converting the data to centroid mode (12 MB instead of 163 MB, and m/z peaks with zero line width). The version of ProteoWizard to use would be 3.0.6994 or later.
I must admit that we should have looked at the size of the files generated by ProteoWizard.
I'll keep you posted on our progress.
By the way, I finally succeeded in installing chemhelper and all its dependencies.
Thanks anyway
jc
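
The file-size check mentioned above can be scripted. This is a rough sketch in base R: the helper name flag_large_files and the size threshold are assumptions and would need tuning per instrument and method (in this thread the profile files were about 163 MB and the centroided ones about 12 MB).

```r
# Hypothetical helper: report file sizes in MB and flag files above `max_mb`,
# which may indicate that centroiding did not happen during conversion.
flag_large_files <- function(paths, max_mb = 50) {
  size_mb <- file.info(paths)$size / 1024^2
  data.frame(file = basename(paths),
             size_mb = size_mb,
             suspicious = size_mb > max_mb,
             stringsAsFactors = FALSE)
}

# Two invented files of 100 and 50000 bytes, checked with a tiny threshold
# so that the example runs without real data.
small <- tempfile(fileext = ".mzXML")
large <- tempfile(fileext = ".mzXML")
writeBin(raw(100), small)
writeBin(raw(50000), large)

report <- flag_large_files(c(small, large), max_mb = 0.01)
print(report)
```

With real data, paths would come from something like list.files(pattern = "\\.mzXML$", full.names = TRUE).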