
Peak filtering

Is there a way to turn off the automatic filtering in which XCMS removes any peak that is present in fewer than half of the samples in every group? I'd like to get ALL the detected peaks, even those that were detected only once.

Thanks!
Laura S.

 

Re: Peak filtering

Reply #1
That can be adjusted in the group function with the following parameters:

Code:
minfrac	
minimum fraction of samples necessary in at least one of the sample groups for it to be a valid group

minsamp
minimum number of samples necessary in at least one of the sample groups for it to be a valid group
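
To make the interaction of the two thresholds concrete, here is a minimal sketch in plain R (no xcms needed). It assumes that a group is kept only when at least one sample class satisfies both minfrac and minsamp at the same time, which is my reading of the density method; `keep_group`, `found` and `class_size` are hypothetical names for illustration, not xcms functions, and the defaults shown (minfrac = 0.5, minsamp = 1) are the ones I understand the density method to use.

```r
# Hypothetical helper, NOT part of xcms: decides whether a feature group
# survives the filter, assuming both thresholds must hold in the same class.
keep_group <- function(found, class_size, minfrac = 0.5, minsamp = 1) {
  # found:      number of samples per class in which the peak was detected
  # class_size: total number of samples per class
  any(found >= class_size * minfrac & found >= minsamp)
}

# Two classes of 10 samples; the peak was detected once, in class A only.
found      <- c(A = 1, B = 0)
class_size <- c(A = 10, B = 10)

keep_group(found, class_size)                            # FALSE: 1 < 10 * 0.5
keep_group(found, class_size, minfrac = 0, minsamp = 1)  # TRUE
```

Under that assumption, lowering minsamp alone would not keep a peak that was found only once in a large class; minfrac would need to be lowered as well.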
Blog: stanstrup.github.io

Re: Peak filtering

Reply #2
Thank you! That worked. The code I used was:
group(xset, method="mzClust", minsamp=1, mzppm=5)

Laura

Re: Peak filtering

Reply #3
Just out of curiosity... What kind of data do you have? Any particular reason to use mzClust? That is meant for single spectrum data. Is that what you have?

Re: Peak filtering

Reply #4
Hi, Jan.

If by "single spectrum" data you mean single MS (as opposed to tandem MS) data, then yes, that's what I have. Sorry, I'm not familiar with the term "single spectrum". :) To collect my metabolomics data, I'm using an Agilent 6520 QToF in which the first mass analyzer is not selective and all the accurate mass info comes from the ToF. Would you recommend using the "density" or "nearest" methods instead of "mzClust" for this? I'm new to XCMS and I want to make sure that I'm using appropriate parameters. It would not surprise me if I've made some inappropriate choices, because somewhere along the way I've messed up my retention times: even though the retention-time-correction plot looks fine, every RT in the output is "-1". I had thought my mistake was somewhere other than in the group function, though.

I really appreciate your help!

Laura

Re: Peak filtering

Reply #5
I should have included the code I'm using. Sorry. Here it is. My data are in mzData format.

Code:
# Find all mzData files below the working directory (searched recursively).
Samples <- list.files(getwd(), pattern="mzdata.xml", full.names=FALSE, recursive=TRUE)

# Detect peaks in every file with centWave.
QCset <- xcmsSet(Samples, method = "centWave", snthresh = 10,
              ppm = 5, peakwidth = c(10,30),
              mzCenterFun = "wMean", integrate = 1, fitgauss = TRUE)

QCset

# Group peaks across samples, correct retention times, then regroup.
QCset.mzClust <- group(QCset, method="mzClust", minsamp=1, mzppm=5)
QCset.mzClust.2 <- retcor(QCset.mzClust, missing=0, extra=1, smooth="loess",
                          family="symmetric", plottype="mdevden")
QCset.mzClust.3 <- group(QCset.mzClust.2, method="mzClust", minsamp=1, mzppm=5)

# Fill in missing peaks, then compare the two sample classes.
QCset.mzClust.4 <- fillPeaks(QCset.mzClust.3)
QCset.5c1c <- diffreport(QCset.mzClust.4, "20110211", "20110223", "QCurine XCMS 5c1c", 20)

Laura

Re: Peak filtering

Reply #6
I believe mzClust is intended for data where the sample is infused directly into the MS, i.e. without any chromatography. I could imagine that is why the retention times get messed up. I would just use the default, "density".
A ppm of 5, as you are using, seems quite low. At least my instrument does not reliably perform that well.
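
To put the ppm setting in absolute terms, here is a small arithmetic sketch in plain R. `ppm_to_da` is a hypothetical helper, not an xcms function, and the m/z values are arbitrary examples rather than values from this data set.

```r
# Hypothetical helper, NOT part of xcms: converts a relative tolerance in ppm
# into an absolute m/z window in Da at a given m/z.
ppm_to_da <- function(mz, ppm) mz * ppm / 1e6

mz <- c(100, 400, 1000)   # example m/z values, chosen only for illustration

ppm_to_da(mz, ppm = 5)    # about 0.0005, 0.002 and 0.005 Da
ppm_to_da(mz, ppm = 15)   # about 0.0015, 0.006 and 0.015 Da
```

So at m/z 400 a 5 ppm tolerance leaves a window of only about 0.002 Da, which is tight unless the instrument holds that accuracy from scan to scan.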

Re: Peak filtering

Reply #7
Oh! Well, that would explain it! And I suppose that 5 ppm is a bit optimistic. That's at the better end of our mass accuracy. We probably get more like 5-10 ppm for most mass features, but sometimes lower. I tried again, this time using the density method. Here's the code I used:

Code:
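# Detect peaks with centWave, now with a wider ppm tolerance and narrower peak widths.
QCset5 <- xcmsSet(Samples, method = "centWave", snthresh = 10,
                  ppm = 15, peakwidth = c(6,18),
                  mzCenterFun = "wMean", integrate = 1, fitgauss = TRUE)

QCset5

# Group with the density method, correct retention times, then regroup.
# minfrac is not set here, so its default applies.
QCset5.grouped <- group(QCset5, method="density", minsamp=1, mzwid=0.004)
QCset5.RTcor <- retcor(QCset5.grouped, missing=0, extra=1, smooth="loess",
                          family="symmetric", plottype="mdevden")
QCset5.grouped.2 <- group(QCset5.RTcor, method="density", minsamp=1, mzwid=0.004)

# Fill in missing peaks, then compare the two sample classes.
QCset5.filledpeaks <- fillPeaks(QCset5.grouped.2)
QCset.5c1e <- diffreport(QCset5.filledpeaks, "20110211", "20110223", "QCurine XCMS 5c1e", 20)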

And it appears to have worked! Yay!

I'm still tweaking my parameters, but this was very, very helpful! THANKS!!!
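
One thing that may help with the tweaking: mzwid in the grouping step is an absolute width in m/z units, while ppm in centWave is relative, so the two scale differently across the mass range. A rough sketch in plain R, where `da_to_ppm` is a hypothetical helper (not an xcms function) and the m/z values are arbitrary examples:

```r
# Hypothetical helper, NOT part of xcms: expresses an absolute m/z width in Da
# as a relative width in ppm at a given m/z.
da_to_ppm <- function(mz, width_da) width_da / mz * 1e6

mz <- c(100, 400, 1000)   # example m/z values, chosen only for illustration

round(da_to_ppm(mz, width_da = 0.004), 1)   # 40, 10 and 4 ppm
```

In other words, a fixed 0.004 width is generous at low m/z and fairly narrow at high m/z when compared with a 15 ppm tolerance.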

Laura