When I use xcms to process 50 samples, I get ~4000 compounds for one particular dataset. When I use xcms to process 150 samples -- including the original 50 samples -- I get ~1500 compounds. What's going on? Why would more samples result in fewer compounds? This is particularly disconcerting for these data because the aligned data with 50 samples include a compound we're interested in and the aligned data with 150 samples do not.
Here's an example of my code:
# Collect all mzData files under the working directory
Samples <- list.files(getwd(), pattern="mzdata.xml", full.names=FALSE, recursive=TRUE)

# Peak detection with centWave, run separately on the first 50 and the remaining 100 samples
xs1 <- xcmsSet(Samples[1:50], method="centWave", ppm=15, peakwidth=c(4,12),
               snthresh=5, mzCenterFun="apex", prefilter=c(5,500),
               integrate=1, fitgauss=TRUE)
xs2 <- xcmsSet(Samples[51:150], method="centWave", ppm=15, peakwidth=c(4,12),
               snthresh=5, mzCenterFun="apex", prefilter=c(5,500),
               integrate=1, fitgauss=TRUE)

# Combine both sets, then group, correct retention times, and regroup
xset.grouped <- group(c(xs1, xs2), method="density", bw=4,
                      minsamp=1, mzwid=0.007, max=500)
xset.RTcor <- retcor(xset.grouped, method="peakgroups",
                     missing=20, extra=50, smooth="loess",
                     family="symmetric", plottype="none")
xset.grouped2 <- group(xset.RTcor, method="density", minsamp=1,
                       mzwid=0.007, bw=2, max=500)

# Fill in missing peaks and export the peak table
xset.filledpeaks <- fillPeaks(xset.grouped2)
xset.peaks <- peakTable(xset.filledpeaks, filebase="xset peak table")
If I align only xs1, I get more compounds than if I align xs1 and xs2 together.
Thanks for any help!
Laura