
Duplicate peaks despite grouping

I go through the regular XCMS workflow stages: peak identification (centWave) -> retention time correction (OBI-Warp) -> grouping (nearest) -> filling. However, when I use the resulting peak lists, I end up with a lot of duplicate peaks despite the grouping. This causes problems for the differential report, as well as for subsequent multivariate analysis (e.g. PCA).

I've been having this problem for some time now. On a previous data set I switched from density-based grouping to nearest grouping, which seemed to reduce the number of duplicate peaks to a handful. However, for a new data set I'm working on at the moment, nearest grouping results in hundreds of duplicate peaks as well.
I've tried to merge peaks manually, but some samples have different values for several of the duplicate peaks, so I don't know how to combine these values.

What causes these identical peaks to not be grouped together? Thank you for your help.

This is my relevant code:
Code:
# peak identification
set <- xcmsSet(files=rawfiles, method="centWave", ppm=30, peakwidth=c(10,60), prefilter=c(0,0), nSlaves=8)

####################################################################################################

sample.names <- sampnames(set)
class.label <- sampclass(set)
# shorten sample names: keep the part between the first "_" and ".mzdata"
for (r in seq_along(sample.names)) {
  start <- gregexpr(pattern="_", sample.names[r], fixed=TRUE)[[1]][1] + 1
  end <- gregexpr(pattern=".mzdata", sample.names[r], fixed=TRUE)[[1]][1] - 1
  sample.names[r] <- substr(sample.names[r], start, end)
}
sampnames(set) <- sample.names

####################################################################################################

# RT correction
corset <- set
pdf(paste(out.folder, "rt-cor.pdf", sep="/"))
corset <- retcor(corset, method="obiwarp", plottype="deviation", response=10, profStep=0.1, distFunc="cor_opt", gapInit=0.3, gapExtend=2.4)
dev.off()

# group corresponding peaks across samples
corset <- group(corset, method="nearest")

# fill missing peak values
fset <- fillPeaks(corset)

Session info:
Code:
> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.2.0 (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats    graphics  grDevices utils    datasets  methods  base   

other attached packages:
[1] gplots_2.14.1      xcms_1.41.0        Biobase_2.25.0      BiocGenerics_0.11.5 mzR_1.11.11        Rcpp_0.11.2       

loaded via a namespace (and not attached):
[1] KernSmooth_2.23-13 bitops_1.0-6      caTools_1.17.1    codetools_0.2-9    gdata_2.13.3      gtools_3.4.1      tools_3.1.0 

 

Re: Duplicate peaks despite grouping

Reply #1
What do your plots from retcor look like? Does the correction work OK? Which setup do you use: HPLC or UPLC? What is the runtime? Is it a Q-TOF or something else? What parameters did you use with group when using density? Can you give an example of peaks that are the same but were split?

I have written a function that might help you debug. It is called analyze.xcms.group and can be found in this package: https://github.com/stanstrup/chemhelper. For the moment it is undocumented. The package also has an unfortunate number of dependencies, so it is quite a bother to install. Sorry about that.


Code:
library(xcms)
library(faahKO)
library(chemhelper)  # provides analyze.xcms.group
filepath <- system.file("cdf", package = "faahKO")
xsg <- group(faahko)  # faahko: example xcmsSet shipped with faahKO
xsg <- fillPeaks(xsg)
analyze.xcms.group(xsg, mz = 256.1500, rt = 3451.305,
                   rt_tol_sample = 300, mz_tol_sample = 0.01,
                   rt_tol_group = 300, mz_tol_group = 0.05)

mz and rt correspond approximately to the peak of interest. rt_tol_sample and mz_tol_sample are tolerances for finding related peaks across all samples. rt_tol_group and mz_tol_group are tolerances to find related peak groups.
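
To make the idea of these tolerance windows concrete, here is a minimal base-R sketch of selecting peaks around a target m/z and rt. It is only an illustration of the windowing idea and does not reproduce the internals of analyze.xcms.group. The peak table, its values and the helper `in_window` are all made up for this example.

```r
# Hypothetical peak table; the values are invented for illustration only.
peaks <- data.frame(
  mz     = c(256.1498, 256.1503, 256.1720, 300.2000),
  rt     = c(3450.1, 3460.8, 3452.0, 3451.0),
  sample = c("ko15", "ko16", "ko15", "ko16")
)

# Hypothetical helper: TRUE for rows that lie within +/- mz_tol and
# +/- rt_tol of the target m/z and retention time.
in_window <- function(tbl, mz, rt, mz_tol, rt_tol) {
  abs(tbl$mz - mz) <= mz_tol & abs(tbl$rt - rt) <= rt_tol
}

# Same target and sample tolerances as in the call above.
hits <- peaks[in_window(peaks, mz = 256.1500, rt = 3451.305,
                        mz_tol = 0.01, rt_tol = 300), ]
print(hits)
```

With the m/z tolerance of 0.01 only the first two rows are selected. Widening it to 0.05 would also pull in the peak at 256.1720, which shows how the choice of tolerance decides which peaks are treated as related.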

Output plot from above example:
[Attachment: Rplot.png]

The rectangles show how the peaks from each sample were grouped. The plot might help you see whether the groups are clearly separated and whether your issue is in the rt or the mz dimension.
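
If you want a quick list of candidate duplicates before plotting, something along these lines might work as a rough sketch. It assumes a feature table with `mzmed` and `rtmed` columns, like the matrix that xcms `groups()` returns. The table below is made up, and `find_close_pairs` is a hypothetical helper, not part of xcms or chemhelper.

```r
# Made-up feature table; column names follow the mzmed/rtmed convention
# of the matrix returned by groups().
feat <- data.frame(
  mzmed = c(256.1500, 256.1504, 256.1501, 410.3000),
  rtmed = c(3451.3, 3452.0, 3600.0, 3451.0)
)

# Hypothetical helper: list every pair of features whose median m/z and
# median rt differ by no more than the given tolerances.
# Needs at least two rows; all pairs are compared, so it is slow for
# very large tables.
find_close_pairs <- function(tbl, mz_tol, rt_tol) {
  idx  <- t(combn(nrow(tbl), 2))  # one row per pair of feature indices
  d_mz <- abs(tbl$mzmed[idx[, 1]] - tbl$mzmed[idx[, 2]])
  d_rt <- abs(tbl$rtmed[idx[, 1]] - tbl$rtmed[idx[, 2]])
  keep <- d_mz <= mz_tol & d_rt <= rt_tol
  data.frame(i = idx[keep, 1], j = idx[keep, 2],
             d_mz = d_mz[keep], d_rt = d_rt[keep])
}

pairs <- find_close_pairs(feat, mz_tol = 0.005, rt_tol = 10)
print(pairs)
```

In this toy table only features 1 and 2 are flagged. Feature 3 has almost the same m/z but sits about 150 s later, so it is not reported with an rt tolerance of 10. Re-running with a wider rt or mz tolerance and seeing which pairs appear can hint at the dimension in which the split happens.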

Blog: stanstrup.github.io