Messages - Carsten

16
XCMS / Re: peak shape/symmetry?
Hi Corey,
if you use findPeaks with verbose.columns=TRUE and fitgauss=TRUE,
the peak table contains additional columns with information about the wavelet analysis
and the Gaussian parameters mu, sigma and h.
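
To make the three Gaussian columns concrete, here is a minimal sketch in base R that fits a Gaussian to a synthetic chromatographic peak with nls(). It is not the xcms code; the peak data, the noise and the starting values are made up for illustration.

```r
# Synthetic peak (made-up data): apex at 300 s, width 4 s, height 1000
rt  <- seq(280, 320, by = 0.5)
int <- 1000 * exp(-(rt - 300)^2 / (2 * 4^2)) + 5 * sin(rt)  # small deterministic noise

# Fit h * exp(-(rt - mu)^2 / (2 * sigma^2)); start values are rough guesses
fit <- nls(int ~ h * exp(-(rt - mu)^2 / (2 * sigma^2)),
           start = list(mu = rt[which.max(int)], sigma = 5, h = max(int)))

# mu = peak position, sigma = peak width, h = peak height
gauss <- coef(fit)
print(round(gauss, 2))
```

Comparing the fitted curve with the raw intensities is one simple way to judge the peak shape.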

Carsten
17
XCMS / Re: help interpreting result
The x-axis is the retention time in seconds, the title shows the m/z value, and N is the number of peaks that fall into that m/z bin.
Quote from: "osuct"
How can I tell "if the huge difference in rt means that on the same mz slide, two different peaks occurs within a short time" ?
The output of groups shows all information about the specific feature.
Quote
mzmed    mzmin    mzmax      rtmed    rtmin    rtmax    npeaks samples
90.52499  90.50587  90.53768 566.7325 493.651 571.501    602    434
So this feature contains 602 peaks from 434 samples, with mz values between 90.50587 and 90.53768 and the peaks occur between 493 and 571 seconds.

If you look at the figures and see two vertical lines of points, at 493 seconds and 571 seconds, under one smooth Gaussian curve,
then the bw parameter was too high.
The cluster at ~580 shows little deviation, which should be normal for a UPLC,
because the retention times are very stable.

As you can also see in the second plot, the kernel widths are much smaller, which should result in more feature groups.
Normally you should also see coloured, dotted vertical lines, which indicate the identified groups and help you to interpret the results.
So I assume you have no features with the mass of 81.52?

Another parameter you could optimize is mzwid, which is the width of those m/z slices.
The default of 0.25 m/z is quite large for a QTOF.
18
XCMS / Re: help interpreting result
Quote from: "osuct"
It looks like the program identified 19 peak groups. That means there are 19 analytes identified across multiple samples?
The first analyte is eluting at 566.7325 retention time (median) which has 602 peaks and it appears in 434 samples?
The group function is an alignment function:
it matches Peak X from Sample A to its corresponding Peak X in Sample B and so on, and puts the corresponding peaks into one feature group.
For the underlying method please check the xcms paper.

The xset@groups output gives you an overview of all detected features,
which are regions defined by an m/z range and a retention time range.
The "npeaks" column is the total number of peaks that fall into that range over all samples.
The "samples" column is the number of samples in which one or more peaks appear in that specific range. That is also the reason
why npeaks can be higher than the number of samples.
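
The difference between the two columns can be reproduced with a small base R sketch. The peak table below is invented; only the feature boundaries are taken from the groups row quoted earlier in this thread.

```r
# Made-up peak table: one row per detected peak
peaks <- data.frame(
  mz     = c(90.510, 90.520, 90.530, 90.525, 120.100),
  rt     = c(494.0, 566.0, 570.0, 571.0, 300.0),
  sample = c(1, 1, 2, 3, 3)
)

# Feature boundaries (m/z range and retention time range)
feature <- c(mzmin = 90.50587, mzmax = 90.53768, rtmin = 493.651, rtmax = 571.501)

# Which peaks fall inside the feature?
in.feature <- peaks$mz >= feature[["mzmin"]] & peaks$mz <= feature[["mzmax"]] &
              peaks$rt >= feature[["rtmin"]] & peaks$rt <= feature[["rtmax"]]

npeaks  <- sum(in.feature)                          # all peaks in the range
samples <- length(unique(peaks$sample[in.feature])) # samples with >= 1 peak
print(c(npeaks = npeaks, samples = samples))
```

Sample 1 contributes two peaks to the same range here, so npeaks is larger than samples.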

At this point of the analysis I would recommend optimizing your parameters (see ?group.density for a short description),
because the retention time range of your first feature is quite large compared to the second feature.
The default bw = 30 is meant for an HPLC setup, so for your UPLC a good starting point would be bw = 10.

You could also set sleep = 5 (5 seconds per feature), which produces a nice figure for each feature. There you get an overview of the detected feature and can see, for example, whether the large difference in rt means
that two different peaks occur within a short time on the same m/z slice.
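
Why a smaller bw splits such a feature can be illustrated with a simplified kernel density estimate in base R. This is only a sketch of the idea, not the xcms implementation; the retention times and the helper functions kde and count.maxima are made up.

```r
# Made-up retention times in one m/z slice: two clusters, 51 s apart
rt <- c(rep(520, 20), rep(571, 60))

# Plain Gaussian kernel density estimate, evaluated on a grid
kde <- function(x, rt, bw) {
  sapply(x, function(xi) mean(dnorm(xi, mean = rt, sd = bw)))
}

# Number of local maxima = number of groups this simple approach would find
count.maxima <- function(d) sum(diff(sign(diff(d))) == -2)

grid    <- seq(400, 700, by = 0.5)
n.bw30  <- count.maxima(kde(grid, rt, bw = 30))  # wide kernel: clusters merge
n.bw10  <- count.maxima(kde(grid, rt, bw = 10))  # narrow kernel: clusters separate
print(c(bw30 = n.bw30, bw10 = n.bw10))
```

With the wide kernel both clusters sit under one smooth curve, while the narrow kernel gives each cluster its own maximum.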

Carsten
19
XCMS / Re: how to pick peaks on a selected RT range ?
So far only scanrange can be used, so you need to convert your retention time range into scan numbers.
To check the selected scanrange, you can access the scan times with:
Code:
#get scan time for scanrange = c(100,150)
xr <- xcmsRaw(filename)
xr@scantime[c(100,150)]
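
The conversion in the other direction, from a retention time window to a scanrange, could look like the following sketch. The scantime vector is synthetic and only stands in for xr@scantime; the window of 120 to 180 seconds is an arbitrary example.

```r
# Synthetic scan times in seconds (stand-in for xr@scantime)
scantime <- seq(0.5, by = 1.2, length.out = 500)

# Desired retention time window in seconds (example values)
rt.range <- c(120, 180)

# Scans whose acquisition time lies inside the window
scans     <- which(scantime >= rt.range[1] & scantime <= rt.range[2])
scanrange <- range(scans)

print(scanrange)            # first and last scan number
print(scantime[scanrange])  # check: times of those two scans
```

The resulting scanrange can then be passed to the peak picking step in place of the retention time window.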
21
XCMS / Re: Plot multiple EICs from the same sample
This is somewhat similar to the CAMERA plotEICs function.
As far as I know, you can't collapse all plots from the xcms plotEIC directly into one plot.
One solution could be to use the layout function, as Ralf suggested in another thread,
i.e. multiple plots on one page. This should work with the normal plotEIC, perhaps with sleep > 0, but I'm not sure.
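
As a rough sketch of the layout idea in base R, with two synthetic chromatograms standing in for real EICs (the data and the panel titles are made up):

```r
# Synthetic chromatograms standing in for two EICs (not real data)
rt   <- seq(2500, 2600, by = 1)
eic1 <- 800 * exp(-(rt - 2540)^2 / (2 * 5^2))
eic2 <- 300 * exp(-(rt - 2560)^2 / (2 * 8^2))

# Split the page into two rows; layout() returns the number of panels
n.panels <- layout(matrix(1:2, nrow = 2))

# Each plot() call fills the next panel
plot(rt, eic1, type = "l", col = "red",
     xlab = "Retention Time (s)", ylab = "Intensity", main = "EIC 1")
plot(rt, eic2, type = "l", col = "blue",
     xlab = "Retention Time (s)", ylab = "Intensity", main = "EIC 2")

# Back to a single panel per page
layout(1)
```

Each panel keeps its own intensity axis, which avoids the scaling problem of overlaying EICs with very different intensities.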

If you really want all EICs in one plot (be aware of differences in retention time and intensity), try this snippet.
It is an adaptation of the last example, and the plot quality may not be optimal  ;)

Code:
#load libraries
library(xcms)
library(faahKO)

#normal pre-processing
xs.grp  <- group(faahko)
xs.ret  <- retcor(xs.grp)
xs.grp2  <- group(xs.ret)
xs.fill <- fillPeaks(xs.grp2)

#for example: peak 1 and 2 from sample 1
#also sampleidx="ko15" is okay
xeic.raw <- getEIC(xs.fill, rt = "raw", groupidx= c(1,2),sampleidx=1)

#get retention time ranges
rt <- xeic.raw@rtrange
rt.min <- min(rt[,"rtmin"])
rt.max <- max(rt[,"rtmax"])

#get mean mzranges (for legend)
mzrange <- apply(xeic.raw@mzrange,1,mean)
#get max. intensities
maxint <- sapply(xeic.raw@eic[[1]], function(x) max(x[,"intensity"]))

#generate plot
plot(0, 0, type = "n", xlim = c(rt.min,rt.max), ylim = c(0, max(maxint)),
    xaxs = "i", xlab = "Retention Time",
    ylab = "Intensity", main = paste("Extracted Ion Chromatograms for ",
                                      "\nTime: From", round(rt.min,3), "to", round(rt.max,3)))

#make nice colors, change to number of peaks plotted
col <- c("red","blue")
for(i in seq(along=xeic.raw@eic[[1]])){
  points(xeic.raw@eic[[1]][[i]], type="l", col=col[i])
}
#make legend
legend("topright",col=col,legend=mzrange,lty=1)

Carsten
22
CAMERA / Re: Plot all EICs of a single isotope group
Hi Laura,

good question, because the output of getIsotopeCluster can't be mapped directly.
It would take a larger code snippet, so I prefer to make a small change in the function itself.
I will report back as soon as I'm finished.

Concerning the value argument:
xcms reports 3 different intensity values for each peak.
maxo - maximum peak intensity
into - integrated peak intensity
intb - integrated peak intensity (baseline corrected)

The choice of intensity value is important for the detection in findIsotopes (for calculating the C12/C13 threshold).
Within getIsotopeCluster it only changes the reported intensity value.

In our data sets the intb values work best, mainly for peaks with low intensity.
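
How the three values relate can be sketched in base R on a synthetic peak. This is a simplified illustration, not the xcms calculation: the flat baseline of 50 is simply assumed to be known, and the integration is a plain sum times the scan spacing.

```r
# Synthetic peak on a flat baseline (made-up numbers)
rt       <- seq(100, 130, by = 1)
baseline <- 50
int      <- 1000 * exp(-(rt - 115)^2 / (2 * 3^2)) + baseline

scan.width <- mean(diff(rt))  # average time between scans

maxo <- max(int)                           # maximum peak intensity
into <- scan.width * sum(int)              # integrated peak intensity
intb <- scan.width * sum(int - baseline)   # integrated, baseline corrected

print(round(c(maxo = maxo, into = into, intb = intb), 1))
```

For a low-intensity peak the baseline makes up a large share of into, which is one plausible reason why intb behaves better there.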

Best,
Carsten
23
XCMS / Re: How do I plot the EIC from an xcmsRaw object?
Quote from: "LauraShireman"
Aw, bummer. In the pdf help file, it lists "scanrange" as one of the parameters, but I see that "scanrange" is missing when I do as you suggest and type ?xcmsRaw in R. I was hoping I could limit the scan range so that I could look at more than one sample at a time. Currently, each xcmsRaw object is so large that I can only load one at a time into the working memory of my PC.

The xcmsRaw object is too large? Hmm, you could try setting profstep=0. This way no profile matrix is generated.
That should save some memory, and centWave works perfectly without it.
But every function that depends on the profile matrix certainly will not!

Another way could be to split the xcmsRaw file, depending on your setup, for example if the file contains MS2.
Could you provide me with some more details if the above doesn't help?
 
Carsten
24
XCMS / Re: How do I plot the EIC from an xcmsRaw object?
Hi Laura,
this doesn't work, because scanrange is not a parameter of the xcmsRaw constructor. ("Unused argument" means the argument doesn't exist in the function definition, see ?xcmsRaw)

In general, you read the complete sample, and the subsequent functions
like getEIC or findPeaks.centWave can use a subset.

Carsten
25
XCMS / Re: How do I plot the EIC from an xcmsRaw object?
Hi Laura,

I'm not 100% sure and I'm not in the office to check, but I think the getEIC function needs a cbind.
Code:
QC1.eic <- getEIC(QC1.raw, mzrange=cbind(512.0, 512.4), rtrange=cbind(705,740))
That should work.
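
What cbind changes can be checked in base R without xcms. The sketch assumes that the ranges are expected as matrices with one row per range; the second m/z window is a made-up example.

```r
# cbind() on two numbers gives a 1 x 2 matrix: one row = one range
mzrange <- cbind(512.0, 512.4)
rtrange <- cbind(705, 740)
print(dim(mzrange))

# A plain vector is not a matrix
print(is.matrix(c(512.0, 512.4)))

# Several windows at once: one row per range (second window is hypothetical)
mzranges <- rbind(c(512.0, 512.4),
                  c(300.1, 300.3))
print(dim(mzranges))
```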

Carsten
26
CAMERA / Re: Way too many ions being assigned to the same compound
Hi Laura,

as Jan already pointed out, CAMERA uses multiple pieces of information to decide whether peaks
within a short retention time window originate from different co-eluting substances or from the same substance. Those peaks can be adducts, clusters, isotopes and fragments.
For example, on our QToF system we observe a lot of in-source fragments.

If you have only a single-sample experiment, as in your case, you can only use correlation based on peak shape similarity (short: groupCiS).
The groupCorr function, which is a wrapper function for all underlying grouping functions, automatically recognizes this.

So in short, only those peaks that share a high peak shape similarity stay together.
But there can be cases, as Jan mentioned, where two compounds have a perfect correlation. I just added one example from our data.
Here we have 2 substances (red, blue) that co-elute perfectly, even in peak shape.
[Attachment: Bsp5.png]
But in that case we were lucky, and CAMERA was able to assign them to two different pseudo-molecular ion groups afterwards.

If you went directly to only the annotated peaks, important peaks could be sorted out.
For example, we have a high-abundance fragment peak with different adducts, like [F+H]+ and [F+Na]+, but only a small [M+H]+ with no isotopes or adducts. If the mass difference between M and F is
not in your rule set, the two would be separated. But the peak shape analysis suggests a correlation between them.

So we think that high correlation is mandatory and adduct annotation helps in further interpretation.
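
The idea of peak shape correlation can be sketched in base R with synthetic EICs. This is only an illustration using a plain Pearson correlation, not the CAMERA implementation; all peak parameters and the helper function peak are made up.

```r
# Helper for a synthetic Gaussian-shaped EIC (hypothetical, for illustration)
peak <- function(rt, apex, sigma, height) {
  height * exp(-(rt - apex)^2 / (2 * sigma^2))
}

rt       <- seq(300, 340, by = 0.5)
parent   <- peak(rt, apex = 320, sigma = 3, height = 1000)
fragment <- 0.3 * parent                                  # same substance, lower intensity
other    <- peak(rt, apex = 326, sigma = 3, height = 800) # different substance, 6 s later

cor.same  <- cor(parent, fragment)  # identical shape
cor.other <- cor(parent, other)     # shifted apex
print(round(c(same = cor.same, other = cor.other), 3))
```

A fragment follows the shape of its parent ion regardless of intensity, while a neighbouring compound with a shifted apex correlates much worse.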

Cheers,
Carsten

27
XCMS / Re: Undesired filtering somewhere
Hi Laura,

the centWave algorithm searches, in short, for m/z signals occurring in consecutive scans within a specific m/z error.
The min/max number of necessary scans is calculated from the peakwidth parameter.
The m/z error is the combination of the mzwid and ppm parameters.

At the peak apex the m/z error is certainly within the 10 ppm range you mentioned, but at the peak borders,
which are normally at low intensities, the mass accuracy is worse. This also applies to low-abundance peaks.
That is the reason for choosing higher ppm values.

To get an impression of your data, you can look at the @peaks slot or the general peak list, where for each peak, besides the mz value (which is calculated at the peak apex),
the mzmin and mzmax values are also reported.
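
To turn such a row into ppm values, a small base R sketch follows. The peak values and the helper function ppm.error are made up for illustration.

```r
# Deviation of an observed m/z from a reference m/z in ppm
ppm.error <- function(mz.observed, mz.reference) {
  1e6 * (mz.observed - mz.reference) / mz.reference
}

# Hypothetical peak: apex m/z plus the m/z range over the whole peak
pk <- c(mz = 512.2000, mzmin = 512.1930, mzmax = 512.2080)

# Deviation of the peak borders from the apex value
spread <- ppm.error(pk[c("mzmin", "mzmax")], pk[["mz"]])
print(round(spread, 2))
```

In this invented example both borders deviate by more than 10 ppm from the apex value, so a ppm setting of 10 would be too strict for this peak.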
 
Cheers,

Carsten
28
XCMS / Re: Not able to read a write.cdf() written NetCDF file
As far as I know, the cdf file written by write.cdf() can be read again only by xcms itself.
Other programs like AMDIS or OpenMS fail, because they have additional requirements for the generated cdf.

I'm not sure whether there has been any progress in xcms development since the last time I checked.
29
XCMS / Re: centWave warning of not centroid mode data
Quote from: "bowenli37"
In .local(object, ...) :
It looks like this file is in profile mode. centWave can process only centroid mode data !

This is just a warning from a heuristic function that detects profile data.
It can be ignored if your samples are in centroid mode.

Quote
No peak groups found for retention time correction

This can happen with very large sample sets, although I would expect that 2990 features should be enough to find at least one.
Are the (default) group parameters suitable for your setup?

You could also try the obiwarp method,
Code:
retcor(xset, method="obiwarp")  # xset: your xcmsSet object (name assumed)
which does not require specific peak groups.

Carsten
30
XCMS / Re: Vicinity elimination postprocessing
I think this applies to the older "matchedFilter" algorithm.
Within "centWave" this value is much lower, and this should occur very rarely.
Have a look at the man page with:
Code:
?findPeaks.centWave

The mzdiff argument is what you are looking for.
But I'm not 100% sure whether the default of -0.001 means that no peak is removed.
I will check it.

Carsten