Skip to main content

Show Posts

This section allows you to view all Show Posts made by this member. Note that you can only see Show Posts made in areas you currently have access to.

Messages - Carsten

1
XCMS / Re: group or peakTable problem
I also observe that group.nearest produces more feature groups, but more doesn't mean "better"
But I haven't had the time to really compare both methods, so normally I stick to density, because I'm more familiar with it.

I spoke with Ralf about the group.density problem and we removed the threshold in the next xcms version,
because it's unnecessary.

That should also fix your problem.

Carsten
2
XCMS / Re: group or peakTable problem
After some debugging I think I have an idea, where the problem lies.
The group.density function bins all m/z values (meaning all m/z values over the whole retention time) and then loops over these bins.
It defines a number of peaks which has to occur in that bin, otherwise it loops into the next.
The value is defined as the minimum number of samples per class divided per two.
So for the faahKO data set this number is 3, because we have 6 samples per class.

If the number of peaks within that bin is higher than the threshold, the density function divides the m/z bin into different retention time groups.
For these groups the minfrac and minsamp values are then checked.

I have no idea why this threshold exist and if it can be safely removed.

Carsten
3
XCMS / Re: group or peakTable problem
Hmm, really strange, my first idea would also go into the direction of setting minfrac and minsamp to 0.
Could be a very rare bug  :?:

Perhaps you could try the other grouping method group.nearest and look if it performs better.

Cheers,
Carsten
4
Other / Re: Binning software
Quote from: "metaphase"
Hello to everyone!

What software/R package or script are you using to bin (bucket) data from LC-MS?

Feeding an xcmsRaw with a txt file could be a painful approach. If you only want to bin values have a look at the R function cut.
Here is a small example, which you could use as a starting point

Code: [Select]
#Generate some data
x <- rnorm(1000,0,10)
#sort data
x.sort <- sort(x)
#take a quick look
head(x.sort)
#[1] -31.44805 -29.43685 -28.45767 -28.18506 -26.27450 -25.92044
#generate bins with bin size of 0.1 over whole data range
bin <- seq(min(x)-0.1,max(x)+0.1,by=0.1)
#take a look at the bins
head(bin)
#[1] -31.54805 -31.44805 -31.34805 -31.24805 -31.14805 -31.04805
#So first bin starts at -31.54805 to -31.44805
binned.x=cut(x, breaks=bin, labels=FALSE)
head(binned.x)
#[1]  1 22 31 34 53 57
#As an results you retrieve bin indes, so first x values goes into bin of -31.54805 to -31.44805

Cheers,
Carsten
5
CAMERA / Re: how to plot EICs for multiple samples overlayed ?
It exists a method for plotting EICs from a xcmsSet.
You can use this methods for your problem, because the original xcmsSet object is stored within a xsAnnotate object.

So with xs <- object@xcmsSet (object is here the processed xsAnnotate object)
you can retrieve the original xcmsSet object.

Here with an example
Code: [Select]
library(xcms)
library(faahKO)
cdfpath <- system.file("cdf", package = "faahKO")
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)
xset <- xcmsSet(cdffiles)
xset <- group(xset)

xsa <- annotate(xset)
##Only for the case the xset is lost.
xset <- xsa@xcmsSet

##For plotting first fillPeaks your xcmsSet
xset.fill <- fillPeaks(xset)
##Get EICs
xeic <- getEIC(xset.fill, groupidx=1:nrow(xsa@groupInfo))
##Plot EICs for the first peak
plot(xeic, groupidx=1)
## For more information about this function see ?xcms::plot.xcmsEIC

Cheers,
Carsten
6
CAMERA / Re: how to plot EICs for multiple samples overlayed ?
Hi Tanus,

CAMERA has the plotEICs function, which plots the peak EICs for each peak group created by CAMERA.
But you want to plot for one peak the EICs from all its samples in one plot.
Did I understand you right?

Carsten
7
XCMS / Re: xcmsSet constructor and mslevel
Hey zeleniy,

findPeaks only works on xcmsRaw objects.

The arguments section in the documentation contains a mistake,
thanks for pointing that out.
In the methods section it's describe correctly.
Code: [Select]
Methods:
    object = "xcmsRaw"

By the way, if you use the xcmsSet constructor, your data is already processed by the default peak picker.
If you want to use the "centWave" peak picker, then change your code to

Code: [Select]
xset <- xcmsSet(
  files,
  snames = NULL,
  sclass = NULL,
  phenoData = NULL,
  profmethod = "bin",
  profparam = list(),
  polarity = "positive",
  mslevel = NULL,
  nSlaves = 0,
  progressCallback = NULL,
  method="centWave"
)
8
XCMS / Re: how to discard fillpeaks function
Hi Jc,

so do you need the diffreport function or are you fine with the peaktable generated by the script from Paul?

And for using CAMERA at which step if complains about the NA values?
The only function I could imagine would be annoteDiffreport, which uses the diffreport and therefore were are back at the start of this thread  ;)
But all other functions should work with NA values.

If you don't want to use the wrapper functions,
you find in every manpage some code examples for the "normal" script based analyis.
If you need help within any function, just write me a note.
Perhaps in the CAMERA sub forum that so that we not "pollute" this one  :) 

Carsten
9
XCMS / Re: how to discard fillpeaks function
Hi JC,

the diffreport function calculates the fold changes between two samples for every feature.
If in one sample the value is NA (meaning the feature could not be detected or the feature intensity was lower than your filtering settings),
then the fold change can't be calculated.
For those cases the fillPeaks function was written and works quite well.

Carsten
10
XCMS / Re: memory error during fillpeak step
Quote from: "Jan Stanstrup"
You can split you xcmsset: see ?split.xcmsSet
But the peak group info is discarded so it won't be trivial to put it back together again after re-grouping and filling. There is no logical way to divide this data for individual analysis?
I actually tried that yesterday but it turns out to be more tricky then I thought.
You need some adoption in findPeaks or a script which simply redo that step.
I will try again over the weekend and report back!

Carsten
11
XCMS / Re: problem with mzClust
Hi,

only in addition to Steffens post, I would recommend to use centWave for peak detection,
because it works much better on QTOF data.
So change to:
Code: [Select]
xset <- xcmsSet(method="centWave", snthresh=3, nSlaves=4)


So long,
Carsten
12
XCMS / Re: adjust decimal places of titles of EICs
Hi Koen,

I add a new parameter mzdec to the diffreport function, which is the number of decimals (defaults to 2) in the plot title.
Should be build on BioC (version 1.33.17) within the next 24 hours.

Carsten
13
XCMS / Re: help interpreting result
Quote from: "uwe.geppert"
I need help in the Interpretation of XCMS-Results, too.
For my analysis it was not possible to use the same amount of plant tissue in each sample.
Does XCMS compare absolute or relative amounts of substances?

To keep threads clean it's would be best to create a new one instead of posting here,
but unfortunately I have no moderation rights to move your posting, so lets keep it here.

The fold changes within the diffreport function in xcms assumes that all samples uses the same amount.
Paul Benton posted a script to adjust peak intensities to fresh weight in the xcms cookbock
http://http://metabolomics-forum.com/viewtopic.php?f=26&t=143&sid=9bf36eb94e0e4c467b23563e34d0488d

Although we can't guarantee a linear relationship between fresh weight and peak intensity,
so be careful when you check the results.

Carsten
14
XCMS / Re: help interpreting result
Sorry for the late answer, but I was on holiday the last few days. ;)

Quote from: "osuct"
So, points that line up represent peaks from multiple samples that have the same retention time, correct?
Exactly!

Quote from: "osuct"
The y-axis is labeled density. What density is this referring to?
This density is calculated with a "standard" gaussian kernel estimation.
By the way the bw parameter from the group function is the smoothing bandwidth for this kernel.

Quote from: "osuct"
I tried to change the mzwid parameter to 0.025 and got a lot more detected features (> 1000) but is there a diagnostic like the group plot to see whether this is too small or too big?
Not a general one, as far as I know. If I want to optimize my parameters I normally look at some specific features and their group plots,
also in how many cases the npeaks number is higher than the number of samples, ... 

But keep in mind that the mzwid parameter is unfortunately an absolute value, not a relative one.
So mzwid= 0.01 could be optimal for m/z 100 - 400, but for  400 - 1000 you need mzwid=0.025 and therefore it must be set to 0.025 in general.

Quote from: "osuct"
Also, for the following figure, does it mean two features (with different two different rt range) are detected because of the two dashed lines? or is it only 1 feature is detected with larger rt range? Why there is only 1 gaussian curve for these two line of points?
As you can see in you groups result, you have only one feature at m/z 76.95. The dashed lines marks here the left and the right end of the rt region, where the feature is defined.
So according to the feature, the left line is at 212.638 and the right at 223.605
        mzmed    mzmin    mzmax  rtmed  rtmin  rtmax npeaks 01andQC
[5,] 76.95368 76.95315 76.95496 213.623 212.638 223.605    265    239

The width of the gaussian curve depends certainly on the detected peaks but also on the previously mentioned bw parameter.
The kernels are scaled such that the bw parameter is the standard deviation of those kernels.
So if you know that your chromatography is very stable and the retention time for a feature doesn't change much even over hundreds of samples,
then you can decrease the bw parameter. This results in much more narrow curves, which would separate this feature here in two.

Carsten
15
metaXCMS / Re: Peak Data Insertion Problems-Please Try Lowering ppm Par
Hi Stacy,

the centWave algorithm has a two step feature detection framework.
The first step is the detection of connected centroids, which are called ROI (region of interest).
If within one scan the selection of a centroid is not distinct,
this could be caused by a too large search window (ppm parameter).

If you have not many warnings, you can safely ignore them.
If not, then you should lower your ppm parameter.

Carsten