Show Posts - LauraShireman


XCMS / Plot multiple EICs from the same sample

June 05, 2012, 04:34:47 PM

I see how to plot the extracted ion chromatograms for a single ion from multiple samples all in the same plot (getEIC and then plot that object), but how do you plot the extracted ion chromatograms for several ions from a single sample all in the same plot?

Laura

CAMERA / Plot all EICs of a single isotope group

June 05, 2012, 04:28:58 PM

After finding isotopes, is it possible to pick an isotope group and plot the EICs just from that isotope group for a specific sample? I don't want to plot an entire peak group; I would like to see just the EICs for all the ions that CAMERA has assigned to the same isotope group. I thought maybe getIsotopeCluster would be a step in the right direction, but I don't know what to do after that.

On that topic, when getIsotopeCluster or any of the other commands in CAMERA ask for "value" or "intval", what do "maxo", "into" and "intb" refer to?

Thanks!

Laura

XCMS / Re: How do I plot the EIC from an xcmsRaw object?

June 05, 2012, 11:14:42 AM

Thanks, Carsten. That worked perfectly. I was able to use this code to obtain the xcmsRaw objects for my 12 samples and then plot EICs for the m/z and RT ranges that I wanted and the MS plot, too, for whichever of my 12 samples I wanted. In this example, I used sample #2.

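Code:

# Load each of the 12 sample files as an xcmsRaw object.
Samples.raw <- list()
for (i in 1:12) {
  Samples.raw[[i]] <- xcmsRaw(Samples[i], profstep=0, profmethod="bin")
}
rm(i)  # remove the loop counter once the loop has finished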
mzRange <- cbind(512.0, 512.4)
RTRange <- cbind(705, 740)

EIC.2 <- getEIC(Samples.raw[[2]], mzrange=mzRange, rtrange=RTRange)
plot(EIC.2, mzrange=mzRange, rtrange=RTRange)
plotRaw(Samples.raw[[2]], mzrange=mzRange, rtrange=RTRange, log=FALSE)

THANKS!

Laura

XCMS / Re: How do I plot the EIC from an xcmsRaw object?

June 04, 2012, 11:02:09 PM

Aw, bummer. In the pdf help file, it lists "scanrange" as one of the parameters, but I see that "scanrange" is missing when I do as you suggest and type ?xcmsRaw in R. I was hoping I could limit the scan range so that I could look at more than one sample at a time. Currently, each xcmsRaw object is so large that I can only load one at a time into the working memory of my PC.

Thanks for the response, Carsten!

XCMS / Re: How do I plot the EIC from an xcmsRaw object?

June 01, 2012, 03:17:48 PM

Ok, one last question on a similar topic: How do I limit the scanrange when creating an xcmsRaw object? From what I've read, it seems to me that this code should work:

Code:

xset.raw.08 <- xcmsRaw(Sample08, profstep=0.01, profmethod="bin", scanrange=c(1100,1150))

but when I run that, I get an error saying, "unused argument(s) (scanrange = c(1100,1150))".

I tried cbind just in case the format was supposed to be a matrix, but that didn't help. And I don't think it is supposed to be a matrix anyway.

Laura

XCMS / Re: How do I plot the EIC from an xcmsRaw object?

May 31, 2012, 12:30:47 PM

THANKS, Carsten! That worked beautifully!

Here's the complete code I used in case anyone else had trouble with this:

Code:

MzRange <- cbind(512.0, 512.4)
RTRange <- cbind(705, 740)
QC1.raw <- xcmsRaw("QC1.mzdata.xml", profstep=0.01, profmethod="bin")
QC1.eic <- getEIC(QC1.raw, mzrange=MzRange, rtrange=RTRange)
plot(QC1.eic, mzrange=MzRange, rtrange=RTRange)
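
For anyone who hits the same "invalid 'length' argument" error, here is a minimal base-R sketch (no xcms needed) of why cbind makes the difference. It assumes, going only by the error message quoted in this thread, that getEIC sizes its result with nrow(rtrange). The variable names and the second m/z window are made up for illustration, and pairing one retention-time row with each m/z row is also an assumption.

```r
# A plain numeric vector has no dimensions, so nrow() returns NULL.
rt_vec <- c(705, 740)
is.null(nrow(rt_vec))

# Asking for a list of length NULL fails; capture the message instead of stopping.
err <- tryCatch(vector("list", nrow(rt_vec)),
                error = function(e) conditionMessage(e))

# cbind() turns the two limits into a one-row, two-column matrix.
rt_mat <- cbind(705, 740)
dim(rt_mat)

# Each row is one window, so several ions become several rows.
# The second m/z window is a made-up value, for illustration only.
mz_mat <- rbind(c(512.0, 512.4),
                c(513.0, 513.4))

# Repeat the same retention-time window once per m/z row.
rt_multi <- matrix(rep(c(705, 740), each = nrow(mz_mat)), ncol = 2)
rt_multi
```

So "two-column matrix" just means one row per window, with the lower limit in column 1 and the upper limit in column 2. A single window is a matrix with one row.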

XCMS / How do I plot the EIC from an xcmsRaw object?

May 31, 2012, 12:08:03 PM

Thanks to some helpful advice earlier, I can plot the EICs of mass features from an xcmsSet object, but I'm having trouble getting it to work for an xcmsRaw object when there isn't a groupidx to call. I want to look at one sample, QC1, and I want to see the extracted-ion chromatogram for 512.27 m/z from 705 to 740 seconds. This is what I've tried:

Code:

QC1.raw <- xcmsRaw("QC1.mzdata.xml", profstep=0.01, profmethod="bin")
QC1.eic <- getEIC(QC1.raw, mzrange=c(512.0, 512.4), rtrange=c(705,740))
plot(QC1.eic, mzrange=c(512.0, 512.4), rtrange=c(705,740))

When I try the "getEIC" command, I get this error:
Error in vector("list", nrow(rtrange)) : invalid 'length' argument

There may also be an error in the plot command, but I haven't been able to get past getEIC, so I'm not sure whether the plot command will work the way that I want.

Ultimately, what I want to do is use my own data to get graphs like those in figure 1 from "Highly sensitive feature detection for high resolution LC/MS" by Ralf Tautenhahn, Christoph Böttcher and Steffen Neumann, 2008, BMC Bioinformatics. I can get the plotRaw function to work, which gives me the upper of the two graphs in figure 1, but I can't figure out how to make the lower graph. The help file for XCMS says that mzrange and rtrange should each be a two-column matrix, but I don't really understand what that means. I mean, for the retention time, for example, I just want to look at 705 to 740 seconds, so that would be one row. What would the other rows in the matrix be?

Thanks!

Laura

CAMERA / Re: Way too many ions being assigned to the same compound

May 31, 2012, 11:38:53 AM

Thank you very much, Jan and Carsten. I was planning to use other tools for statistical analyses; your answers helped clarify the intent of CAMERA for me, though, and that was very helpful.

Thanks!

Laura

CAMERA / Re: Way too many ions being assigned to the same compound

May 30, 2012, 03:21:10 PM

Hi, Jan.

Ah, yes, I was misunderstanding how it works. That helps!

Mind if I ask a personal preference question, then? When you are first setting out to analyze your data and you have a list of mass features and their intensities from XCMS, do you try to do anything to assign which ions might come from the same compound before doing any statistics on your data set? Or do you use the output from diffreport or peakTable as is, figure out which ions are the most statistically significant for your research question and then use CAMERA solely to start structure elucidation?

Thank you very much for all your help!

Laura

CAMERA / Re: Way too many ions being assigned to the same compound

May 30, 2012, 01:44:32 PM

Thank you very much for your response! That clarified a lot! The software that I was using previously was Agilent's MassHunter Qualitative Analysis, and while its algorithm is proprietary and black-box, my understanding of how it works is that anytime it can reasonably assign an isotopic peak or an adduct peak to something, then and only then it will count those as the same compound. In other words, it assumes that co-eluting ions arise because of different compounds unless it has reason to believe they're the same.

But what you're telling me, then, is that CAMERA works with the opposite assumption: CAMERA assumes that co-eluting ions are the same compound unless it has a reason to think that they are not. Wouldn't it be better the other way from a statistical perspective?

If the burden of proof is to show that ions are caused by the same compound, then you're probably going to miss some ions that really are caused by the same compound and classify them as different. When you do statistical testing, then, the compounds will not be completely independent. On the other hand, if the burden of proof is to show that ions are caused by different compounds, then you'll sometimes mistakenly assign ions arising from multiple compounds as belonging to just one compound. If that happens, unless you're ridiculously lucky (or maybe unlucky), you'd probably have issues with false negatives because some compounds in that peak group might correlate with what you want and many would not. You'd increase the "noise" of your data a lot by misassignment of peak groups.

I see what you're saying about correlating across samples, and that makes sense to me if you're comparing two groups and calcCiS looks within one group at a time. Is that how it works? I mean, if you had some compound that was interesting because it's high in group 1 and low in group 2, what does CAMERA do with that information when it's calculating correlations across samples? And what about situations where you're not comparing two groups? In my research, I'm trying to find compounds that correlate with a separate measurement from the same subjects. I don't have multiple groups; I'm looking for what compound correlates linearly with this separately determined measurement. I expect that compounds that wind up being interesting to us will never have the same intensity across samples.

Laura

XCMS / Re: Undesired filtering somewhere

May 30, 2012, 12:22:38 PM

Thanks, Carsten! That helped!

Laura

CAMERA / Way too many ions being assigned to the same compound

May 30, 2012, 12:17:56 PM

I'm trying to use CAMERA to annotate peaks from data collected in ESI- on a QToF in single-MS mode. I started off using the parameters listed in "LC-MS Peak Identification and Annotation with CAMERA" by Carsten Kuhl, Ralf Tautenhahn and Steffen Neumann, and then, when those parameters gave the result that many, many ions were all caused by the same compound, I tried adjusting. I've tried making my parameters stricter and stricter, and I've now got parameters that suggest that we've got the world's most amazingly mass- and retention-time-accurate QToF, but I'm still coming up with the same number of grouped features every time.

For example, a bunch of stuff co-elutes around 12.5 minutes, and CAMERA has put 137 ions into that pcgroup, and I just can't believe that one compound could really generate 137 ions.

What am I doing wrong? Am I misunderstanding the output? I thought that CAMERA would take a peak-picked, peak-aligned and peak-filled XCMS object and determine which of all those mass features were caused by the same compound. For example, let's say that two compounds co-elute and each generates one Na adduct and one 13C peak in addition to their major, monoisotopic peak. Doesn't CAMERA then decipher those data and tell you, "Hey, it looks like you've got two different compounds that co-elute and these three peaks are because of compound A and those three peaks are because of compound B."? Wouldn't those two compounds have a different number listed under pcgroup?

Here's the code I'm using, in case that's illuminating.

Code:

Set1.annot <- xsAnnotate(Set1.filledpeaks)
Set1.F <- groupFWHM(Set1.annot, perfwhm=0.005)
Set1.C <- groupCorr(Set1.F, cor_eic_th=1.6, pval=0.001, calcIso=TRUE, 
                      calcCiS=TRUE)
Set1.FI <- findIsotopes(Set1.C, maxcharge=3, maxiso=4, ppm=10, 
                          mzabs=0.0001, intval="maxo", minfrac=0.1)
Set1.FA <- findAdducts(Set1.FI, ppm=10, mzabs=0.0001, multiplier=3, 
                         polarity="negative", rules=NULL, max_peaks=100)
Set1.peaklist <- getPeaklist(Set1.FA)
write.csv(Set1.peaklist, file="Set1 annotated peaklist.csv")

By the way, what are the allowable numbers for cor_eic_th? Is that referring to equation 1 in Kuhl 2012 Analytical Chemistry? So can that parameter range from 0 to 3?

Thank you very much in advance!

Laura

XCMS / Re: Undesired filtering somewhere

May 26, 2012, 12:27:58 AM

Hi, Ralf.

These are QToF data. I picked the value of mzwid=0.004 because, for peaks that appear in multiple samples, the difference in m/z between a peak detected in one sample and the same peak detected in another sample by our instrument is generally slightly less than 0.004 m/z. I saw in Table 1 of the Nature Protocols paper from February by you, Gary Patti and Gary Siuzdak that you recommend using mzwid=0.025 for HPLC/QToF data or mzwid=0.015 for high-resolution HPLC/QToF data, and ppm=30 and ppm=15, respectively. How did you determine those numbers? We're scanning from 100 to 1000 m/z in our runs, and the mean m/z we detect in plasma and urine samples is ~400 m/z. A difference between samples of 0.004 m/z for a molecule with m/z=400 is 10 ppm, so shouldn't I set mzwid=0.004 and ppm=10? But maybe you aren't determining those parameters the way that I'm thinking, because a difference in m/z of 0.025 for a common small molecule would be ~60 ppm, not 30 ppm.

Thanks for any clarification!

Laura
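
The ppm arithmetic in this post can be checked with a few lines of base R. This is only a sketch of the conversion as the post describes it (absolute m/z difference divided by m/z, times one million). to_ppm is a hypothetical helper, not an xcms function, and it says nothing about how the published mzwid and ppm recommendations were actually derived.

```r
# Convert an absolute m/z difference into parts per million at a given m/z.
# to_ppm is a hypothetical helper, not part of xcms.
to_ppm <- function(delta_mz, mz) delta_mz / mz * 1e6

ppm_004 <- to_ppm(0.004, 400)   # the 0.004 m/z spread at m/z 400
ppm_025 <- to_ppm(0.025, 400)   # the recommended mzwid=0.025 at m/z 400

# The same absolute width expressed in ppm across the 100 to 1000 m/z scan range.
mz <- c(100, 400, 1000)
ppm_across <- data.frame(mz = mz, ppm = to_ppm(0.004, mz))

print(ppm_004)
print(ppm_025)
print(ppm_across)
```

The last table shows why a fixed m/z width and a fixed ppm value are not interchangeable: the same 0.004 m/z is a much larger relative error at m/z 100 than at m/z 1000.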

XCMS / Re: Undesired filtering somewhere

May 20, 2012, 05:05:32 PM

Great! Thanks, Ralf; that worked! I changed the group function command, adding the minfrac parameter like this:

Code:

group(U5g.raw, method="density", minsamp=1, minfrac=0, mzwid=0.004, bw=10, max=10)

and that worked! Now, I've got 13,541 mass features.

That was the missing piece. I wasn't sure which would overrule which, the minfrac or the minsamp parameter. It appears that XCMS uses whichever filtering level is higher.

Thank you for your help!

Laura
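
A minimal base-R sketch of the "whichever filtering level is higher" observation, for illustration only. It assumes that the density grouping checks both thresholds within each sample class (folder), keeps a feature if any one class passes, and uses minfrac=0.5 when minfrac is not given. keep_feature is a hypothetical helper, not part of xcms, and the sample counts are made up.

```r
# keep_feature is a hypothetical helper, not an xcms function.
# counts:     number of samples in each class where the feature was found
# class_size: total number of samples in each class
# A feature is kept if at least one class meets BOTH thresholds.
keep_feature <- function(counts, class_size, minfrac = 0.5, minsamp = 1) {
  any(counts >= class_size * minfrac & counts >= minsamp)
}

# A feature found in 3 of 12 samples, all from the same treatment group.
four_folders <- keep_feature(counts = c(3, 0, 0, 0), class_size = c(3, 3, 3, 3))
one_folder   <- keep_feature(counts = 3, class_size = 12)
no_minfrac   <- keep_feature(counts = 3, class_size = 12, minfrac = 0)

print(c(four_folders = four_folders,
        one_folder   = one_folder,
        no_minfrac   = no_minfrac))
```

Under these assumptions, splitting samples into folders makes the fraction threshold easier to meet, which would be consistent with getting more mass features from four folders than from one, and with minfrac=0 removing the difference.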

XCMS / Undesired filtering somewhere

May 16, 2012, 10:55:07 PM

For metabolomics data, the biostatisticians with whom we collaborate strongly discourage any kind of filtering of mass features based on group membership. For example, they say that it would be a bad idea to detect peaks and then, during peak alignment, only keep mass features that were present in 50% of the samples in treatment group A or 50% of the samples in treatment group B. Better, they say, to keep mass features present in 50% of all samples so that your data preprocessing steps do not bias your outcome. With that in mind, I'm trying to make sure that XCMS is not filtering at all. Here's my issue: I get different results when I group my samples -- putting my raw data files into different folders -- than when I do not, so somewhere, I'm filtering when I don't mean to be. Here is the code I'm using:

Code:

Samples <- list.files(getwd(), pattern="mzdata.xml", full.names=FALSE, recursive=TRUE)

U5g.raw <- xcmsSet(Samples, method="centWave", snthresh=10, ppm=15, peakwidth=c(6,12), mzCenterFun="apex", integrate=1, fitgauss=TRUE)

U5g.raw

U5g.grouped <- group(U5g.raw, method="density", minsamp=1, mzwid=0.004, bw=10, max=10)

U5g.RTcor <- retcor(U5g.grouped, missing=15, extra=30, smooth="loess", family="symmetric", plottype="mdevden")

U5g.grouped2 <- group(U5g.RTcor, method="density", minsamp=1, mzwid=0.004, bw=10, max=10)

U5g.filledpeaks <- fillPeaks(U5g.grouped2)

U5g.peaks <- peakTable(U5g.filledpeaks, filebase="ESI+ urine 5g peak table")

When I have my samples in 4 folders, one each for the four treatment groups I've got, I get 12,793 mass features. (Yes, I know that many of those are noise and that I'm probably too stringent on some of my mass spectral resolution parameters. :-) I'll adjust that later, once I better understand what's going on here.) When I put those exact same data files all together into one folder, I get 3,806 mass features.

Anyone have any thoughts on what's going on? I thought that if I put "minsamp=1" for a grouping parameter that meant that I wasn't filtering at all, but if I'm not filtering based on group membership, why do I get a different number of mass features when I group my samples by treatment group than when I don't?

Thanks in advance. This board has really, really been helpful to me in the past!

Laura