
Messages - LauraShireman

1
XCMS / Remove a sample after xcmsSet step
I should have been more careful, but I accidentally included a file that failed to inject when I ran the xcmsSet command. When I tried to perform retcor() later in the script, my failed injection is mucking things up because there are too few peak groups. Is there some way to remove a specific sample from an xcmsSet() object and keep all the others?
2
XCMS / Re: Beginner to MS, how to pick parameters?
Your parameters sound pretty good to me for the most part. A few notable differences between what you're doing and what I usually do:
* I set integrate to 1 instead of 2 in the xcmsSet step.
* I set more of my parameters in the group step. Here is how I typically do it:
Code: [Select]
group(xset_cent20, method="density", bw=4, minfrac=0, minsamp=1, mzwid=0.007, max=100)

To check that my peaks look reasonable, I randomly pick 10 samples and 30 mass features, plot their chromatograms before and after retention time correction, and also plot their raw mass spectral data. Here is an example of how I do it (modify this code to suit your needs):
Code: [Select]
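# MyData is my peak table.
# MyData.peaks (used below) is assumed to be that same peak table with an added
# MassFeature label column; rename to match your own objects.
# Samples are the samples I've processed
# MyData.filledpeaks are the data after the fillPeaks() step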

MFs <- as.numeric(sample(row.names(MyData), 30))
RandSamp <- as.numeric(sample(length(Samples), 10))
write.csv(MFs, paste(Sys.Date(), "Randomly selected mass features.csv"))
write.csv(RandSamp, paste(Sys.Date(), "Randomly selected samples.csv"))

EIC.uncorrected <- list()
EIC.corrected <- list()

# This next step will take some time to process, so don't expect instant results.
for (m in MFs){
      EIC.uncorrected[[m]] <- getEIC(MyData.filledpeaks, rt="raw", groupidx=m, sampleidx=RandSamp)
      EIC.corrected[[m]] <- getEIC(MyData.filledpeaks, rt="corrected", groupidx=m, sampleidx=RandSamp) 
}


ColRainbow <- colorRampPalette(c("green", "blue", "purple"))
MyColors <- c(ColRainbow(length(RandSamp)-1), "red")

xset.raw <- xcmsRaw(Samples[RandSamp[10]], profstep=0.01, profmethod="bin")

pdf(paste(Sys.Date(), "MyData EICs and mass spectra of random mass features.pdf"), 8.5,11)

# 1st column shows the uncorrected EICs.
# 2nd column shows the RT-corrected EICs.
# 3rd column shows the m/z vs. RT for the last randomly selected sample (the one
# plotted in red) for that compound, with a dashed horizontal line where the
# calculated m/z is.

par(mfrow=c(4,3), mar=c(3,3,3,0.5))
for(i in 1:30){
      m <- MFs[i]
      # m is already the mass feature's row number, so use it directly as the title
      plot(EIC.uncorrected[[m]], MyData.filledpeaks, groupidx=1, rtrange=60, col=MyColors, main=m)
      mtext(paste(i, MyData.peaks$MassFeature[m]), side=3, line=-1, adj=0, padj=0, cex=0.8)
      plot(EIC.corrected[[m]], MyData.filledpeaks, groupidx=1, rtrange=60, col=MyColors)
     
      RT <- MyData.peaks$rt[m]
      RTRange <- c(RT-30, RT+30)
     
      mz <- MyData.peaks$mz[m]
      mzRange <- c(mz-0.02, mz+0.02)
      mzRange.poly.low <- mz- mz*7.5/1e6
      mzRange.poly.up <- mz*7.5/1e6 + mz
     
      plotRaw(xset.raw, mzrange=mzRange, rtrange=RTRange, log=FALSE)
      abline(h=mz, lty=2, col="gray35")
      mtext(paste("abund =", round(MyData.peaks[m, (length(RandSamp))], digits=0)), side=3, line=-1, adj=0, padj=0, cex=0.8)
      polygon(c(RTRange[2], RTRange[1], RTRange[1], RTRange[2]),
              c(mzRange.poly.up, mzRange.poly.up, mzRange.poly.low, mzRange.poly.low),
              # adjustcolor() is in base R (grDevices) and makes the fill transparent
              col=adjustcolor("blue", alpha.f=0.1), border=NA)
      abline(v=RT, lty=2, col="gray35")
     
}

dev.off()
3
XCMS / Re: Comparing XCMS batch version to XCMSonline
I might be able to help. I've experienced something similar and found that there was filtering going on when I wasn't expecting it in the group() command. If you include your code as well as what parameter settings you used online, I might be able to help.
5
XCMS / Why do I have fewer compounds with more samples?
When I use xcms to process 50 samples, I get ~4000 compounds for one particular dataset. When I use xcms to process 150 samples -- including the original 50 samples -- I get ~1500 compounds. What's going on? Why would more samples result in fewer compounds? This is particularly disconcerting for these data because the aligned data with 50 samples include a compound we're interested in and the aligned data with 150 samples do not.

Here's an example of my code:
Code: [Select]
Samples <- list.files(getwd(), pattern="mzdata.xml", full.names=F, recursive=TRUE)

xs1 <- xcmsSet(Samples[1:50], method = "centWave",  ppm=15, peakwidth=c(4,12),
              snthresh = 5, mzCenterFun="apex", prefilter=c(5,500),
              integrate = 1, fitgauss= TRUE)

xs2 <- xcmsSet(Samples[51:150], method = "centWave",  ppm=15, peakwidth=c(4,12),
              snthresh = 5, mzCenterFun="apex", prefilter=c(5,500),
              integrate = 1, fitgauss= TRUE)

xset.grouped <- group(c(xs1, xs2), method="density", bw=4,
                          minsamp=1, mzwid=0.007, max=500)

xset.RTcor <- retcor(xset.grouped, method="peakgroups",
                        missing=20, extra=50, smooth="loess",
                        family="symmetric", plottype="none")

xset.grouped2 <- group(xset.RTcor, method="density", minsamp=1,
                          mzwid=0.007, bw=2, max=500)

xset.filledpeaks <- fillPeaks(xset.grouped2)

xset.peaks <- peakTable(xset.filledpeaks, filebase="xset peak table")

If I only align xs1, I get more compounds than if I align both xs1 and xs2.

Thanks for any help!

Laura
7
XCMS / Saving an xcmsSet object
I need to peak pick and align about 250 samples, and because I forgot to turn off Microsoft's automatic updates, the computer I had set up to do this restarted right in the middle of the xcmsSet step, and I lost about a day of computing time. Lesson learned: I turned OFF automatic updates. However, it made me wonder whether there was some way to save an xcmsSet object that's only partway completed. Is that possible? Could I, for example, run xcmsSet on a subset of my samples, save what I've got so far, then run xcmsSet on the next chunk of samples, save again, etc. and at the end, put everything together into one xcmsSet object?
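Roughly what I have in mind, as a base-R sketch: process the files in chunks, save each finished chunk with saveRDS(), and on a restart reload whatever is already on disk instead of recomputing it. The helper name process_in_chunks and its arguments are made up for illustration, and I have not tested this with xcms itself.

```r
# Hypothetical helper (not part of xcms): run process_fun on chunks of files,
# saving each result to disk so a restart only repeats the unfinished chunks.
process_in_chunks <- function(files, chunk_size, process_fun, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  # Split the file vector into consecutive chunks of at most chunk_size files.
  chunks <- split(files, ceiling(seq_along(files) / chunk_size))
  results <- vector("list", length(chunks))
  for (i in seq_along(chunks)) {
    chunk_file <- file.path(out_dir, sprintf("chunk_%03d.rds", i))
    if (file.exists(chunk_file)) {
      # This chunk finished before the interruption: just reload it.
      results[[i]] <- readRDS(chunk_file)
    } else {
      results[[i]] <- process_fun(chunks[[i]])
      saveRDS(results[[i]], chunk_file)
    }
  }
  results
}
```

With xcms, process_fun would presumably be a wrapper around xcmsSet(), and the saved pieces would then be joined with c() into one xcmsSet object, but that last part is the bit I'm unsure about.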

Many thanks in advance for any help! You guys are great!

Laura
9
XCMS / Re: looking for xcms setup help for untargeted metabolomics
Hi, Nat.

Sorry, I should have been checking back here more frequently!

I have just a few thoughts on your issues. First, I wonder if the mass error of your instrument is just too large to work well with XCMS. As someone working in an academic lab, I certainly understand limitations on how nice your instrumentation is, but 400-800 ppm of error is just awfully high to be doing metabolomics. Most instruments people use for metabolomics would have better mass accuracy than what you're stuck with, and I wonder if there's something in the code for xcms that just doesn't accommodate such a large imprecision in mass measurements. Probably, when Colin Smith, Ralf Tautenhahn, et al. coded xcms, they did so with the mindset that people would be using instruments with relatively good mass accuracy. Have you tried any other metabolomics peak picking and peak alignment software? Some other options: MSInspect SmallMolecule, MetAlign, MZMine. I'm not terribly familiar with them as I've been pretty happy with xcms.

Second, you mentioned that some of your problem peaks are really narrow; I was just rereading Tautenhahn 2008 BMC Bioinformatics, and starting on p. 4 of that paper, they talk about the advantage of fitting chromatographic data with varying widths to the Mexican hat wavelet they describe. This would be the parameter where you set integrate=1 within the xcmsSet command. On p. 8 of that paper, they say, "Optionally, a Gaussian curve is fitted to the feature, using the Nonlinear Least Squares (NLS) implementation of R." That must be fitgauss=TRUE within the same command. Have you tried setting fitgauss=FALSE? I'm not sure how the Gaussian fit works in conjunction with the Mexican hat wavelet function, but maybe they're not playing together nicely in your data. The Mexican hat wavelet sounds as though it IS what you want, though, based on my understanding of this paper and the nature of your data.

Third, your error with mzCenterFun="wMean": Have you tried mzCenterFun="apex"? How close to a Gaussian shape are your peaks? I suspect that the problem lies with your poor mass resolution. I don't know the code, but it probably isn't used to having to include such a large range of m/z to calculate the weighted mean of a single peak. Have you looked at plotRaw to get a visual representation of what your peaks look like on the mass and time axes? Here's an example:
Code: [Select]
xset.raw <- xcmsRaw("filename.mzdata.xml", profstep=0.01, profmethod="bin")
mzRange <- c(512.2, 512.3)
RTRange <- c(705, 740)
plotRaw(xset.raw, mzrange=mzRange, rtrange=RTRange, log=FALSE)
When I do this with my data, which was collected on a pretty good QTOF, I can clearly see that at the apex of the peak, I have the best mass accuracy, and at the sides, the mass accuracy is pretty low, which is what you'd expect. What do your data look like? (I'd insert a picture here, but I'm not sure how to do that with this site. It's asking for a URL for the image, and I'm not sure where I'd put it.)
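To make the difference between the two mzCenterFun options concrete, here's a toy example in plain R (no xcms needed). As I understand it, "wMean" takes the intensity-weighted mean of the m/z values across the peak, and "apex" takes the m/z at the most intense point. The numbers below are made up, and this is only a sketch of the idea, not the actual xcms code.

```r
# Toy data for one chromatographic peak: one centroid per scan.
# These values are invented for illustration, not taken from a real file.
mz        <- c(512.262, 512.258, 512.251, 512.250, 512.249, 512.243, 512.238)
intensity <- c(    150,     900,    4200,    9800,    4600,    1100,     200)

# "wMean"-style center: intensity-weighted mean of the m/z values
mz_wmean <- sum(mz * intensity) / sum(intensity)

# "apex"-style center: m/z of the most intense scan
mz_apex <- mz[which.max(intensity)]

# Difference between the two centers, in ppm
ppm_diff <- (mz_wmean - mz_apex) / mz_apex * 1e6
round(c(wMean = mz_wmean, apex = mz_apex, ppm = ppm_diff), 4)
```

The wider the m/z scatter on the flanks of the peak, the further the weighted mean can drift from the apex value, which is why I'd expect poor mass resolution to matter more for "wMean".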

Fourth, you asked about peak insertion errors. What do you mean? I wonder if the problem with the fitgauss parameter comes back to poor mass accuracy again.

I'm so sorry that this has been so frustrating for you! I can certainly relate, although I haven't had the exact problems you describe. Is there any way at all that you might get access to a better instrument?

Good luck. I'll try to be better about checking back here more frequently.

Laura
10
XCMS / Re: looking for xcms setup help for untargeted metabolomics
Hi, Nat S.

I'm not sure what you mean by finding features manually. How many features are we talking about that you would find by hand? I typically find thousands of mass features in my data after processing by XCMS, and the thought of looking for those manually makes me think I'd look for a new job first! Are you sure that XCMS is the right approach for you? If you're doing a targeted analysis where you just have a bunch of compounds that you're looking for, it might be better to just set up a targeted method using the instrument manufacturer's software.

If you're not finding mass features that you expect to find, i.e. mass features that you're certain are real, there are a few places where you could have settings that are off. First, your xcmsSet parameters: It sounds like you have an older instrument. Is it giving you profile data? If so, then the default method for findPeaks (matchedFilter), which is what you've got, is probably fine. You might try tightening the step size to something smaller, but I defer to your expertise on how your own instrument performs in terms of resolution.

I don't use the findPeaks default method within xcmsSet, so I'm not terribly familiar with it. How about trying a different algorithm? Have you tried centWave? You could set it up like this:
Code: [Select]
xcmsSet(Samples, method = "centWave",  snthresh = 3, ppm=1000, peakwidth=c(2,20), mzCenterFun="wMean",integrate = 1,fitgauss= TRUE)
That's a pretty wide chromatographic window, and the gaussian fit is somewhat flexible, to my understanding.

For the grouping question, I prefer at this point in my data analysis to leave out any filtering by how frequently a mass feature is present. Instead, I filter later, once I've already got my xcms dataset and I'm ready to do some statistical analyses. An example of how I run the group algorithm where "Data.RTcor" is my RT-corrected dataset:
Code: [Select]
group(Data.RTcor, method="density", minsamp=1, mzwid=0.004, bw=10, max=100)
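As a rough sketch of what filtering later could look like, here is a base-R example on a made-up peak table. The column names, the intensities, and the 50% cutoff are all assumptions for illustration, and it assumes that a missing peak shows up as NA, which is only true if you count presence before peak filling.

```r
# Made-up peak table: 4 mass features x 5 samples; NA means no peak was found.
peaks <- data.frame(
  mz = c(150.0551, 301.1412, 512.2503, 733.4020),
  rt = c(62.1, 240.7, 722.4, 901.3),
  S1 = c(1200,  NA, 5400,  NA),
  S2 = c(1500,  NA, 5100, 300),
  S3 = c(  NA, 800, 4900,  NA),
  S4 = c(1100,  NA, 5600, 250),
  S5 = c(1300,  NA, 5000,  NA)
)
sample_cols <- c("S1", "S2", "S3", "S4", "S5")

# Fraction of samples in which each mass feature was detected
present_frac <- rowMeans(!is.na(peaks[, sample_cols]))

# Keep only the features detected in at least half of the samples
filtered <- peaks[present_frac >= 0.5, ]
filtered
```

Doing it this way keeps the full xcms output around, so the cutoff can be changed later without rerunning group().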
One last thing you didn't address: I ALWAYS do recursive peak filling. There have been times that recursion has found peaks that xcms missed the first time around. The code is really straightforward:
Code: [Select]
fillPeaks(Data)

Good luck!

Laura Shireman
11
METLIN / Re: MS/MS Spectrum Match is available now
Hi, Kevin.

A question: To see the exact m/z of a peak in an MS/MS spectrum from a METLIN hit, I need to mouse over that peak. Is it possible to get the m/z values and intensities in a csv or other numerical format? I want to compare the peaks I see in the database side by side with the peaks in my spectrum, and it would be much faster if I could just download the numbers rather than mousing over each peak and writing down the value.

Thanks!

Laura
12
XCMS / Re: XCMS2: collect() doesn't work
Oh, no! How disappointing! That's really too bad. I've been searching METLIN online, but I'm basically copying and pasting by hand, which is tedious, and I was also very excited about the similarity search feature mentioned in the paper describing XCMS2: Benton, H. P., D. M. Wong, S. A. Trauger and G. Siuzdak (2008). "XCMS2: processing tandem mass spectrometry data for metabolite identification and structural characterization." Analytical Chemistry 80(16): 6382-9.

Thank you, as usual, for being so helpful, Jan!

Laura
13
XCMS / Re: XCMS2: collect() doesn't work
Oops! That was an oversight. Fixed the bit of the code so that now I'm using collect(Data.fragments). I still get an error, though:
Error in class(xs) : 'xs' is missing

Thanks, Jan.

Does anyone know of an example with the R code for using XCMS2?

Laura
14
XCMS / Re: XCMS2: collect() doesn't work
Thank you, Jan. The reason I was trying the xcmsRaw() function first was because I was following the steps in this video: http://www.youtube.com/watch?v=eNaKGyyfjT0
Is there another example somewhere that you would recommend? I still don't really understand how to begin, and I'm looking for an example to work from. I tried this:
Code: [Select]
Sample <- list.files(getwd(), pattern="mzdata.xml", full.names=TRUE, recursive=TRUE)

Data.set <- xcmsSet(Sample)

Data.fragments <- xcmsFragments(Data.set)

Data.coll <- collect(Data.set, rt=30)

but that gave me the same error as before when I got to the collect() step.

Laura
15
XCMS / XCMS2: collect() doesn't work
I'm trying to use XCMS2 to search METLIN for MS/MS spectra, and I'm stuck at one of the first steps. When I try the collect() function, I get this error:
Code: [Select]
Error in function (classes, fdef, mtable)  : 
  unable to find an inherited method for function "collect", for signature "xcmsRaw"

Here is the code I'm using:
Code: [Select]
library(xcms)

Data.raw <- xcmsRaw(filename="20110724pooledplasma-MSMS-06.mzdata.xml", includeMSn=TRUE)
Data.raw

Data.coll <- collect(Data.raw, rt=30)

Any suggestions? Thanks in advance!

Laura