1
Messages
This section allows you to view all Messages made by this member. Note that you can only see Messages made in areas you currently have access to.
Messages - LauraShireman
2
XCMS / Re: Beginner to MS, how to pick parameters?
* I set integrate to 1 instead of 2 in the xcmsSet step.
* I set more of my parameters in the group step. Here is how I typically do it:
Code: [Select]
group(xset_cent20, method="density", bw=4, minfrac=0, minsamp=1, mzwid=0.007, max=100)
One thing I do to make sure that I'm getting peaks that look reasonable is I randomly pick 10 samples and 30 mass features and plot their chromatograms before and after retention time correction and also plot their raw mass spectral data. Here is an example of how I do it (modify this code to suit your needs):
Code: [Select]
# MyData is my peak table; MyData.peaks in the loop below is assumed to be the same table.
# Samples are the samples I've processed
# MyData.filledpeaks are the data after the fillPeaks() step
MFs <- as.numeric(sample(row.names(MyData), 30))
RandSamp <- as.numeric(sample(length(Samples), 10))
write.csv(MFs, paste(Sys.Date(), "Randomly selected mass features.csv"))
write.csv(RandSamp, paste(Sys.Date(), "Randomly selected samples.csv"))
EIC.uncorrected <- list()
EIC.corrected <- list()
# This next step will take some time to process, so don't expect instant results.
for (m in MFs){
EIC.uncorrected[[m]] <- getEIC(MyData.filledpeaks, rt="raw", groupidx=m, sampleidx=RandSamp)
EIC.corrected[[m]] <- getEIC(MyData.filledpeaks, rt="corrected", groupidx=m, sampleidx=RandSamp)
}
ColRainbow <- colorRampPalette(c("green", "blue", "purple"))
MyColors <- c(ColRainbow(length(RandSamp)-1), "red")
xset.raw <- xcmsRaw(Samples[RandSamp[10]], profstep=0.01, profmethod="bin")
pdf(paste(Sys.Date(), "MyData EICs and mass spectra of random mass features.pdf"), 8.5,11)
# 1st column shows the uncorrected EICs.
# 2nd column shows the RT-corrected EICs.
# 3rd column shows the m/z vs. RT for the last randomly selected sample
# (RandSamp[10], drawn in red in the EICs) for that compound, with a
# dashed horizontal line where the calculated m/z is.
par(mfrow=c(4,3), mar=c(3,3,3,0.5))
for(i in 1:30){
m <- MFs[i]
plot(EIC.uncorrected[[m]], MyData.filledpeaks, groupidx=1, rtrange=60, col=MyColors, main=m)
mtext(paste(i, MyData.peaks$MassFeature[m]), side=3, line=-1, adj=0, padj=0, cex=0.8)
plot(EIC.corrected[[m]], MyData.filledpeaks, groupidx=1, rtrange=60, col=MyColors)
RT <- MyData.peaks$rt[m]
RTRange <- c(RT-30, RT+30)
mz <- MyData.peaks$mz[m]
mzRange <- c(mz-0.02, mz+0.02)
mzRange.poly.low <- mz- mz*7.5/1e6
mzRange.poly.up <- mz*7.5/1e6 + mz
plotRaw(xset.raw, mzrange=mzRange, rtrange=RTRange, log=FALSE)
abline(h=mz, lty=2, col="gray35")
mtext(paste("abund =", round(MyData.peaks[m, (length(RandSamp))], digits=0)), side=3, line=-1, adj=0, padj=0, cex=0.8)
polygon(c(RTRange[2], RTRange[1], RTRange[1], RTRange[2]),
c(mzRange.poly.up, mzRange.poly.up, mzRange.poly.low, mzRange.poly.low),
col=adjustcolor("blue", alpha.f=0.1), border=NA)  # translucent band marking +/- 7.5 ppm
abline(v=RT, lty=2, col="gray35")
}
dev.off()
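Because sample() draws a different set on every run, the selection above can only be recovered from the CSV files it writes. Here is a minimal sketch of one alternative, assuming a fixed seed is acceptable for your workflow: call set.seed() before sampling so the same mass features and samples are drawn each time. The peak table and file names below are toy stand-ins, and pick_random() is a hypothetical helper, not part of xcms.

```r
# Toy stand-ins (hypothetical): a peak table with 200 rows and 40 sample files.
MyData <- data.frame(mz = seq(100, 500, length.out = 200))
Samples <- sprintf("sample_%02d.mzdata.xml", 1:40)

# Hypothetical helper: draw 30 mass features and 10 samples under a fixed seed.
pick_random <- function(seed) {
  set.seed(seed)  # fixing the seed makes the draw repeatable
  list(
    MFs = as.numeric(sample(row.names(MyData), 30)),
    RandSamp = sample(length(Samples), 10)
  )
}

first <- pick_random(42)
second <- pick_random(42)
identical(first, second)  # TRUE: same seed, same selection
```

Writing the selections to CSV, as in the code above, is still worthwhile as a record; the seed just makes it possible to rerun the script and get the same plots.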
3
XCMS / Re: Comparing XCMS batch version to XCMSonline
4
XCMS / Re: Why do I have fewer compounds with more samples?
5
XCMS / Why do I have fewer compounds with more samples?
Here's an example of my code:
Code: [Select]
Samples <- list.files(getwd(), pattern="mzdata.xml", full.names=F, recursive=TRUE)
xs1 <- xcmsSet(Samples[1:50], method = "centWave", ppm=15, peakwidth=c(4,12),
snthresh = 5, mzCenterFun="apex", prefilter=c(5,500),
integrate = 1, fitgauss= TRUE)
xs2 <- xcmsSet(Samples[51:150], method = "centWave", ppm=15, peakwidth=c(4,12),
snthresh = 5, mzCenterFun="apex", prefilter=c(5,500),
integrate = 1, fitgauss= TRUE)
xset.grouped <- group(c(xs1, xs2), method="density", bw=4,
minsamp=1, mzwid=0.007, max=500)
xset.RTcor <- retcor(xset.grouped, method="peakgroups",
missing=20, extra=50, smooth="loess",
family="symmetric", plottype="none")
xset.grouped2 <- group(xset.RTcor, method="density", minsamp=1,
mzwid=0.007, bw=2, max=500)
xset.filledpeaks <- fillPeaks(xset.grouped2)
xset.peaks <- peakTable(xset.filledpeaks, filebase="xset peak table")
If I only align xs1, I get more compounds than if I align both xs1 and xs2.
Thanks for any help!
Laura
6
XCMS / Re: Saving an xcmsSet object
7
XCMS / Saving an xcmsSet object
Many thanks in advance for any help! You guys are great!
Laura
8
XCMS / Re: Generating centWave Images(from Ralf's Bioinformatics pa
9
XCMS / Re: looking for xcms setup help for untargeted metabolomics
Sorry, I should have been checking back here more frequently!
I have just a few thoughts on your issues. First, I wonder if the mass error of your instrument is just too large to work well with XCMS. As someone working in an academic lab, I certainly understand limitations on how nice your instrumentation is, but a mass error of 400-800 ppm is awfully high for metabolomics. Most instruments people use for metabolomics have better mass accuracy than what you're stuck with, and I wonder if there's something in the code for xcms that just doesn't accommodate such a large imprecision in mass measurements. Probably, when Colin Smith, Ralf Tautenhahn, et al. coded xcms, they did so with the mindset that people would be using instruments with relatively good mass accuracy. Have you tried any other metabolomics peak picking and peak alignment software? Some other options: MSInspect SmallMolecule, MetAlign, MZMine. I'm not terribly familiar with them, as I've been pretty happy with xcms.
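To put those numbers in perspective, a relative error in ppm converts to an absolute m/z tolerance as mz * ppm / 1e6. The sketch below is only an illustration: ppm_to_mz() is a hypothetical helper (not an xcms function), and the m/z of 512.25 is an arbitrary example value.

```r
# Hypothetical helper: convert a relative mass error in ppm
# to an absolute m/z tolerance.
ppm_to_mz <- function(mz, ppm) mz * ppm / 1e6

mz <- 512.25                   # arbitrary example m/z
tol_15  <- ppm_to_mz(mz, 15)   # about 0.0077
tol_400 <- ppm_to_mz(mz, 400)  # about 0.2049
tol_800 <- ppm_to_mz(mz, 800)  # about 0.4098

# Lower and upper m/z bounds at each error level
windows <- rbind(ppm15  = mz + c(-1, 1) * tol_15,
                 ppm400 = mz + c(-1, 1) * tol_400,
                 ppm800 = mz + c(-1, 1) * tol_800)
windows
```

At 800 ppm the window around this m/z spans roughly 0.8 m/z units in total, compared with about 0.015 at 15 ppm.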
Second, you mentioned that some of your problem peaks are really narrow; I was just rereading Tautenhahn 2008 BMC Bioinformatics, and starting on p. 4 of that paper, they talk about the advantage of fitting chromatographic data with varying widths to the Mexican hat wavelet they describe. This would be the parameter where you set integrate=1 within the xcmsSet command. On p. 8 of that paper, they say, "Optionally, a Gaussian curve is fitted to the feature, using the Nonlinear Least Squares (NLS) implementation of R." That must be fitgauss=TRUE within the same command. Have you tried setting fitgauss=FALSE? I'm not sure how the Gaussian fit works in conjunction with the Mexican hat wavelet function, but maybe they're not playing together nicely in your data. The Mexican hat wavelet sounds as though it IS what you want, though, based on my understanding of this paper and the nature of your data.
Third, your error with mzCenterFun="wMean": Have you tried mzCenterFun="apex"? How close to a Gaussian shape are your peaks? I suspect that the problem lies with your poor mass resolution. I don't know the code, but it probably isn't used to having to include such a large range of m/z to calculate the weighted mean of a single peak. Have you looked at plotRaw to get a visual representation of what your peaks look like on the mass and time axes? Here's an example:
Code: [Select]
xset.raw <- xcmsRaw("filename.mzdata.xml", profstep=0.01, profmethod="bin")
mzRange=c(512.2,512.3)
RTRange=c(705,740)
plotRaw(xset.raw, mzrange=mzRange, rtrange=RTRange, log=FALSE)
When I do this with my data, which was collected on a pretty good QTOF, I can clearly see that at the apex of the peak, I have the best mass accuracy, and at the sides, the mass accuracy is pretty low, which is what you'd expect. What do your data look like? (I'd insert a picture here, but I'm not sure how to do that with this site. It's asking for a URL for the image, and I'm not sure where I'd put it.)
Fourth, you asked about peak insertion errors. What do you mean? I wonder if the problem with the fitgauss parameter comes back to poor mass accuracy again.
I'm so sorry that this has been so frustrating for you! I can certainly relate, although I haven't had the exact problems you describe. Is there any way at all that you might get access to a better instrument?
Good luck. I'll try to be better about checking back here more frequently.
Laura
10
XCMS / Re: looking for xcms setup help for untargeted metabolomics
I'm not sure what you mean by finding features manually. How many features are we talking about that you would find by hand? I typically find thousands of mass features in my data after processing by XCMS, and the thought of looking for those manually makes me think I'd look for a new job first! Are you sure that XCMS is the right approach for you? If you're doing a targeted analysis where you just have a bunch of compounds that you're looking for, it might be better to just set up a targeted method using the instrument manufacturer's software.
If you're not finding mass features that you expect to find, i.e. mass features that you're certain are real, there are a few places where you could have settings that are off. First, your xcmsSet parameter: It sounds like you have an older instrument. Is it giving you profile data? If so, then the default method for findPeaks, which is what you've got, is probably fine. You might try tightening the step size to something smaller, but I defer to your expertise on how your own instrument performs in terms of resolution.
I don't use the findPeaks default method within xcmsSet, so I'm not terribly familiar with it. How about trying a different algorithm? Have you tried centWave? You could set it up like this:
Code: [Select]
xcmsSet(Samples, method = "centWave", snthresh = 3, ppm=1000, peakwidth=c(2,20), mzCenterFun="wMean", integrate = 1, fitgauss= TRUE)
That's a pretty wide chromatographic window, and the Gaussian fit is somewhat flexible, to my understanding.
For the grouping question, I prefer at this point in my data analysis to leave out any filtering by how frequently a mass feature is present. Instead, I filter later, once I've already got my xcms dataset and I'm ready to do some statistical analyses. An example of how I run the group algorithm where "Data.RTcor" is my RT-corrected dataset:
Code: [Select]
group(Data.RTcor, method="density", minsamp=1, mzwid=0.004, bw=10, max=100)
One last thing you didn't address: I ALWAYS do recursive peak filling. There have been times that recursion has found peaks that xcms missed the first time around. The code is really straightforward:
Code: [Select]
fillPeaks(Data)
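Going back to the grouping question: filtering later by how frequently a mass feature is present could look something like the sketch below. This is only an illustration on a toy peak table, not real data. The column names, the 0.5 cutoff, and the use of NA to mark "not detected" are all assumptions; how absence is encoded (NA, 0, or a filled-in value) depends on your own pipeline.

```r
# Toy peak table (hypothetical): 4 mass features x 5 samples of intensities.
# NA marks a sample where the feature was not detected.
peaks <- data.frame(
  mz = c(150.05, 230.11, 310.20, 512.25),
  S1 = c(1200, NA, 900, 400),
  S2 = c(1100, NA, NA, 380),
  S3 = c(1300, 50, NA, 410),
  S4 = c(1250, NA, NA, NA),
  S5 = c(1150, NA, 870, 395)
)
sample_cols <- c("S1", "S2", "S3", "S4", "S5")

# Fraction of samples in which each mass feature was detected
frac_present <- rowMeans(!is.na(peaks[, sample_cols]))

# Keep features detected in at least half of the samples (assumed cutoff)
min_frac <- 0.5
filtered <- peaks[frac_present >= min_frac, ]
filtered$mz  # 150.05 512.25
```

Doing this after the xcms steps, rather than through minfrac or minsamp in group(), means the cutoff can be changed without reprocessing the raw files.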
Good luck!
Laura Shireman
11
METLIN / Re: MS/MS Spectrum Match is available now
A question: To see the exact m/z of a peak in an MS/MS spectrum from a METLIN hit, I need to MouseOver that peak. Is it possible to get the m/z and intensities in a csv or other numerical format? I want to compare the peaks I see in the database side by side with the peaks in my spectrum, and it would be much faster if I could just download the numbers rather than mousing over each and writing down the value.
Thanks!
Laura
12
XCMS / Re: XCMS2: collect() doesn't work
Thank you, as usual, for being so helpful, Jan!
Laura
13
XCMS / Re: XCMS2: collect() doesn't work
Error in class(xs) : 'xs' is missing
Thanks, Jan.
Does anyone know of an example with the R code for using XCMS2?
Laura
14
XCMS / Re: XCMS2: collect() doesn't work
Is there another example somewhere that you would recommend? I still don't really understand how to begin, and I'm looking for an example to work from. I tried this:
Code: [Select]
Sample <- list.files(getwd(), pattern="mzdata.xml", full.names=TRUE, recursive=TRUE)
Data.set <- xcmsSet(Sample)
Data.fragments <- xcmsFragments(Data.set)
Data.coll <- collect(Data.set, rt=30)
but that gave me the same error as before when I got to the collect() step.
Laura
15
XCMS / XCMS2: collect() doesn't work
Code: [Select]
Error in function (classes, fdef, mtable) :
unable to find an inherited method for function "collect", for signature "xcmsRaw"
Here is the code I'm using:
Code: [Select]
library(xcms)
Data.raw <- xcmsRaw(filename="20110724pooledplasma-MSMS-06.mzdata.xml", includeMSn=TRUE)
Data.raw
Data.coll <- collect(Data.raw, rt=30)
Any suggestions? Thanks in advance!
Laura