1
Topics - cbroeckl
2
XCMS / sigma fitgauss question
1. Why are there peaks in the resulting xcms dataset for which the sigma value is larger, sometimes by an order of magnitude, than the max peakwidth setting?
2. How do I interpret sigma values of 'NA'?
I am guessing that the answer to the latter question is that the Gaussian fit is bad, and no mu/sigma/h can be returned. Any insight is appreciated.
Corey
3
XCMS / group or peakTable problem
Here is an example for the compound catechin:
stand1<-xcmsSet(filenames, nSlaves=2, method="centWave", ppm=25, peakwidth=c(1.5,15), mzdiff=0.01, fitgauss=TRUE, verbose.columns=TRUE)
xstand@peaks[which(xstand@peaks[,"mz"] > 291.086 & xstand@peaks[,"mz"] < 291.087),]
mz mzmin mzmax rt rtmin rtmax into intb maxo sn egauss mu sigma h f dppm scale
[1,] 291.0866 291.0862 291.0870 109.8970 106.897 112.896 1349308.18 1347901.04 338777.25 873 0.08150396 245.1626 3.932065 340105.13 4863 2 2
[2,] 291.0863 291.0818 291.0898 111.3708 107.112 117.395 59929.07 59922.21 15187.01 15186 0.11094711 244.9695 4.436207 13578.48 2527 4 -1
scpos scmin scmax lmin lmax sample
[1,] 244 242 246 49 63 51
[2,] -1 -1 -1 48 72 52
So I am getting the molecular ion that I know to be there in the feature list.
I next try to concatenate this xset (stand1) with a 'background' xset, group them, retention time correct, and regroup:
xstand<-c(xset1, stand1)
xset4 <- group(xstand, bw=2, minfrac=0.5, max = 50, mzwid=0.02)
xset5 <- retcor(xset4, method="loess", family = "gaussian", plottype = "mdevden", span=2, missing=round(length(dataset)*0.05, digits=0))
xset6 <- group(xset5, bw=1, minsamp=0, max= 1000, mzwid=0.02)
xset6
An "xcmsSet" object with 52 samples
Time range: -0.4-1330.2 seconds (0-22.2 minutes)
Mass range: 55.0173-1199.8205 m/z
Peaks: 70298 (about 1352 per sample)
Peak Groups: 397
Sample classes: Library_serumC8
Profile settings: method = bin
step = 0.1
Memory usage: 23.1 MB
This appears to have worked; however, when I generate the peak table and look for my peak of interest, it isn't there.
peaklist <- peakTable(xset6, value="into")
peaklist[which(peaklist[,"mz"] > 291.05 & peaklist[,"mz"] < 291.11),]
...returns an empty table.
The original peak of interest (the molecular ion in this case) is there:
xset6@peaks[which(xset6@peaks[,"mz"] > 291.086 & xset6@peaks[,"mz"] < 291.087),]
mz mzmin mzmax rt rtmin rtmax into intb maxo sn egauss mu sigma h f dppm scale
[1,] 291.0866 291.0862 291.0870 107.1962 104.1962 110.1952 1349308.18 1347901.04 338777.25 873 0.08150396 245.1626 3.932065 340105.13 4863 2 2
[2,] 291.0863 291.0818 291.0898 108.6867 104.4027 114.6857 59929.07 59922.21 15187.01 15186 0.11094711 244.9695 4.436207 13578.48 2527 4 -1
scpos scmin scmax lmin lmax sample
[1,] 244 242 246 49 63 51
[2,] -1 -1 -1 48 72 52
But it is not making it into the peak table, even though I have tried setting the minfrac or minsamp values to zero, or to some trivial non-zero value such as 0.000001. I could work around this by accessing the @peaks slot directly, but that seems like reinventing the wheel. It isn't a mass accuracy grouping artifact either, as there are no peak groups within several daltons of the 291.0866 peak in the xset6 peak table. Does anyone have any idea how I can get a peakTable which hasn't filtered out the 'rare' features?
Thanks
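For reference, the @peaks workaround mentioned above can be sketched in base R. This is only an illustration on a made-up matrix standing in for the @peaks slot; the helper name peaks_near_mz and the ppm tolerance are assumptions for the example, not part of xcms.

```r
# Toy stand-in for the @peaks slot: a numeric matrix with named columns.
# The values are made up for illustration only.
peaks <- cbind(
  mz     = c(291.0866, 291.0863, 305.1021, 132.1022),
  rt     = c(107.2, 108.7, 240.5, 60.1),
  into   = c(1349308, 59929, 20000, 84482),
  sample = c(51, 52, 3, 7)
)

# Hypothetical helper: keep rows whose m/z lies within `ppm` of `target`.
# drop = FALSE keeps the result a matrix even when a single row matches.
peaks_near_mz <- function(peaks, target, ppm = 10) {
  tol <- target * ppm / 1e6
  peaks[abs(peaks[, "mz"] - target) <= tol, , drop = FALSE]
}

hits <- peaks_near_mz(peaks, target = 291.0866, ppm = 10)
print(hits)
```

A relative (ppm) window is used here instead of fixed m/z bounds so the same call works across the mass range; fixed bounds, as in the post, would work equally well for a single compound.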
4
XCMS / peak shape/symmetry?
Corey
5
XCMS / retrieve an averaged spectrum
I am running some infusions of standard compounds and am trying to use the getSpec() function to collect an averaged spectrum for a range of about 30 seconds during which time my standard is eluting.
test<-xcmsRaw("120703_601.CDF", profstep=0.01)
test
spec<-getSpec(test, rtrange=c(45:90))
spec2<-spec[order(spec[,2], decreasing=TRUE),]
spec2[1:20,]
mz intensity
[1,] 132.10220 84482.79
[2,] 132.10226 84476.03
[3,] 132.10229 84468.90
[4,] 132.10216 84434.43
[5,] 132.10213 84392.98
[6,] 132.10240 84370.49
[7,] 132.10243 84337.78
[8,] 132.10246 84293.82
[9,] 132.10205 84216.37
[10,] 132.10262 83990.52
[11,] 132.10196 83972.10
[12,] 132.10170 83246.16
[13,] 86.09711 37450.49
[14,] 86.09715 37449.09
[15,] 86.09710 37446.60
[16,] 86.09720 37443.11
[17,] 86.09707 37436.68
[18,] 86.09725 37416.95
[19,] 86.09694 37371.84
[20,] 86.09693 37367.13
As you can see, I am not getting any averaging of the spectra unless the mass matches exactly, though the documentation for getScan suggests it can be used for averaging multiple scans. Am I missing a parameter setting somewhere (a ppm error, for example), or am I trying to do something the function wasn't designed for? Any advice is greatly appreciated!
Corey
P.S. As a secondary but relevant question: is there any way to subtract a background spectrum, taken from a portion of the infusion before the compound elutes?
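To make the kind of averaging being asked about concrete, here is a minimal base R sketch that bins m/z values onto a fixed grid before averaging, so that values such as 132.10220 and 132.10226 land in the same bin. It does not use getSpec(); the scan data, the helper name average_spectrum and the bin width are all assumptions for illustration.

```r
# Toy scans: each is a two-column matrix of (mz, intensity).
# Values are invented; m/z jitters slightly from scan to scan.
scans <- list(
  cbind(mz = c(86.09711, 132.10220), intensity = c(100, 400)),
  cbind(mz = c(86.09715, 132.10226), intensity = c(120, 420)),
  cbind(mz = c(86.09710, 132.10229), intensity = c(110, 380))
)

# Hypothetical helper: assign each m/z to an integer bin of width `step`,
# sum intensity per bin across all scans, then divide by the number of scans.
average_spectrum <- function(scans, step = 0.01) {
  all_points <- do.call(rbind, scans)
  bin <- round(all_points[, "mz"] / step)      # integer bin index
  summed <- tapply(all_points[, "intensity"], bin, sum)
  data.frame(
    mz        = as.numeric(names(summed)) * step,  # bin centre
    intensity = as.numeric(summed) / length(scans)
  )
}

avg <- average_spectrum(scans, step = 0.01)
print(avg)
```

A fixed grid can split a peak that sits on a bin edge, so a ppm-based tolerance may suit high-resolution data better. Under the same assumptions, a background spectrum binned on the same grid from pre-elution scans could be subtracted bin by bin.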
6
XCMS / XCMS issue on Linux cluster
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] xcms_1.26.1
loaded via a namespace (and not attached):
[1] tools_2.12.2
When I try to run XCMS on a cdf format Waters Q-TOF data file I get an error message which I haven't seen before.
> xset <- xcmsSet(filenames, method = "matchedFilter", fwhm = 8, max = 500, snthresh = 3,
step = 0.05, steps = 2, mzdiff = 0.05, index = FALSE, sleep = 0)
111101_TwinGene_0095b01: Error in if (del == 0 && to == 0) return(to) :
missing value where TRUE/FALSE needed
I have run the same data file using the same script on my Windows 7 desktop in the past.
> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] xcms_1.30.3
>
I have tried to run mzXML formatted data through both, and that format works well on both the Linux and Windows systems. Obviously, there are differences in both the R version and the XCMS package versions. The problem is that I do not have control of the Linux cluster, so I was hoping to get some guidance on a potential resolution before requesting R and package upgrades. I have had problems with cdf files previously using centWave, but not matchedFilter - but on the Linux system, I get an error message using either method. Also worth noting - I am not using any of the parallel functions for this - just a single file/core. So I assume it is more likely a Unix/installation issue than a parallel/cluster issue. Any advice is appreciated.
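One possible reading of the error text: the line `if (del == 0 && to == 0) return(to)` appears in base R's seq.default in R versions of that era, and it fails with "missing value where TRUE/FALSE needed" when one of the bounds passed to seq() is NA. A guess, not confirmed by anything in this post, is that the m/z range read from the CDF file came back as NA on the cluster. The sketch below only shows how an NA bound breaks seq() and how a check could catch it earlier; safe_mz_grid is a hypothetical helper, not an xcms function.

```r
# Hypothetical guard: build an m/z grid only when both range bounds are finite.
safe_mz_grid <- function(mz_range, step = 0.05) {
  if (length(mz_range) != 2 || any(!is.finite(mz_range))) {
    stop("m/z range is not finite: ", paste(mz_range, collapse = ", "))
  }
  seq(floor(mz_range[1] / step) * step,
      ceiling(mz_range[2] / step) * step,
      by = step)
}

# A well-formed range gives a grid.
grid_ok <- safe_mz_grid(c(50, 51), step = 0.5)

# An NA bound (as might result from a failed file read) is caught up front.
bad <- tryCatch(safe_mz_grid(c(NA, 650)),
                error = function(e) conditionMessage(e))

# Calling seq() directly with an NA bound also fails; the exact message
# depends on the R version.
raw <- tryCatch(seq(NA_real_, 650, by = 0.05),
                error = function(e) "seq failed")
```

If this guess is right, inspecting the m/z range of the raw file as read on the cluster would show whether NA values are present before any peak detection is attempted.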
7
CAMERA / Isotope annotation and grouping: approach question
If not, then there is a real possibility that two features which would be correlated using the validation tools within groupCorr might already be separated at that point, as the groupFWHM tool groups by retention time based on a window centered on an abundant feature. Does this sound correct?
Also, if a feature is assigned to a pseudospectrum by groupFWHM, and is removed from a pseudospectrum with groupCorr, is there any effort made to regroup the removed feature with other removed features within a range of retention times?
Just trying to understand the overall approach. Thanks.
8
XCMS / calibrate
##peak detection: GC-MS peak
xset <- xcmsSet(dataset, nSlaves=4, method = "matchedFilter", fwhm = 8, max = 500, snthresh = 3, step = 0.5, steps = 2, mzdiff = 0.1, index = FALSE, sleep = 0)
xset
An "xcmsSet" object with 10 samples
Time range: 601.8-2482.9 seconds (10-41.4 minutes)
Mass range: 50.2301-649.5369 m/z
Peaks: 71753 (about 7175 per sample)
Peak Groups: 0
Sample classes: GC-MS_masscorrection
Profile settings: method = bin
step = 0.5
Memory usage: 8.38 MB
I then try to apply the m/z correction with the calibrate function, using the command:
xset2<- calibrate(xset, calibrants=147.1, method="shift", mzabs=1, neighbours=3, plotres=TRUE)
Error in .local(object, ...) : No masses close enough!
I know for a fact that there are plenty of masses near this mass in the dataset, but calibrate can't seem to find them. In fact, even if I expand the mzabs to 10, I receive the same message. If I expand the mzabs value to 100, it works, and actually does seem to shift all the masses by about 100 Da (note the mass range):
> xset2
An "xcmsSet" object with 10 samples
Time range: 601.8-2482.9 seconds (10-41.4 minutes)
Mass range: 144.0489-743.6364 m/z
Peaks: 71753 (about 7175 per sample)
Peak Groups: 0
Sample classes: GC-MS_masscorrection
Profile settings: method = bin
step = 0.5
Memory usage: 8.38 MB
>
Am I misusing 'calibrate' or misunderstanding it? What am I missing? Thanks,
Corey
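For context, this is roughly what I would expect a "shift" style correction to do, written as a base R sketch on invented m/z values. It is not the xcms implementation of calibrate(); the helper name shift_calibrate and its arguments are assumptions for illustration.

```r
# Toy peak m/z values (invented): a cluster near 147.1 that reads 0.2 too high.
mz <- c(73.25, 147.29, 147.30, 147.31, 221.30)

# Hypothetical helper, not xcms's implementation: find peaks within `mzabs`
# of the calibrant, take their mean error, and subtract it from every m/z.
shift_calibrate <- function(mz, calibrant, mzabs = 1) {
  near <- abs(mz - calibrant) <= mzabs
  if (!any(near)) stop("no peaks within mzabs of the calibrant")
  offset <- mean(mz[near] - calibrant)
  list(mz = mz - offset, offset = offset)
}

res <- shift_calibrate(mz, calibrant = 147.1, mzabs = 1)
print(res$offset)
```

In a scheme like this the applied offset can never exceed mzabs, so a correction of nearly 100 Da with mzabs=100 would mean the matched peaks were far from the intended calibrant.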
9
CAMERA / export pseudospectrum?
10
CAMERA / plot EIC error
Error in pks[, 1] : incorrect number of dimensions
This happens for certain pSpectra groups, only for the plotEICs function, not the plotPsSpectrum function.
11
XCMS / CentWave error
I am playing with XCMS v 1.27.4. I am having an issue with a dataset that I started working with. If I use the matchedFilter peak detection, the process goes smoothly. If I use the centWave method:
xset3<- xcmsSet(dataset, method = "centWave", ppm=25, peakwidth=c(2,40), snthresh=2, integrate=2, mzdiff=0.1, fitgauss=FALSE, verbose.columns=TRUE)
I receive this error message:
"
11-07-18_test_005501:
Detecting mass traces at 25 ppm ...
% finished: 0 10 20 30 40 50 60 Error in .local(object, ...) :
m/z sort assumption violated ! (scan 670, p 129, current 170.1777 (I=5.06), last 170.1777)
In addition: Warning messages:
1: closing unused connection 6 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
2: closing unused connection 5 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
3: closing unused connection 4 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
4: closing unused connection 3 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
"
where 11-07-18_test_005501 is the problem filename. When I use the nSlaves option, I end up with this message:
"
Error in checkForRemoteErrors(val) :
5 nodes produced errors; first error: m/z sort assumption violated ! (scan 670, p 129, current 170.1777 (I=5.06), last 170.1777)
"
and:
> traceback()
4: stop(count, " nodes produced errors; first error: ", firstmsg)
3: checkForRemoteErrors(val)
2: xcmsClusterApply(cl = snowclust, x = argList, fun = findPeaksPar)
1: xcmsSet(dataset, nSlaves = 4, method = "centWave", ppm = 25,
peakwidth = c(2, 40), snthresh = 2, integrate = 2, mzdiff = 0.1,
fitgauss = FALSE, verbose.columns = TRUE)
So apparently there is something about a few of these files that centWave does not like, but matchedFilter is OK with. These are cdf files produced by Markerlynx DataBridge from Waters raw files. I am using R 2.13.1. Has anyone seen this issue previously?
Thanks
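The message reports the current and last m/z both printing as 170.1777, which would be consistent with duplicate (or marginally out-of-order) m/z values within scan 670. That is an assumption on my part. A base R sketch of how one might check a single scan for this, and merge exact duplicates, is below; the scan values and the helper name repair_scan are invented for illustration.

```r
# Toy scan (invented): m/z 170.1777 appears twice, so the vector is not
# strictly increasing.
mz        <- c(150.0912, 170.1777, 170.1777, 180.2034)
intensity <- c(12.0, 5.06, 3.10, 8.40)

# TRUE when m/z is not strictly increasing (ties count as unsorted).
has_sort_problem <- is.unsorted(mz, strictly = TRUE)

# Hypothetical repair: take the sorted unique m/z values and sum the
# intensities of all points sharing each value.
repair_scan <- function(mz, intensity) {
  u <- sort(unique(mz))
  summed <- vapply(u, function(m) sum(intensity[mz == m]), numeric(1))
  data.frame(mz = u, intensity = summed)
}

fixed <- repair_scan(mz, intensity)
print(fixed)
```

Whether summing is the right way to merge duplicates depends on how the converter produced them; keeping the more intense point would be another reasonable choice.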
12
CAMERA / Rmpi dependence
Thanks,