Show Posts - cbroeckl

Topics - cbroeckl

XCMS / sequential addition of files to an xcms object

May 31, 2017, 08:45:12 AM

I would like to know whether there is a way to sequentially add files to an xcms object. For example, I am running XCMS, I have completed feature finding on 10 files, and I have grouped the features and applied retention time correction. I now want to add file 11 to this xcms object. From what I can tell, adding file 11 wipes all the grouping and retention time correction from the xcms object. Is this accurate, or is there a way to do this that I am not thinking of? Thanks, Corey

XCMS / sigma fitgauss question

November 07, 2013, 01:48:47 PM

I am using the centWave algorithm for peak detection, selecting the 'fitgauss=TRUE' option. I am curious to know how best to interpret the values output by fitgauss, in particular sigma. In the centWave algorithm, there is a parameter specifying the min and max peak width, in seconds - I usually use something like c(2,20) for UPLC. Looking at the gaussian sigma values returned has raised two questions for me:

1. Why are there peaks in the resultant xcms dataset for which the sigma value is larger - sometimes by an order of magnitude - than the max peakwidth setting?
2. How do I interpret sigma values of 'NA'?

I am guessing that the answer to the latter question is that the gaussian fit is bad, and no mu/sigma/h can be returned. Any insight is appreciated.

Corey
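To make the comparison between sigma and the peakwidth setting concrete, here is a minimal base-R sketch. It assumes that sigma is reported in scan units rather than seconds (an assumption, not something confirmed in this post), and the helper name sigma_to_width and the 1.5 s scan time are hypothetical. It only applies the standard Gaussian relation FWHM = 2*sqrt(2*ln 2)*sigma.

```r
# Hypothetical helper: convert a Gaussian sigma (ASSUMED to be in scan units)
# into approximate peak widths in seconds.
sigma_to_width <- function(sigma_scans, scan_time_s) {
  sigma_s <- sigma_scans * scan_time_s
  data.frame(
    sigma_s = sigma_s,
    fwhm_s  = 2 * sqrt(2 * log(2)) * sigma_s,  # full width at half maximum
    base_s  = 4 * sigma_s                      # mu +/- 2 sigma, roughly 95% of the area
  )
}

# An NA sigma simply propagates as NA instead of raising an error
widths <- sigma_to_width(c(3.93, NA), scan_time_s = 1.5)
print(widths)
```

If sigma really is in scans, a value that looks far larger than the peakwidth limit in seconds may only reflect the unit difference, so it is worth checking the scan interval before comparing the two.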

XCMS / group or peakTable problem

November 27, 2012, 11:54:55 AM

I am trying to use XCMS to perform peak detection on some authentic standards to get a nice clean spectrum.
Here is an example for the compound catechin:

stand1<-xcmsSet(filenames, nSlaves=2, method="centWave", ppm=25, peakwidth=c(1.5,15), mzdiff=0.01, fitgauss=TRUE, verbose.columns=TRUE)
xstand@peaks[which(xstand@peaks[,"mz"] > 291.086 & xstand@peaks[,"mz"] < 291.087),]

mz mzmin mzmax rt rtmin rtmax into intb maxo sn egauss mu sigma h f dppm scale
[1,] 291.0866 291.0862 291.0870 109.8970 106.897 112.896 1349308.18 1347901.04 338777.25 873 0.08150396 245.1626 3.932065 340105.13 4863 2 2
[2,] 291.0863 291.0818 291.0898 111.3708 107.112 117.395 59929.07 59922.21 15187.01 15186 0.11094711 244.9695 4.436207 13578.48 2527 4 -1
scpos scmin scmax lmin lmax sample
[1,] 244 242 246 49 63 51
[2,] -1 -1 -1 48 72 52

So I am getting the molecular ion that I know to be there in the feature list.

I next try to concatenate this xset (stand1) with a 'background' xset, group them, retention time correct, and regroup:

xstand<-c(xset1, stand1)
xset4 <- group(xstand, bw=2, minfrac=0.5, max = 50, mzwid=0.02)
xset5 <- retcor(xset4, method="loess", family = "gaussian", plottype = "mdevden", span=2, missing=round(length(dataset)*0.05, digits=0))
xset6 <- group(xset5, bw=1, minsamp=0, max= 1000, mzwid=0.02)
xset6

An "xcmsSet" object with 52 samples

Time range: -0.4-1330.2 seconds (0-22.2 minutes)
Mass range: 55.0173-1199.8205 m/z
Peaks: 70298 (about 1352 per sample)
Peak Groups: 397
Sample classes: Library_serumC8

Profile settings: method = bin
step = 0.1

Memory usage: 23.1 MB

This appears to have worked. However, when I generate the peak table and look for my peak of interest, it isn't there.

peaklist <- peakTable(xset6, value="into")
peaklist[which(peaklist[,"mz"] > 291.05 & peaklist[,"mz"] < 291.11),]

...returns an empty table.

The original peak of interest (the molecular ion in this case) is there:
xset6@peaks[which(xset6@peaks[,"mz"] > 291.086 & xset6@peaks[,"mz"] < 291.087),]

mz mzmin mzmax rt rtmin rtmax into intb maxo sn egauss mu sigma h f dppm scale
[1,] 291.0866 291.0862 291.0870 107.1962 104.1962 110.1952 1349308.18 1347901.04 338777.25 873 0.08150396 245.1626 3.932065 340105.13 4863 2 2
[2,] 291.0863 291.0818 291.0898 108.6867 104.4027 114.6857 59929.07 59922.21 15187.01 15186 0.11094711 244.9695 4.436207 13578.48 2527 4 -1
scpos scmin scmax lmin lmax sample
[1,] 244 242 246 49 63 51
[2,] -1 -1 -1 48 72 52

But it is not making it into the peak table, even though I have tried setting the minfrac or minsamp values to zero, or to some trivial non-zero value such as 0.000001. I could work around this by accessing the @peaks slot, but that seems like reinventing the wheel. And it isn't a mass accuracy grouping artifact, as there are no peak groups within several daltons of the 291.0866 peak in the xset6 peak table. Does anyone have any idea how I can get a peakTable which hasn't filtered out the 'rare' features?

Thanks
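One way to see which detected peaks never made it into a group is to compare the peak matrix against the group membership list. The sketch below uses mock stand-ins for xset6@peaks and xset6@groupidx, assuming groupidx is a list holding one vector of peak row indices per group; the values are invented for illustration and this is not a fix for the grouping itself.

```r
# Mock stand-ins for xset6@peaks and xset6@groupidx (ASSUMED structure:
# a list with one integer vector of peak row indices per group).
peaks <- cbind(mz     = c(291.0866, 291.0863, 150.0500, 150.0502),
               rt     = c(107.2, 108.7, 60.1, 60.3),
               sample = c(51, 52, 1, 2))
groupidx <- list(c(3L, 4L))

# Rows of the peak matrix that belong to no group
ungrouped <- setdiff(seq_len(nrow(peaks)), unlist(groupidx))
orphan_peaks <- peaks[ungrouped, , drop = FALSE]
print(orphan_peaks)
```

Listing the orphan peaks near 291.0866 would at least confirm that the feature is lost at the grouping step rather than when the peak table is built.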

XCMS / peak shape/symmetry?

August 21, 2012, 11:50:59 AM

Are there any functions in the XCMS peak picking steps to diagnose peak symmetry? I know that there are functions for fitting a gaussian to each detected peak as part of the area estimation. Can any peak shape values be returned? Thanks.
Corey
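For reference, a symmetry measure can be computed by hand from an extracted ion chromatogram. The sketch below is plain base R, not an XCMS function: the helper asymmetry_factor is hypothetical, it assumes a single peak whose flanks fall monotonically below the chosen height fraction, and it uses the common asymmetry factor (tail width divided by front width at 10% of apex height).

```r
# Hypothetical helper: asymmetry factor of a single chromatographic peak,
# measured at a fraction of the apex height (0.1 = 10% height).
# Assumes each flank is monotonic and drops below the requested level.
asymmetry_factor <- function(rt, intensity, frac = 0.1) {
  apex  <- which.max(intensity)
  level <- frac * intensity[apex]
  # Linear interpolation of the crossing time on each side of the apex
  front_idx <- 1:apex
  tail_idx  <- apex:length(rt)
  front <- approx(intensity[front_idx], rt[front_idx], xout = level)$y
  back  <- approx(intensity[tail_idx],  rt[tail_idx],  xout = level)$y
  (back - rt[apex]) / (rt[apex] - front)
}

rt <- seq(-10, 10, by = 0.5)
symmetric <- exp(-rt^2 / (2 * 2^2))                       # sigma = 2 on both sides
tailing   <- ifelse(rt < 0, exp(-rt^2 / (2 * 2^2)),       # sigma = 2 on the front
                            exp(-rt^2 / (2 * 4^2)))       # sigma = 4 on the tail

a_sym  <- asymmetry_factor(rt, symmetric)
a_tail <- asymmetry_factor(rt, tailing)
print(c(symmetric = a_sym, tailing = a_tail))
```

A value near 1 indicates a symmetric peak and values above 1 indicate tailing. The function returns NA when a flank never reaches the requested height.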

XCMS / retrieve an averaged spectrum

July 03, 2012, 04:15:11 PM

Hello XCMS gurus,

I am running some infusions of standard compounds and am trying to use the getSpec() function to collect an averaged spectrum for a range of about 30 seconds during which time my standard is eluting.

test<-xcmsRaw("120703_601.CDF", profstep=0.01)
test
spec<-getSpec(test, rtrange=c(45:90))
spec2<-spec[order(spec[,2], decreasing=TRUE),]
spec2[1:20,]

mz intensity
[1,] 132.10220 84482.79
[2,] 132.10226 84476.03
[3,] 132.10229 84468.90
[4,] 132.10216 84434.43
[5,] 132.10213 84392.98
[6,] 132.10240 84370.49
[7,] 132.10243 84337.78
[8,] 132.10246 84293.82
[9,] 132.10205 84216.37
[10,] 132.10262 83990.52
[11,] 132.10196 83972.10
[12,] 132.10170 83246.16
[13,] 86.09711 37450.49
[14,] 86.09715 37449.09
[15,] 86.09710 37446.60
[16,] 86.09720 37443.11
[17,] 86.09707 37436.68
[18,] 86.09725 37416.95
[19,] 86.09694 37371.84
[20,] 86.09693 37367.13

As you can see, I am not getting any averaging of the spectra unless the masses match exactly, though the documentation for getSpec suggests it can be used for averaging multiple scans. Am I missing a parameter setting somewhere (a ppm error, for example), or am I trying to do something the function wasn't designed for? Any advice is greatly appreciated!

Corey

P.S. As a secondary but relevant question: is there any way to subtract a background spectrum taken from a portion of the infusion before the compound elutes?
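The kind of averaging described above, where masses that agree within a tolerance are merged, can be sketched in base R as follows. This is not what getSpec does internally: the helper average_spectrum, the 0.01 bin width, and the mock values are all assumptions made for illustration.

```r
# Hypothetical helper: merge centroids pooled from several scans into one
# averaged spectrum by rounding m/z onto a fixed grid.
# bin_width is an ASSUMED tolerance, not an XCMS parameter.
average_spectrum <- function(mz, intensity, n_scans, bin_width = 0.01) {
  bin <- round(mz / bin_width)
  total <- tapply(intensity, bin, sum)
  data.frame(
    mz        = as.numeric(tapply(mz * intensity, bin, sum) / total),  # intensity-weighted m/z
    intensity = as.numeric(total) / n_scans                            # mean over all scans
  )
}

# Mock data: three scans pooled together; one ion appears in all three scans,
# the other in only two of them
mz        <- c(132.1022, 132.1023, 132.1021, 86.0971, 86.0972)
intensity <- c(100, 200, 300, 50, 70)
avg <- average_spectrum(mz, intensity, n_scans = 3)
print(avg)
```

Rounding onto a grid can split a peak that straddles a bin edge, so a ppm-based clustering would be more robust. The same binning applied to a pre-elution time window would give a background spectrum that could then be subtracted bin by bin.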

XCMS / XCMS issue on Linux cluster

March 30, 2012, 10:36:17 AM

I am trying for the first time to run XCMS on a cluster running linux:

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] xcms_1.26.1

loaded via a namespace (and not attached):
[1] tools_2.12.2

When I try to run XCMS on a cdf format Waters Q-TOF data file I get an error message which I haven't seen before.
> xset <- xcmsSet(filenames, method = "matchedFilter", fwhm = 8, max = 500, snthresh = 3,
step = 0.05, steps = 2, mzdiff = 0.05, index = FALSE, sleep = 0)

111101_TwinGene_0095b01: Error in if (del == 0 && to == 0) return(to) :
missing value where TRUE/FALSE needed

I have run the same data file using the same script on my Windows 7 desktop in the past without this error.

> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] xcms_1.30.3
>

I have tried to run mzXML formatted data through both, and that format works well on both the Linux and Windows systems. Obviously, there are differences in both the R version and the XCMS package versions. The problem is that I do not have control of the Linux cluster, so I was hoping to get some guidance on a potential resolution before requesting R and package upgrades. I have had problems with cdf files previously using centWave, but not matchedFilter - but on the Linux system, I get an error message using either method. Also worth noting - I am not using any of the parallel functions for this, just a single file/core. So I assume it is more likely a Unix/installation issue than a parallel/cluster issue. Any advice is appreciated.

CAMERA / Isotope annotation and grouping: approach question

November 16, 2011, 05:24:33 PM

The original publication describing CAMERA suggests that the isotope and adduct annotation is performed using a sliding retention time window, such that isotopes with non-identical retention times can be recognized for all features. In the user guides accessed in R by typing ?findIsotopes, the recommendation, for the sake of performance, is to first group the peaks into pseudospectra. If one first groups peaks using groupFWHM(), then performs isotope and adduct annotation, does the sliding window only apply within a grouped pseudospectrum?

If so, then there is a real possibility that two features which would be correlated using the validation tools within groupCorr might already be separated at that point, as the groupFWHM tool groups by retention time based on a window centered around an abundant feature. Does this sound correct?

Also, if a feature is assigned to a pseudospectrum by groupFWHM, and is removed from a pseudospectrum with groupCorr, is there any effort made to regroup the removed feature with other removed features within a range of retention times?

Just trying to understand the overall approach. Thanks.

XCMS / calibrate

September 29, 2011, 01:55:07 PM

I have some single quad GC-MS data, in which part of the dataset was run using one autotune and the remainder using a different autotune. As a result, the masses are shifted in half the dataset, oddly enough by a pretty large margin (which I can't explain) - about 0.4-0.5 Da. I looked into the XCMS options and found what I thought would be the savior for this dataset in the calibrate() function. However, I cannot get it to work for me. I am using a subset of the dataset: five files from tune one and five files from tune two. I use this code:

##peak detection: GC-MS peak
xset <- xcmsSet(dataset, nSlaves=4, method = "matchedFilter", fwhm = 8, max = 500, snthresh = 3, step = 0.5, steps = 2, mzdiff = 0.1, index = FALSE, sleep = 0)
xset

An "xcmsSet" object with 10 samples

Time range: 601.8-2482.9 seconds (10-41.4 minutes)
Mass range: 50.2301-649.5369 m/z
Peaks: 71753 (about 7175 per sample)
Peak Groups: 0
Sample classes: GC-MS_masscorrection

Profile settings: method = bin
step = 0.5

Memory usage: 8.38 MB

I then try to apply the m/z correction using the calibrate function using the command:
xset2<- calibrate(xset, calibrants=147.1, method="shift", mzabs=1, neighbours=3, plotres=TRUE)

Error in .local(object, ...) : No masses close enough!

I know for a fact that there are plenty of masses near this mass in the dataset, but I can't seem to find them. In fact, even if I expand the mzabs to 10, I receive the same message. If I expand the mzabs value to 100, it works, and actually does seem to shift all the masses by roughly 100 Da (note the mass range):

> xset2
An "xcmsSet" object with 10 samples

Time range: 601.8-2482.9 seconds (10-41.4 minutes)
Mass range: 144.0489-743.6364 m/z
Peaks: 71753 (about 7175 per sample)
Peak Groups: 0
Sample classes: GC-MS_masscorrection

Profile settings: method = bin
step = 0.5

Memory usage: 8.38 MB
>

Am I misusing 'calibrate' or misunderstanding it? What am I missing? Thanks,
Corey
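For comparison, the correction being attempted here (a constant m/z shift estimated from known calibrant masses) can be sketched in base R. This is not the xcms calibrate() implementation: the helper estimate_shift and the mock masses are hypothetical, and only the idea of matching calibrants within +/- mzabs is taken from the post.

```r
# Hypothetical helper, NOT the xcms implementation: estimate a constant m/z
# offset from the peak closest to each calibrant within +/- mzabs.
estimate_shift <- function(peak_mz, calibrants, mzabs) {
  offsets <- vapply(calibrants, function(cal) {
    d <- peak_mz - cal
    near <- d[abs(d) <= mzabs]
    if (length(near) == 0) NA_real_ else near[which.min(abs(near))]
  }, numeric(1))
  mean(offsets, na.rm = TRUE)  # NaN if no calibrant was matched at all
}

peak_mz   <- c(73.5, 147.5, 207.4, 281.5)  # mock masses shifted by about +0.4 Da
shift     <- estimate_shift(peak_mz, calibrants = c(73.1, 147.1), mzabs = 1)
corrected <- peak_mz - shift
print(shift)
print(corrected)
```

Running a manual estimate like this on the real peak list (for example the mz column of the @peaks slot) shows whether any peaks actually fall within mzabs of 147.1, independent of how calibrate() does its matching.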

CAMERA / export pseudospectrum?

August 17, 2011, 03:24:39 PM

Is it possible to export a text formatted pseudospectrum? I am interested in trying CAMERA's grouping functions as an alternative to AMDIS. Having a text export option would potentially allow CAMERA output to be integrated with NIST/custom database searches. Thanks.
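As an illustration of what such an export could look like, the following base-R sketch formats one pseudospectrum as a minimal NIST-style MSP text record. The helper format_msp and the example values are hypothetical. It assumes the m/z and intensity values for one pseudospectrum have already been pulled out of the CAMERA object, and it is not a CAMERA function.

```r
# Hypothetical helper: format one pseudospectrum (m/z and intensity vectors)
# as a minimal NIST-style MSP text record.
format_msp <- function(name, mz, intensity) {
  c(paste0("Name: ", name),
    paste0("Num Peaks: ", length(mz)),
    sprintf("%.4f %.0f", mz, intensity),
    "")                                   # blank line terminates the record
}

record <- format_msp("pspec_1", mz = c(73.0468, 147.0655), intensity = c(999, 420))
writeLines(record)  # pass a file path as the second argument to write to disk
```

Records for several pseudospectra can be concatenated into one character vector before writing, which gives a single library file for searching.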

CAMERA / plot EIC error

August 15, 2011, 04:03:23 PM

I am receiving an error frequently when trying to plot the EICs for selected pseudospectra:

Error in pks[, 1] : incorrect number of dimensions

This happens for certain pSpectra groups, only for the plotEICs function, not the plotPsSpectrum function.

XCMS / CentWave error

July 25, 2011, 12:48:33 PM

Hello users,

I am playing with XCMS v 1.27.4. I am having an issue with a dataset that I started working with. If I use the matchedFilter peak detection, the process goes smoothly. If I use the centWave method:

xset3<- xcmsSet(dataset, method = "centWave", ppm=25, peakwidth=c(2,40), snthresh=2, integrate=2, mzdiff=0.1, fitgauss=FALSE, verbose.columns=TRUE)

I receive this error message:

"
11-07-18_test_005501:
Detecting mass traces at 25 ppm ...
% finished: 0 10 20 30 40 50 60 Error in .local(object, ...) :
m/z sort assumption violated ! (scan 670, p 129, current 170.1777 (I=5.06), last 170.1777)
In addition: Warning messages:
1: closing unused connection 6 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
2: closing unused connection 5 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
3: closing unused connection 4 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
4: closing unused connection 3 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
"
where 11-07-18_test_005501 is the problem filename. When I use the nSlaves option, I end up with this message:

"
Error in checkForRemoteErrors(val) :
5 nodes produced errors; first error: m/z sort assumption violated ! (scan 670, p 129, current 170.1777 (I=5.06), last 170.1777)
"

and :
> traceback()
4: stop(count, " nodes produced errors; first error: ", firstmsg)
3: checkForRemoteErrors(val)
2: xcmsClusterApply(cl = snowclust, x = argList, fun = findPeaksPar)
1: xcmsSet(dataset, nSlaves = 4, method = "centWave", ppm = 25,
peakwidth = c(2, 40), snthresh = 2, integrate = 2, mzdiff = 0.1,
fitgauss = FALSE, verbose.columns = TRUE)

So apparently there is something about a few of these files that centWave does not like, but matchedFilter is OK with. These are cdf files produced by Markerlynx DataBridge from Waters raw files. I am using R 2.13.1. Has anyone seen this issue previously?

Thanks
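To find which scans trip this check, the m/z values of each scan can be tested for ordering directly. The sketch below uses mock scans in base R and assumes the scans are available as a list of m/z vectors. Because the error message prints 'current' equal to 'last', it treats repeated values as violations as well, which is an assumption about what centWave requires.

```r
# Mock scans: each element is the m/z vector of one scan (ASSUMED input layout).
scans <- list(c(100.1, 150.2, 170.1777, 200.3),
              c(100.1, 170.1777, 170.1777, 200.3),  # repeated m/z value
              c(100.1, 200.3, 150.2))               # out of order

# TRUE where consecutive m/z values fail to increase strictly
violates_sort <- vapply(scans, function(mz) any(diff(mz) <= 0), logical(1))
bad_scans <- which(violates_sort)
print(bad_scans)
```

If only a handful of scans are affected, that points to the file conversion step rather than to the centWave settings.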

CAMERA / Rmpi dependence

July 21, 2011, 04:25:58 PM

I am getting up to date on the current XCMS and CAMERA versions, and found that while the multicore function is presumably activated (nSlaves doesn't generate a 'not functional' message, as it did in older CAMERA versions), it generates an error message indicating that it depends on Rmpi, for which there is no 64-bit Windows version. Does this sound right, or am I missing something? Strangely enough, the nSlaves option works for XCMS.

Thanks,