Topics - cbroeckl

1
XCMS / sequential addition of files to an xcms object
I am interested to know whether there is a way to sequentially add files to an xcms object.  For example, I am running XCMS, I have completed feature finding on 10 files, and I have the features grouped with retention time correction.  I now want to add file 11 to this xcms object.  From what I can tell, adding file 11 means that all the grouping and retention time correction are wiped from the xcms object to enable addition of an 11th file.  Is this accurate, or is there a way to do this that I am not thinking of?  Thanks, Corey
2
XCMS / sigma fitgauss question
I am using the centWave algorithm for peak detection, selecting the 'fitgauss=TRUE' option.  I am curious to know how best to interpret the values output by fitgauss, in particular sigma.  In the centWave algorithm, there is a parameter specifying the min and max peak width, in seconds - I usually use something like c(2, 20) for UPLC.  Looking at the Gaussian sigma values returned has raised two questions for me:

1. Why are there peaks in the resultant xcms dataset for which the sigma value is larger - sometimes by an order of magnitude - than the max peakwidth setting?
2. How do I interpret sigma values of 'NA'?

I am guessing that the answer to the latter question is that the gaussian fit is bad, and no mu/sigma/h can be returned. Any insight is appreciated.

Corey
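
For reference, a minimal sketch of how a Gaussian sigma relates to a peak width, which may help when comparing sigma against the peakwidth setting. It assumes an ideal Gaussian peak. The function name and the scan_time argument are illustrative and not part of XCMS, and whether fitgauss reports sigma in seconds or in scan units is an assumption that should be checked against the data.

```r
# Convert a Gaussian sigma to full width at half maximum (FWHM) and to an
# approximate base width (4 * sigma spans about 95% of the peak area).
# scan_time is a hypothetical conversion factor (seconds per scan) for the
# case where sigma is expressed in scans; leave it at 1 otherwise.
sigma_to_width <- function(sigma, scan_time = 1) {
  fwhm <- 2 * sqrt(2 * log(2)) * sigma * scan_time
  base <- 4 * sigma * scan_time
  c(fwhm = fwhm, base = base)
}

sigma_to_width(3.93)
```

Under these assumptions the base width is roughly four times sigma, so a sigma that looks modest can still correspond to a peak wider than expected.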
3
XCMS / group or peakTable problem
I am trying to use XCMS to perform peak detection on some authentic standards to get a nice clean spectrum. 
Here is an example for the compound catechin:

stand1<-xcmsSet(filenames, nSlaves=2, method="centWave", ppm=25, peakwidth=c(1.5,15), mzdiff=0.01, fitgauss=TRUE, verbose.columns=TRUE)
xstand@peaks[which(xstand@peaks[,"mz"] > 291.086 & xstand@peaks[,"mz"] < 291.087),]

          mz    mzmin    mzmax      rt  rtmin  rtmax      into      intb      maxo    sn    egauss      mu    sigma        h    f dppm scale
[1,] 291.0866 291.0862 291.0870 109.8970 106.897 112.896 1349308.18 1347901.04 338777.25  873 0.08150396 245.1626 3.932065 340105.13 4863    2    2
[2,] 291.0863 291.0818 291.0898 111.3708 107.112 117.395  59929.07  59922.21  15187.01 15186 0.11094711 244.9695 4.436207  13578.48 2527    4    -1
    scpos scmin scmax lmin lmax sample
[1,]  244  242  246  49  63    51
[2,]    -1    -1    -1  48  72    52

So I am getting the molecular ion that I know to be there in the feature list. 

I next try to concatenate this xset (stand1) with a 'background' xset, group them, retention time correct, and regroup:

xstand<-c(xset1, stand1)
xset4 <- group(xstand, bw=2, minfrac=0.5, max = 50, mzwid=0.02)
xset5 <- retcor(xset4,  method="loess", family = "gaussian", plottype = "mdevden", span=2, missing=round(length(dataset)*0.05, digits=0))
xset6 <- group(xset5, bw=1, minsamp=0, max= 1000, mzwid=0.02)
xset6

An "xcmsSet" object with 52 samples

Time range: -0.4-1330.2 seconds (0-22.2 minutes)
Mass range: 55.0173-1199.8205 m/z
Peaks: 70298 (about 1352 per sample)
Peak Groups: 397
Sample classes: Library_serumC8

Profile settings: method = bin
                  step = 0.1

Memory usage: 23.1 MB

This appears to have worked; however, when I generate the peak table and look for my peak of interest, it isn't there.

peaklist <- peakTable(xset6, value="into")
peaklist[which(peaklist[,"mz"] > 291.05 & peaklist[,"mz"] < 291.11),]

...returns an empty table. 

The original peak of interest (the molecular ion in this case) is there:
xset6@peaks[which(xset6@peaks[,"mz"] > 291.086 & xset6@peaks[,"mz"] < 291.087),]

      mz    mzmin    mzmax      rt    rtmin    rtmax      into      intb      maxo    sn    egauss      mu    sigma        h    f dppm scale
[1,] 291.0866 291.0862 291.0870 107.1962 104.1962 110.1952 1349308.18 1347901.04 338777.25  873 0.08150396 245.1626 3.932065 340105.13 4863    2    2
[2,] 291.0863 291.0818 291.0898 108.6867 104.4027 114.6857  59929.07  59922.21  15187.01 15186 0.11094711 244.9695 4.436207  13578.48 2527    4    -1
    scpos scmin scmax lmin lmax sample
[1,]  244  242  246  49  63    51
[2,]    -1    -1    -1  48  72    52

But it is not making it into the peak table, even though I have tried setting the minfrac or minsamp values to zero, or to some trivial non-zero value such as 0.000001.  I could work around this and access the @peaks slot, but this seems to be reinventing the wheel.  And it isn't a mass accuracy grouping artifact, as there are no peak groups within several daltons of the 291.0866 peak in the xset6 peak table.  Does anyone have any idea how I can get a peakTable which hasn't filtered out the 'rare' features?

Thanks
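
To make the question concrete, here is a toy sketch of the kind of presence filter I suspect is dropping the feature. It rests on the assumption that a peak group is kept only when at least one sample class contributes peaks from at least minfrac of its samples and from at least minsamp samples. The function and argument names are made up for illustration, and this is not taken from the XCMS source.

```r
# Toy per-class presence filter for one candidate peak group.
# peak_classes: class label of each sample that contributed a peak.
# class_sizes:  named vector with the number of samples in each class.
keep_group <- function(peak_classes, class_sizes, minfrac = 0.5, minsamp = 1) {
  counts <- table(factor(peak_classes, levels = names(class_sizes)))
  counts <- as.numeric(counts)
  # Keep the group if any class passes both thresholds.
  any(counts >= class_sizes * minfrac & counts >= minsamp)
}

# One class of 52 samples; the peak is present in 2 of them.
sizes <- c(Library_serumC8 = 52)
keep_group(rep("Library_serumC8", 2), sizes, minfrac = 0.5)
keep_group(rep("Library_serumC8", 2), sizes, minfrac = 0)
```

If the real filter behaves like this toy version, a peak present in 2 of 52 samples of a single class would need minfrac at or below 2/52 in every group() call to survive.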
4
XCMS / peak shape/symmetry?
Are there any functions in the XCMS peak picking steps to diagnose peak symmetry?  I know that there are functions for fitting a Gaussian to a detected peak as part of the area estimation; can any peak shape values be returned?  Thanks.
Corey
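
As an illustration of the kind of value I have in mind, here is a minimal sketch of the conventional asymmetry factor (tail half-width divided by front half-width at 10% of apex height), computed outside XCMS on a single extracted peak. The function name is hypothetical, and the sketch assumes a clean peak that rises monotonically to the apex and falls monotonically after it.

```r
# Asymmetry factor As = b / a at a fraction of the apex height, where a and b
# are the front and tail half-widths measured from the apex retention time.
asymmetry <- function(rt, intensity, frac = 0.1) {
  apex  <- which.max(intensity)
  level <- frac * intensity[apex]
  n     <- length(rt)
  # Linearly interpolate the time at which each flank crosses the level.
  left  <- approx(intensity[1:apex], rt[1:apex], xout = level,
                  ties = "ordered")$y
  right <- approx(intensity[n:apex], rt[n:apex], xout = level,
                  ties = "ordered")$y
  (right - rt[apex]) / (rt[apex] - left)
}

rt   <- seq(0, 20, by = 0.1)
tail <- ifelse(rt < 10, dnorm(rt, 10, 1), 2 * dnorm(rt, 10, 2))  # tailing peak
asymmetry(rt, dnorm(rt, 10, 1))   # symmetric Gaussian, close to 1
asymmetry(rt, tail)               # tail twice as wide as the front, close to 2
```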
5
XCMS / retrieve an averaged spectrum
Hello XCMS gurus,

I am running some infusions of standard compounds and am trying to use the getSpec() function to collect an averaged spectrum for a range of about 30 seconds during which time my standard is eluting.

test<-xcmsRaw("120703_601.CDF", profstep=0.01)
test
spec<-getSpec(test, rtrange=c(45:90))
spec2<-spec[order(spec[,2], decreasing=TRUE),]
spec2[1:20,]

            mz intensity
 [1,] 132.10220  84482.79
 [2,] 132.10226  84476.03
 [3,] 132.10229  84468.90
 [4,] 132.10216  84434.43
 [5,] 132.10213  84392.98
 [6,] 132.10240  84370.49
 [7,] 132.10243  84337.78
 [8,] 132.10246  84293.82
 [9,] 132.10205  84216.37
[10,] 132.10262  83990.52
[11,] 132.10196  83972.10
[12,] 132.10170  83246.16
[13,]  86.09711  37450.49
[14,]  86.09715  37449.09
[15,]  86.09710  37446.60
[16,]  86.09720  37443.11
[17,]  86.09707  37436.68
[18,]  86.09725  37416.95
[19,]  86.09694  37371.84
[20,]  86.09693  37367.13

As you can see, I am not getting any averaging of the spectra unless the mass matches exactly, though the documentation for getScan suggests it can be used for averaging multiple scans.  Am I missing a parameter setting somewhere (a ppm error, for example), or am I trying to do something the function wasn't designed for?  Any advice is greatly appreciated!

Corey

P.S.  As a secondary but relevant question: is there any way to subtract a background spectrum taken from a portion of the infusion before the compound elutes?
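
In case it clarifies what I am after, here is a minimal base-R sketch of the merging I expected: m/z values within a ppm tolerance of their neighbour are collapsed into one row. The function name and the ppm argument are my own and not part of getSpec(); the merged intensity here is simply the mean of the cluster members.

```r
# Collapse rows whose m/z values lie within a ppm tolerance of the
# neighbouring value. spec: matrix with columns 'mz' and 'intensity'.
merge_mz <- function(spec, ppm = 25) {
  spec <- spec[order(spec[, "mz"]), , drop = FALSE]
  mz <- spec[, "mz"]
  # Start a new cluster whenever the gap to the previous m/z exceeds ppm.
  gap_ppm <- c(0, diff(mz) / mz[-1] * 1e6)
  cluster <- cumsum(gap_ppm > ppm) + 1
  # Intensity-weighted mean m/z and plain mean intensity per cluster.
  mz_out <- tapply(seq_along(mz), cluster, function(i)
    weighted.mean(mz[i], spec[i, "intensity"]))
  int_out <- tapply(spec[, "intensity"], cluster, mean)
  cbind(mz = as.numeric(mz_out), intensity = as.numeric(int_out))
}

spec <- cbind(mz = c(132.10220, 132.10226, 132.10229, 86.09711, 86.09715),
              intensity = c(84482.79, 84476.03, 84468.90, 37450.49, 37449.09))
merge_mz(spec, ppm = 25)
```

With the first few rows of the output above, this gives one row near 86.097 and one near 132.102.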
6
XCMS / XCMS issue on Linux cluster
I am trying for the first time to run XCMS on a cluster running linux: 

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8      LC_NUMERIC=C             
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8   
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8 
 [7] LC_PAPER=en_US.UTF-8      LC_NAME=C               
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C     

attached base packages:
[1] stats    graphics  grDevices utils    datasets  methods  base   

other attached packages:
[1] xcms_1.26.1

loaded via a namespace (and not attached):
[1] tools_2.12.2

When I try to run XCMS on a cdf format Waters Q-TOF data file I get an error message which I haven't seen before.
> xset <- xcmsSet(filenames, method = "matchedFilter", fwhm = 8, max = 500, snthresh = 3,
            step = 0.05, steps = 2, mzdiff = 0.05, index = FALSE, sleep = 0)

111101_TwinGene_0095b01: Error in if (del == 0 && to == 0) return(to) :
  missing value where TRUE/FALSE needed

I have run the same data file using the same script on my windows 7 desktop in the past. 

> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          LC_TIME=English_United States.1252   

attached base packages:
[1] stats    graphics  grDevices utils    datasets  methods  base   

other attached packages:
[1] xcms_1.30.3     
>

I have tried to run mzXML formatted data through both, and that format works well on both the Linux and Windows systems.  Obviously, there are differences in both the R version and the XCMS package versions.  The problem is that I do not have control of the Linux cluster, so I was hoping to get some guidance on a potential resolution before requesting R and package upgrades.  I have had problems with cdf files previously using centWave, but not matchedFilter - but on the Linux system, I get an error message using either method.  Also worth noting - I am not using any of the parallel functions for this - just a single file/core.  So I assume it is more likely a Unix/installation issue rather than a parallel/cluster issue.  Any advice is appreciated.
7
CAMERA / Isotope annotation and grouping: approach question
The original publication describing CAMERA suggests that the isotope and adduct annotation is performed using a sliding retention time window, such that isotopes with non-identical retention times can be recognized for all features.  In the user guides accessed in R by typing ?findIsotopes the recommendation, for the sake of performance, is to first group the peaks into pseudospectra.  If one first groups peaks using groupFWHM(), then performs isotope and adduct annotation, does the sliding window only apply within a grouped pseudospectrum?

If not, then there is a real possibility that two features that would be correlated using the validation tools within groupCorr might already be separated at that point, as the groupFWHM tool groups by retention time, centered around an abundant feature.  Does this sound correct?

Also, if a feature is assigned to a pseudospectrum by groupFWHM, and is removed from a pseudospectrum with groupCorr, is there any effort made to regroup the removed feature with other removed features within a range of retention times? 

Just trying to understand the overall approach.  Thanks.
8
XCMS / calibrate
I have some single quad GC-MS data, in which part of the dataset was run using one autotune and the remainder using a different autotune.  As a result, the masses are shifted in half the dataset.  Oddly enough, the shift is pretty large (which I can't explain) - about 0.4-0.5 Da.  I looked into the XCMS options and found what I thought would be the savior for this dataset, in the calibrate() function.  However, I cannot get it to work for me.  I am using a subset of the dataset, five files from tune one, five files from tune two.  I use this code:

##peak detection: GC-MS peak
xset <- xcmsSet(dataset, nSlaves=4, method = "matchedFilter", fwhm = 8, max = 500, snthresh = 3, step = 0.5, steps = 2, mzdiff = 0.1, index = FALSE, sleep = 0)
xset

An "xcmsSet" object with 10 samples

Time range: 601.8-2482.9 seconds (10-41.4 minutes)
Mass range: 50.2301-649.5369 m/z
Peaks: 71753 (about 7175 per sample)
Peak Groups: 0
Sample classes: GC-MS_masscorrection

Profile settings: method = bin
                  step = 0.5

Memory usage: 8.38 MB

I then try to apply the m/z correction using the calibrate function using the command:
xset2<- calibrate(xset, calibrants=147.1, method="shift", mzabs=1, neighbours=3, plotres=TRUE)

Error in .local(object, ...) : No masses close enough!

I know for a fact that there are plenty of masses near this mass in the dataset, but I can't seem to find them.  In fact, even if I expand the mzabs to 10, I receive the same message.  If I expand the mzabs value to 100, it works, and actually does seem to shift all the masses by about 100 Da (note the mass range).

> xset2
An "xcmsSet" object with 10 samples

Time range: 601.8-2482.9 seconds (10-41.4 minutes)
Mass range: 144.0489-743.6364 m/z
Peaks: 71753 (about 7175 per sample)
Peak Groups: 0
Sample classes: GC-MS_masscorrection

Profile settings: method = bin
                  step = 0.5

Memory usage: 8.38 MB
>

Am I misusing 'calibrate' or misunderstanding it?  What am I missing?  Thanks,
Corey
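
For clarity, this is the behaviour I was expecting from a "shift" correction, written as a generic base-R sketch for a single sample. It is not the algorithm used by calibrate(); the function name and the use of the median error are my own assumptions.

```r
# Estimate a constant m/z offset from peaks near a known calibrant mass
# and subtract that offset from every peak in the sample.
shift_mz <- function(mz, calibrant, mzabs = 1) {
  err  <- mz - calibrant
  near <- abs(err) <= mzabs
  if (!any(near)) stop("no peak within mzabs of the calibrant")
  # Use the median error of the nearby peaks as the offset estimate.
  offset <- median(err[near])
  mz - offset
}

observed <- c(73.5, 147.5, 207.5, 281.5)   # toy peak list shifted by +0.4 Da
shift_mz(observed, calibrant = 147.1, mzabs = 1)
```

In this toy case every mass moves by the 0.4 Da offset found at the calibrant, rather than by anything close to the mzabs window.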
9
CAMERA / export pseudospectrum?
Is it possible to export a text formatted pseudospectrum?  I am interested in trying CAMERA's grouping functions as an alternative to AMDIS.  Having a text export option would potentially allow CAMERA output to be integrated with NIST/custom database searches.  Thanks.
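
To show the sort of output I mean, here is a minimal sketch that writes one pseudospectrum as an MSP-style text block (Name, Num Peaks, then one "m/z intensity" pair per line). The function is hypothetical and not part of CAMERA, and the 'mz' and 'intensity' column names are assumptions about how the pseudospectrum has been pulled into a data frame.

```r
# Write one pseudospectrum to a minimal MSP-style text block.
# spec: data frame with columns 'mz' and 'intensity' (assumed names).
# con:  a connection or file name; defaults to printing on the console.
write_msp <- function(spec, name, con = stdout()) {
  lines <- c(
    paste0("Name: ", name),
    paste0("Num Peaks: ", nrow(spec)),
    sprintf("%.4f %.0f", spec$mz, spec$intensity),
    ""   # blank line separates entries when several are appended
  )
  writeLines(lines, con)
  invisible(lines)
}

ps <- data.frame(mz = c(73.047, 147.066, 221.084),
                 intensity = c(1200, 9800, 450))
write_msp(ps, name = "pseudospectrum_1")
```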
10
CAMERA / plot EIC error
I am receiving an error frequently when trying to plot the EICs for selected pseudospectra:

Error in pks[, 1] : incorrect number of dimensions

This happens for certain pSpectra groups, only for the plotEICs function, not the plotPsSpectrum function.
11
XCMS / CentWave error
Hello users,

I am playing with XCMS v 1.27.4.  I am having an issue with a dataset that I started working with.  If I use the matchedFilter peak detection, the process goes smoothly.  If I use the centWave method:

xset3<- xcmsSet(dataset, method = "centWave", ppm=25, peakwidth=c(2,40), snthresh=2, integrate=2, mzdiff=0.1, fitgauss=FALSE, verbose.columns=TRUE)

I receive an error message that :

"
11-07-18_test_005501:
 Detecting mass traces at 25 ppm ...
 % finished: 0 10 20 30 40 50 60 Error in .local(object, ...) :
  m/z sort assumption violated ! (scan 670, p 129, current 170.1777 (I=5.06), last 170.1777)
In addition: Warning messages:
1: closing unused connection 6 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
2: closing unused connection 5 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
3: closing unused connection 4 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
4: closing unused connection 3 (<-PMFLAB-04.cvmbs.ColoState.EDU:10187)
"
where 11-07-18_test_005501 is the problem filename.  When I use the nSlaves option, I end up with this message:

"
Error in checkForRemoteErrors(val) :
  5 nodes produced errors; first error: m/z sort assumption violated ! (scan 670, p 129, current 170.1777 (I=5.06), last 170.1777)
"

and :
> traceback()
4: stop(count, " nodes produced errors; first error: ", firstmsg)
3: checkForRemoteErrors(val)
2: xcmsClusterApply(cl = snowclust, x = argList, fun = findPeaksPar)
1: xcmsSet(dataset, nSlaves = 4, method = "centWave", ppm = 25,
      peakwidth = c(2, 40), snthresh = 2, integrate = 2, mzdiff = 0.1,
      fitgauss = FALSE, verbose.columns = TRUE)


So apparently there is something about a few of these files that centWave does not like, but matchedFilter is OK with.  These are cdf files produced by MarkerLynx DataBridge from Waters raw files.  I am using R 2.13.1.  Has anyone seen this issue previously?

Thanks
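
For anyone wanting to check their own files, here is a minimal sketch of the test I would run on the raw scans. It assumes the m/z values of each scan have already been read into a list of numeric vectors (how they are read is left out), and that centWave needs strictly increasing m/z within a scan, which is my reading of the error message rather than something I have confirmed. The function name is made up.

```r
# Return the indices of scans whose m/z values are not strictly increasing.
# scans: list of numeric vectors, one vector of m/z values per scan.
unsorted_scans <- function(scans) {
  bad <- vapply(scans,
                function(mz) is.unsorted(mz, strictly = TRUE),
                logical(1))
  which(bad)
}

scans <- list(c(100.1, 150.2, 170.1777),
              c(100.1, 170.1777, 170.1777),   # duplicated m/z
              c(120.0, 110.0))                # decreasing m/z
unsorted_scans(scans)
```

The second toy scan mimics the case in the error above, where the current and last m/z print as the same value.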
12
CAMERA / Rmpi dependence
I am getting up to date on the current XCMS and CAMERA versions, and found that while the multicore function is presumably activated (nSlaves doesn't generate a 'not functional' message, as it did in older CAMERA versions), it generates an error message indicating that it is dependent on Rmpi, for which there is no Windows 64-bit version.  Does this sound right or am I missing something?  Strangely enough, the nSlaves option works for XCMS.

Thanks,