I am trying to use XCMS to perform peak detection on some authentic standards to get a nice clean spectrum. Here is an example for the compound catechin:
The original peak of interest (the molecular ion in this case) is there:

xset6@peaks[which(xset6@peaks[,"mz"] > 291.086 & xset6@peaks[,"mz"] < 291.087),]
But it is not making it into the peak table, even though I have tried setting the minfrac or minsamp values to zero, or to some trivial non-zero value such as 0.000001. I could work around this by accessing the @peaks slot, but that seems like reinventing the wheel. And it isn't a mass-accuracy grouping artifact: there are no peak groups within several daltons of the 291.0866 peak in the xset6 peak table. Does anyone have any idea how I can get a peakTable that hasn't filtered out the 'rare' features?
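In the meantime, the unfiltered peaks can be pulled straight out of the @peaks slot and matched by an m/z window. A minimal sketch of that workaround (xset6 and the m/z bounds are from my session; adjust to your object and target mass):

```r
## Workaround: query the unfiltered @peaks slot directly instead of peakTable().
## 'xset6' is the xcmsSet object from my session; the m/z window brackets the
## catechin ion at m/z 291.0866.
pk <- xset6@peaks
hits <- pk[pk[, "mz"] > 291.086 & pk[, "mz"] < 291.087, , drop = FALSE]
## one row per detected peak, with the usual mz / rt / into columns
hits
```

drop = FALSE keeps the result a matrix even when only one peak matches, so downstream column indexing doesn't break.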
I had two files that generate this error with the new write.mzXML function. The two files correspond to two functions (MS and MSe) derived from the same Waters raw file, which was converted to two independent cdf files:
Error in if (is.unsorted(peaks[, "mz"])) { : missing value where TRUE/FALSE needed
"Beginning on October 2, 2012, with the release of Bioconductor 2.11, the way to use the development (devel) version of Bioconductor (2.12) is to install R-devel (R-2.16). Packages can then be installed normally; for example, this will install the devel version of IRanges and its dependencies:"
So I went and downloaded devel R, which is necessary for devel Bioconductor, and apparently one or the other is necessary for devel xcms, because now I seem to have the right version installed:
> library(Rcpp)
> library(mzR)
> library(xcms)
Error in eval(expr, envir, enclos) : could not find function ".getNamespace"
In addition: Warning message:
package ‘xcms’ was built under R version 2.16.0
Error : unable to load R code in package ‘xcms’
Error: package/namespace load failed for ‘xcms’
>
Same issue. This is a new R session. It doesn't sound like Jan had this problem, so it is either user error or platform-dependent.
Not quite. I actually must have been installing the latest stable version. I am still getting the same error when I install the devel version:
Loading required package: mzR
Loading required package: Rcpp
Error in eval(expr, envir, enclos) : could not find function ".getNamespace"
In addition: Warning message:
package ‘xcms’ was built under R version 2.16.0
Error : unable to load R code in package ‘xcms’
Error: package/namespace load failed for ‘xcms’
> library(xcms)
Loading required package: mzR
Loading required package: Rcpp
Error in eval(expr, envir, enclos) : could not find function ".getNamespace"
In addition: Warning message:
package ‘xcms’ was built under R version 2.16.0
Error : unable to load R code in package ‘xcms’
Error: package/namespace load failed for ‘xcms’
If I download via:

source("http://bioconductor.org/biocLite.R")
biocLite("xcms")
I get an older version that does not generate this error. I even uninstalled R and reinstalled R 2.15.1.
I figured out a way to modify my R script and run it from the Windows command prompt in batch. It is far from elegant, but it appears functional and requires little person-time (still lots of computer time; c'est la vie). To do so, open the command prompt and change the working directory to the R directory (the R folder should contain the executable Rscript.exe); in my case:
cd C:\Program Files\R\R-2.15.1\bin\x64
Next, create an R script (I called it convert1.R):
Save this script in the same R folder. One more step: create a 'batch' file called 'batch.bat' which contains this:

Rscript convert1.R
Rscript convert1.R
Rscript convert1.R
Rscript convert1.R
Rscript convert1.R
Rscript convert1.R
Rscript convert1.R
Rscript convert1.R
Rscript convert1.R
with the same line repeated for as many files as you have in your directory. The R script counts the number of converted .mzData files and adds one to that number, so it advances one file each time.
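For reference, a minimal sketch of what convert1.R can look like, based on the description above. The directory path is a placeholder, and the counting logic assumes the converted .mzData files land in the same directory as the cdf files:

```r
## convert1.R -- convert the "next" cdf file in the directory to mzData.
## Run once per invocation (a fresh R session each time) so memory is released.
library(xcms)

dir  <- "C:/data/cdf"   # placeholder: your data directory
cdfs <- list.files(dir, pattern = "\\.cdf$", full.names = TRUE,
                   ignore.case = TRUE)
done <- list.files(dir, pattern = "\\.mzData$", ignore.case = TRUE)

## count converted files and add one, so each run advances one file
i <- length(done) + 1
if (i <= length(cdfs)) {
  raw <- xcmsRaw(cdfs[i], profstep = 0)
  out <- sub("\\.cdf$", ".mzData", cdfs[i], ignore.case = TRUE)
  write.mzdata(raw, out)
}
```

Because each Rscript invocation starts a fresh R process that exits after one conversion, whatever memory the conversion holds is returned to the OS between files.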
Then at the command prompt, type batch.bat and hit enter.
I am running it now and it seems to have bypassed the memory leak issue, as R is closed after each conversion.
You mentioned earlier in this thread a 'corrupt file' phenomenon with the cdf format. That recommendation was what I was following in attempting to convert to mzData: if it is somehow corrupted in cdf form, it becomes uncorrupted in mzData form. Further, I tried (once) to use the write.cdf() function, and the rewritten cdf file also failed to work with centWave.
That being said, the write.mzdata() output is functional with centWave; I have tried it on many files successfully. And centWave never works on the cdf files the mzData was derived from. I have looked at the centWave output and I believe it is working properly on the mzData files converted by xcms::write.mzdata().
I was under the impression that the lockmass-fill function Paul was working on was more to correct for the signal gap, to provide better quantitative values in the XCMS output, rather than to enable functionality.
I am collecting Waters raw data and need to get it into a usable format for processing in XCMS. The Waters conversion tool can convert to cdf, but not much else. They do have tools (in proteomics packages) that can convert to mzXML, but those require profile data, and I have centroid data. As far as I can tell, there are no tools in the Waters packages to convert centroid data to anything other than cdf (with respect to XCMS compatibility; ASCII is the only other output format).
I am operating a Q-TOF, which requires a lock-mass correction. As Waters stores its raw data, it is not lockmass corrected; the correction is applied on the fly, either during conversion to cdf or for viewing in Waters software. I looked into ProteoWizard, which will convert to mzXML but does not perform the lock-mass correction (an issue they are working on with Waters). So the mzXML masses are off, and the accurate-mass information is gone.
I can use the cdf format with the matchedFilter algorithm, but the centWave algorithm does not work with cdf data as written by DataBridge, the Waters conversion tool. I want to use centWave for this application because (1) it is supposed to be better for accurate-mass data, and (2) it can give me Gaussian fit parameters for each peak, which aid downstream interpretation.
As an aside, we collect data using MSe acquisition, and a nice perk of the DataBridge conversion is that raw data files are split into independent data files for function 1 (MS), function 2 (MSe), and function 3 (lockmass). This means I can perform peak detection in XCMS on both the MS and MSe functions, a critical aspect of the workflow we are developing for mining indiscriminant MSMS data. I am trying to talk Waters into offering the option to split their raw files during conversion to mzXML in ProteoWizard as well, but it is slow going; in fact I have no idea if, much less when.
I feel like I am fighting data conversion problems at every turn. It is not the fault of the XCMS developers at all; I was just hoping to use the write.mzdata function to get all my cdf files into a format that centWave could read, so I could actually use the peak detection algorithm best suited to the data.
I viewed it as a continuation of this thread, or I would have started a new one.
I downloaded OpenMS to try it, and it doesn't seem to be able to read cdf files. I think I am stuck for the time being. I am trying to push Waters to help with the conversion options, but that is moving slowly. ProteoWizard has a tool, but it doesn't use the lockmass data, so mass accuracy is compromised.
I just realized, after starting a new batch, that I am actually running into huge memory consumption after fewer than 10 files; then it runs slowly for a long time before running out of memory.