Feed raw input data to XCMS? is now: deprofile.R

Hi,

a question: Say I have raw high-resolution .mzML files in profile mode. I don't want to convert them to centroid mode because I want to retain the additional profile information present (e.g. for fine isotope analysis later on). Now I can open this file in R using mzR, and I have written a function which computes a centroid-mode spectrum for every scan.

How can I provide the centroided data (generated directly in R) to XCMS in an xcmsRaw-like object, so I can run centWave on it?

(One reason why I want to do this is that the error from taking the local intensity maximum as the "centroid" m/z value can become quite large. The obtained m/z is much more accurate if I use e.g. an algorithm like the "Exact Mass" algorithm from MZmine, which takes the m/z at the center of the peak's full width at half maximum (FWHM).)
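To make the difference concrete, here is a minimal base-R sketch on a synthetic peak. It is an illustration only, not MZmine's implementation and not the deprofile.R code: the peak shape, grid spacing, and all variable names are made up, and the half-maximum crossings are found by simple linear interpolation.

```r
# Synthetic profile peak: Gaussian centred at a made-up "true" m/z,
# sampled on a coarse grid so that no grid point hits the true centre.
true_mz <- 200.0052
sigma   <- 0.004
mz      <- seq(199.98, 200.03, by = 0.003)
int     <- 1e5 * exp(-(mz - true_mz)^2 / (2 * sigma^2))

# Local-maximum centroid: m/z of the most intense sampled point.
apex   <- which.max(int)
mz_max <- mz[apex]

# FWHM centroid: find where each flank crosses half the apex intensity
# (linear interpolation between neighbouring points), then take the midpoint.
half  <- int[apex] / 2
left  <- max(which(int[1:apex] < half))                        # last point below half, left flank
right <- apex - 1 + min(which(int[apex:length(int)] < half))   # first point below half, right flank
cross <- function(i, j) {
  mz[i] + (half - int[i]) * (mz[j] - mz[i]) / (int[j] - int[i])
}
mz_fwhm <- (cross(left, left + 1) + cross(right - 1, right)) / 2

# Absolute error of each estimate in ppm
ppm <- function(x) abs(x - true_mz) / true_mz * 1e6
c(localMax = ppm(mz_max), fwhm = ppm(mz_fwhm))
```

On this made-up grid the local maximum is off by roughly 6 ppm, while the FWHM midpoint lands well under 1 ppm from the true value. Real peaks are noisier and less symmetric, so the gain will vary.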

Re: Feed raw input data to XCMS?

Reply #1
1.) (easy work-around) : pre-process your files, write them as mzXML, run centWave on the result.
2.) (good for the community) : implement your algorithm into XCMS, and we'll find a way to integrate it with centWave.
Ideally this would happen directly on the C-Level, as an alternative to the current
Code:
mzROI.c :
struct scanBuf * getScan(...)

That way it could do the centroidization on the fly.

Re: Feed raw input data to XCMS?

Reply #2
Quote from: "Ralf"
1.) (easy work-around) : pre-process your files, write them as mzXML, run centWave on the result.
2.) (good for the community) : implement your algorithm into XCMS, and we'll find a way to integrate it with centWave.
Ideally this would happen directly on the C-Level

Hi Ralf,

since I do not currently have the "otium" (is there an English word for that? German is "Musse" more or less) to accustom myself to Rcpp style, I did something in-between...
I coded the routine in R vector-operation style instead of using loops. It's not as fast as the original Java implementation or as Rcpp would be, but it's not terribly bad, and since the subsequent centWave takes much longer anyway, it's not a bottleneck for me.
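As an illustration of what "vector-operation style" means here, the following is a minimal sketch of loop-free local-maximum detection in base R. The helper name `local_maxima` and the toy intensities are hypothetical and are not taken from deprofile.R.

```r
# Hypothetical helper (not from deprofile.R): indices of local maxima,
# found without an explicit loop by comparing neighbouring slopes.
local_maxima <- function(int) {
  d <- diff(int)                               # slope between neighbouring points
  which(d[-length(d)] > 0 & d[-1] <= 0) + 1    # rising, then no longer rising
}

# Toy intensity vector with two peaks
int <- c(0, 2, 5, 3, 1, 0, 4, 9, 6, 0)
idx <- local_maxima(int)
int[idx]
```

The whole scan is processed with a few vector operations, which is usually much faster in R than an element-by-element loop, even if it does not reach compiled-code speed.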

I wrote the routine primarily for my own use, and it's not really tested or anything, but if anyone wants to use it, feel free to do with it whatever you want. Don't blame me if your computer explodes and buries all your valuable data never to be found again :)

deprofile.R: http://pastebin.com/rvPCEDg2

Usage:
Code:
# deprofile.R is the script linked above; this assumes it was saved to the working directory
library(xcms)
library(mzR)
source("deprofile.R")

# from xcmsRaw:
xraw <- xcmsRaw("myfile.mzML")
# by FWHM method:
xraw.sticked <- deprofile.xcmsRaw(xraw, copy=TRUE, method="deprofile.fwhm")
# local maximum: faster, but less accurate, especially for "low" resolution
# (alternative to the FWHM call above; running both overwrites xraw.sticked)
xraw.sticked <- deprofile.xcmsRaw(xraw, copy=TRUE, method="deprofile.localMax")
# scan 50 before and after centroiding:
scan.xraw.profile <- getScan(xraw, 50)
scan.xraw <- getScan(xraw.sticked, 50)
# alternatively directly from scan:
scan.xraw.direct <- deprofile.scan(scan.xraw.profile)

# this can be peakpicked:
xpeaks <- findPeaks(xraw.sticked, method="centWave", ppm=5, snthresh=10, noise=3000, prefilter=c(3,5000))

# the same scan from mzR:
mzrFile <- openMSfile("myfile.mzML")
acqNo <- xraw@acquisitionNum[[50]]
scan.mzML.profile <- mzR::peaks(mzrFile, acqNo)
scan.mzML <- deprofile.scan(scan.mzML.profile)
close(mzrFile)


(Why is the extension ".R" (or ".txt", for that matter) not allowed for attachments in an R-centered forum?  :D )

Re: Feed raw input data to XCMS? is now: deprofile.R

Reply #3
Quote
(Why is the extension ".R" (or ".txt", for that matter) not allowed for attachments in an R-centered forum? :D )

Fixed.

[attachment deleted by admin]

Re: Feed raw input data to XCMS? is now: deprofile.R

Reply #4
Quote
I coded the routine in R vector-operation style instead of using loops. It's not as fast as the original Java implementation or as Rcpp would be, but it's not terribly bad, and since the subsequent centWave takes much longer anyway, it's not a bottleneck for me.

Looks good. If you are interested in it, we can integrate this centroidization method into XCMS.
I have some suggestions to make it even faster. Details via PM.

Ralf

 

Re: Feed raw input data to XCMS? is now: deprofile.R

Reply #5
Sure, if you're interested in adding the function to XCMS, I think it would be a useful addition. If possible we should keep it extensible so that someone can add other algorithms (e.g. the cubic splines used by the OpenMS HiRes peak picker, PeakPickerHiRes; but that will probably need Rcpp, I can't imagine an easy and fast way to do that one in R...)
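One possible way to keep it extensible is to look the algorithm up by name in a list of functions, so new methods can be registered without touching the dispatcher. This is only a sketch of that idea: `centroiders`, `centroid_scan`, and the "weighted" method are hypothetical names invented for this example and are neither the deprofile.R interface nor anything in XCMS.

```r
# Hypothetical registry of centroiding algorithms, keyed by method name.
centroiders <- list(
  localMax = function(mz, int) {
    d <- diff(int)
    i <- which(d[-length(d)] > 0 & d[-1] <= 0) + 1   # local maxima
    cbind(mz = mz[i], intensity = int[i])
  }
)

# Dispatcher: picks the algorithm by name and fails clearly if it is unknown.
centroid_scan <- function(mz, int, method = "localMax") {
  f <- centroiders[[method]]
  if (is.null(f)) stop("unknown centroiding method: ", method)
  f(mz, int)
}

# Registering another (made-up) algorithm later: intensity-weighted mean
# of the apex and its two neighbours.
centroiders$weighted <- function(mz, int) {
  d <- diff(int)
  i <- which(d[-length(d)] > 0 & d[-1] <= 0) + 1
  w <- cbind(int[i - 1], int[i], int[i + 1])
  m <- cbind(mz[i - 1], mz[i], mz[i + 1])
  cbind(mz = rowSums(w * m) / rowSums(w), intensity = int[i])
}

# Toy scan with a single symmetric peak
mz  <- c(100.00, 100.01, 100.02, 100.03, 100.04)
int <- c(0, 10, 30, 10, 0)
centroid_scan(mz, int, method = "localMax")
centroid_scan(mz, int, method = "weighted")
```

A spline-based or Rcpp-backed method could be added the same way, as long as it takes `mz` and `int` vectors and returns a two-column matrix.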

But keep in mind that the function could use some more testing :)

Re: Feed raw input data to XCMS? is now: deprofile.R

Reply #6
Quote
If you are interested in it, we can integrate this centroidization method into XCMS.

I can only agree that it's worth including in XCMS. It made a big difference to our results (e.g. improved ranking of correct molecular formulas) even for a small test set, and anyone who has profile information available could thus get greater benefit from their data.