Get Spectra using xcmsSet object instead of xcmsRaw object

Hello all,

I am writing a script and need to generate several mass spectra based on a group of peaks that I have in an xcmsSet object.  As far as I can tell, however, there only exist methods for generating mass spectra using an xcmsRaw object.  I really would like to not have to create additional xcmsRaw objects if at all possible since it takes a LOT of time (especially when processing more than 4 samples and picking peaks using centWave).  So, does anyone know how to generate mass spectra using an xcmsSet object or am I out of luck?

Thanks,
Cole Wunderlich

Re: Get Spectra using xcmsSet object instead of xcmsRaw object

Reply #1
It really depends on what you want to achieve. The xcmsSet contains only the picked peaks, not every scan, so you cannot get a "real" spectrum. You could use the peak table to generate a kind of pseudo mass spectrum, but it would contain only picked peaks. Since you mention using centWave on xcmsRaw objects, maybe that is what you want?
Blog: stanstrup.github.io

Re: Get Spectra using xcmsSet object instead of xcmsRaw object

Reply #2
Unfortunately, I am interested in the actual spectra from the raw data (i.e. entire scans). I use centWave for generating my xcmsSet objects as well as for my xcmsRaw objects (although in this particular script I am trying to avoid having to create any xcmsRaw objects, if at all possible). I was under the impression that xcmsSet objects at least contain some information regarding the raw data, since I believe that information is used to generate EICs when calling getEIC(). I was hoping there might also be a way to use the same data source to generate mass spectra instead of EICs, but it sounds like that might not be the case...

Cole

Re: Get Spectra using xcmsSet object instead of xcmsRaw object

Reply #3
No. An xcmsSet object contains no information about the raw data, just the picked peaks and statistics on those. It only contains links to the original files.

getEIC re-reads the raw data from the files if used with an xcmsSet:
Quote
Generate multiple extracted ion chromatograms for m/z values of interest. For xcmsSet objects, reread original raw data and apply precomputed retention time correction, if applicable.
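
To make the "links to the original files" point concrete, here is a minimal sketch, assuming an existing xcmsSet called `xset` (a placeholder name). `filepaths()` is the xcms accessor for the stored paths; the helper names `raw_files_for` and `missing_raw_files` are made up for illustration.

```r
# Hypothetical helper: which raw files does an xcmsSet point to?
# xcms is only needed when the function is actually called.
raw_files_for <- function(xset) {
  xcms::filepaths(xset)
}

# Pure-R helper: report stored paths that no longer exist on disk.
# Methods that re-read the raw data need these files to still be there.
missing_raw_files <- function(paths) {
  paths[!file.exists(paths)]
}

# Usage (assumes `xset` exists in the session):
# paths <- raw_files_for(xset)
# missing_raw_files(paths)
```

If the files have been moved since peak picking, anything that re-reads the raw data would be expected to fail, so a check like this can save some confusion.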

Is it really that slow to make the xcmsRaw objects? Usually it is pretty fast... How big are your files? How many files do you need to do it on? Do you have enough memory?
Blog: stanstrup.github.io

Re: Get Spectra using xcmsSet object instead of xcmsRaw object

Reply #4
Quote from: "Jan Stanstrup"
Is it really that slow to make the xcmsRaw objects? Usually it is pretty fast... How big are your files? How many files do you need to do it on? Do you have enough memory?

Yeah, it's pretty darn slow. While I have never timed it, I'd say it definitely takes over an hour to process 8 samples. The getEIC() method is fairly slow as well. My files are in the mzXML format and tend to be around 660 MB apiece (after converting from profile to centroid). I'm using a reversed-phase column and a chromatographic method with a run time of about 60 min on a high-res TOF to generate the data. Memory and processing speed also shouldn't be a concern, since I'm working on a Windows box with 16 GB of RAM and an i7-3770 CPU (quad core running at 3.40 GHz per core). The number of files I need to process is variable, but usually the minimum is around 12 (3 control unlabeled, 3 control labeled, 3 mutant unlabeled, 3 mutant labeled, etc.).

Quote from: "Jan Stanstrup"
Quote
Generate multiple extracted ion chromatograms for m/z values of interest. For xcmsSet objects, reread original raw data and apply precomputed retention time correction, if applicable.
I forgot about that, good catch! I think what I might be looking for, then, is some way to read the raw data and generate mass spectra without having to go through the peak picking step involved in generating an xcmsRaw object (since, at the point I need the spectra in my script, the peak picking would be redundant).

Let me know if there is any other info I can provide you,
Cole

 

Re: Get Spectra using xcmsSet object instead of xcmsRaw object

Reply #5
Why do you do the peak picking again on the xcmsRaw object? You should be able to use getScan directly on the xcmsRaw object if all you want is a single scan.
Be careful about finding your scan if you use retention time correction.
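
A rough sketch of that idea, under a few assumptions: `xset` is an existing xcmsSet, its `@rt$corrected` slot holds one retention time per scan for each sample, and the set was built without a scan range restriction, so positions in that vector line up with scan indices in the file. `profstep = 0` is assumed here to skip building the profile matrix. The function names `nearest_scan` and `spectrum_at` are made up.

```r
# Pure-R helper (hypothetical name): index of the scan whose retention
# time is closest to `rt`, given the vector of scan times to search.
nearest_scan <- function(scantimes, rt) {
  which.min(abs(scantimes - rt))
}

# Sketch: full scan nearest to a *corrected* retention time in one sample.
spectrum_at <- function(xset, sample_idx, rt) {
  # Read the raw file only; no peak picking happens here.
  xr <- xcms::xcmsRaw(xcms::filepaths(xset)[sample_idx], profstep = 0)
  # Look the scan up on the corrected time axis, not the raw one.
  scan <- nearest_scan(xset@rt$corrected[[sample_idx]], rt)
  xcms::getScan(xr, scan)  # matrix of m/z and intensity values
}

# Usage (assumes `xset` exists): spectrum_at(xset, sample_idx = 1, rt = 1234.5)
```

Looking the scan up in the corrected times is what guards against the retention time correction pitfall: a peak's reported retention time after correction may not correspond to the same scan on the raw time axis.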

Some ideas:
* If the process is CPU limited you could speed it up by running it in parallel using the parApply family of functions (e.g. parLapply from the parallel package).
* If you can guess approximately which scan you need you can probably speed it up a lot by not reading the whole file. In other words, use the scanrange argument of xcmsRaw.
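
Both ideas could be combined roughly as follows. This is only a sketch: the helper names are invented, `profstep = 0` is assumed to skip the profile matrix, and whether `scanrange` actually avoids reading the rest of the file may depend on the xcms version, so it is worth timing on real data.

```r
# Pure-R helper (hypothetical): scan index window around a target scan,
# clamped to the valid range 1..n_scans.
scan_window <- function(center, half_width, n_scans) {
  c(max(1, center - half_width), min(n_scans, center + half_width))
}

# Worker: open one file restricted to a window of scans.
read_window <- function(file, scanrange) {
  xcms::xcmsRaw(file, profstep = 0, scanrange = scanrange)
}

# Sketch: read the same window from several files, one worker process each.
read_windows_parallel <- function(files, scanrange, cores = 2L) {
  cl <- parallel::makeCluster(cores)
  on.exit(parallel::stopCluster(cl))
  parallel::parLapply(cl, files, read_window, scanrange = scanrange)
}

# Usage (assumes `files` is a character vector of mzXML paths):
# raws <- read_windows_parallel(files, scan_window(1500, 25, 3000), cores = 4L)
```

Each worker is a separate R process, so the returned objects are copied back to the main session. With large files that copy costs time and memory, and it may be better to do the per-file work inside the worker and return only the spectra you need. After restricting the scan range, it is safer to look scans up via the object's scan times than to assume the original scan indices still apply.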


Blog: stanstrup.github.io

Re: Get Spectra using xcmsSet object instead of xcmsRaw object

Reply #6
Quote from: "Jan Stanstrup"
Why do you do the peak picking again on the xcmsRaw object?
I don't, that is what I am trying to avoid.  I do peak picking when I generate the xcmsSet object and now want to be able to generate MS without having to instantiate any more objects.

Cole

Re: Get Spectra using xcmsSet object instead of xcmsRaw object

Reply #7
I ended up rewriting some of the source code so that xcmsSet objects now retain the xcmsRaw objects used to generate them. I have found this highly useful because almost every method involving xcmsSet objects seems to want to regenerate the xcmsRaw objects (e.g. retcor.obiwarp(), fillPeaks(), getEIC()). By keeping the raw objects in memory and rewriting the aforementioned functions to operate on them (instead of generating new ones every time), I have significantly reduced the run time of my code. Perhaps the devs might consider implementing this functionality as an option for the current xcmsSet class.
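
For readers who want part of this benefit without patching xcms, a lighter-weight alternative is to cache raw objects in your own script. The sketch below is not the modification described above (which changes the xcmsSet class itself); it is a generic memoizing loader with made-up names, and xcms functions such as getEIC() would not know about it.

```r
# Sketch of a per-session cache of raw objects, keyed by file path.
# `loader` defaults to reading with xcms, but any function(file) works.
make_raw_cache <- function(loader = function(f) xcms::xcmsRaw(f, profstep = 0)) {
  cache <- new.env(parent = emptyenv())
  function(file) {
    # Load each file at most once; later calls reuse the stored object.
    if (!exists(file, envir = cache, inherits = FALSE)) {
      assign(file, loader(file), envir = cache)
    }
    get(file, envir = cache, inherits = FALSE)
  }
}

# Usage (assumes `xset` exists):
# get_raw <- make_raw_cache()
# xr <- get_raw(xcms::filepaths(xset)[1])  # slow the first time
# xr <- get_raw(xcms::filepaths(xset)[1])  # reused from memory
```

The trade-off is the one raised in the next reply: every cached file stays in memory until the cache is dropped.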

Re: Get Spectra using xcmsSet object instead of xcmsRaw object

Reply #8
You can try a pull request: https://github.com/sneumann/xcms

It is interesting as an option. But with the number of samples most people deal with, many would run out of memory very fast...
Blog: stanstrup.github.io

Re: Get Spectra using xcmsSet object instead of xcmsRaw object

Reply #9
True, but if you split the analysis up into batches I think the performance gains would probably be worth it. The only case where it wouldn't work is probably RT alignment, since you need all the data loaded at once, but then you could just convert back to a regular xcmsSet object. I might submit a pull request if I have time to convert the rest of the code; right now I have only set it up to work for the select methods I require.
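
As an illustration of the batching idea, the file list could be split like this. The function name and the batch size are arbitrary, and the per-batch work is left as a placeholder.

```r
# Split a vector of file paths into consecutive batches of at most `size`.
batch_files <- function(files, size) {
  unname(split(files, ceiling(seq_along(files) / size)))
}

# Usage (assumes `files` is a character vector of raw file paths):
# for (batch in batch_files(files, 4)) {
#   raws <- lapply(batch, function(f) xcms::xcmsRaw(f, profstep = 0))
#   # ... extract the spectra needed from this batch ...
#   rm(raws); gc()  # release memory before the next batch
# }
```

With 12 files and a batch size of 4, only four raw objects are held in memory at any one time.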

Cole