Reduce file size

This tip was brought to you by Tony Larson.

I would add that apart from going to Linux to use yet more memory (which you must have physically available!), you can also do other things to reduce the size of your xcmsSet and thus save memory and increase processing speed.

The easiest solution is to reduce the size of your .cdf, .mzXML or .mzData input files. You can do this with the tools in OpenMS, which can be downloaded and easily installed under 32-bit Windows (Linux installation is also possible but tricky). Once installed, OpenMS lets you trim source .cdf, .mzXML or .mzData files to a specific start and stop time, reduce the m/z scan range, or even re-sample the data to fewer scans per unit time. These operations can all be run as DOS batch files on your dataset. Be aware, however, that OpenMS cannot read .cdf files in a 64-bit OS environment (Windows or Linux), so if you need to process .cdf files you must install OpenMS in a 32-bit environment, which is the standard Windows OS. In xcms itself, there is very limited ability to filter data from a source file before processing: as far as I know, this is limited to specifying a scanrange if you use the centWave method in xcmsSet().
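As a rough illustration of the batch approach, here is a minimal Python sketch that builds one trimming command per input file. It is not from the original tip. It assumes the OpenMS command-line tool is called FileFilter and accepts -in, -out, -rt (min:max, assumed to be in seconds) and -mz (min:max); check `FileFilter --help` on your installed version before relying on these flags. The function name, file names and ranges are hypothetical, and nothing is executed: the script only prints the command lines.

```python
from pathlib import PurePath


def build_filter_commands(files, output_dir, rt_range, mz_range):
    """Build one command (as an argument list) per input file.

    ASSUMPTION: the tool name "FileFilter" and the flags -in/-out/-rt/-mz
    reflect the OpenMS TOPP tool as I understand it; verify them against
    your own OpenMS version. The ranges are (start, stop) pairs.
    """
    for name, (start, stop) in (("rt", rt_range), ("mz", mz_range)):
        if start >= stop:
            raise ValueError(f"{name} range must have start < stop")

    commands = []
    for f in files:
        # Keep the original file name, but write into the output directory
        # so the untrimmed source files are never overwritten.
        out_path = PurePath(output_dir) / PurePath(f).name
        commands.append([
            "FileFilter",
            "-in", str(f),
            "-out", str(out_path),
            "-rt", f"{rt_range[0]}:{rt_range[1]}",
            "-mz", f"{mz_range[0]}:{mz_range[1]}",
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical inputs: trim to 100-600 s and m/z 100-1000.
    cmds = build_filter_commands(
        ["raw/sample01.mzXML", "raw/sample02.mzXML"],
        "trimmed",
        rt_range=(100, 600),
        mz_range=(100, 1000),
    )
    for cmd in cmds:
        print(" ".join(cmd))
```

The printed lines could be pasted into a batch file, or each argument list could be passed to `subprocess.run(cmd, check=True)` once you have confirmed the flags. Writing to a separate output directory is a deliberate choice, so that the trimming can be redone with different ranges later.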

Of course, all these parameters should ideally be optimized when you first acquire the data from your instrument - but often the benefit of hindsight reveals that some post-acquisition data reduction would be beneficial.

Finally, be aware that reducing the input file size means each file takes up less memory in R, and hence allows you to process more input files. However, if you process a very large number of input files you will eventually hit the ~3 GB hard limit in Windows, especially when you come to use xcms functions like fillPeaks(), where the entire data matrix is loaded into memory. At that point Linux is the only option, and it should be running on a system with much more RAM than a typical Windows system.
H. Paul Benton
Scripps Research Institute