I'm also not familiar with SRM/MRM data. But we have implemented a readSRMData function in MSnbase to read chromatographic data from mzML files. It would be nice to know if that works for you and what is missing. Note also that xcms can now perform peak detection directly on purely chromatographic data (i.e. on the Chromatogram/Chromatograms objects that readSRMData returns).
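A minimal sketch of that workflow, assuming a current MSnbase/xcms from Bioconductor (the file name and the peakwidth setting are placeholders, not recommendations):

```r
## Sketch only: read SRM/MRM chromatograms and run peak detection on them.
## "mrm_file.mzML" is a placeholder for your own converted data file.
library(MSnbase)
library(xcms)

chrs <- readSRMData("mrm_file.mzML")   # MChromatograms, one row per transition

## centWave-based peak detection on chromatographic data:
pks <- findChromPeaks(chrs, param = CentWaveParam(peakwidth = c(2, 10)))
chromPeaks(pks)                        # rt, rtmin, rtmax, into, maxo, sn, ...
```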
Last post by Pauline -
I used the MZmine2 software to analyze GC-MS data and NIST MS Search to identify the peak list.
But I found that using the same parameter settings, I got different results.
My parameters are set as follows:
Spectrum RT tolerance: 3%
Max. peaks per spectrum: 10
Must have same identities: yes
Min. match factor: 700
Min. reverse match factor: 700
But I am not sure whether the settings for Spectrum RT tolerance and Max. peaks per spectrum are correct.
What references can I refer to?
Thank you very much for your help.
Last post by CoreyG -
I'll take a look at making the changes and generating a pull request (never done one before).
No problem regarding xcms centWave. I don't think I'd be brave enough to suggest changes there.
I've used Skyline (https://skyline.ms/project/home/software/Skyline/begin.view) a fair bit in the past for metabolomics SRM data analysis. You can export peak areas, retention times (apex, start of integration, end of integration) and FWHM fairly easily (using the document grid). I don't know if it generates a metric for noise (or S/N)...
The hardest part when getting started is that you have to manually specify which transitions you want to look at (Edit -> Insert -> Transition list). Skyline won't read the transition list from a file automatically.
I am looking for free/open source software that can process files from SRM experiments (converted to an open format) and provide parameters such as peak start/end at x% height, fwhm, noise etc. Does anyone have any suggestions?
Thanks for sharing your ideas. If you would like some changes in xcms, I would however appreciate it if you opened an issue at the xcms GitHub repository, as this enables me to keep track of what to do (and what was done). A pull request would actually be even better!
Regarding the code in xcms centWave: I did not write that code and I am veeerrry hesitant to change anything within it, as this will affect all xcms users.
thanks again, jo
Last post by CoreyG -
Thanks for looking into and fixing the error - very much appreciated.
I was quite intrigued that you said fillChromPeaks always uses the adjusted retention time, so I looked a bit deeper into the code.
It seems fairly simple to allow the ability to integrate using the original rt range.
If getChromPeakData took another parameter to select whether to switch back to the unadjusted rt range, you could store the unadjusted rtime, figure out which indices of rtim correspond to rtmin and rtmax, and then use those indices to get the original rt values for the calculation of res[, "into"].
The relevant function signature starts with: .getChromPeakData <- function(object, peakArea, sample_idx, ...)
However, this highlighted something else in the code that felt odd to me (again, I'm making a lot of assumptions).
By using 'rtr[2] - rtr[1]' in the calculation of "into", don't we always end up overestimating the area of the peak?
rtr comes from the medians of other samples, but getChromPeakData integrates over the scans found between these limits. So the rt range it actually integrates over is notionally smaller than rtr[2] - rtr[1]. In the example above, rtrange is indeed smaller (with unadjusted = FALSE).
Could we iterate over peakArea and calculate new rtmin and rtmax based on the actual rtime?
I'm not sure how centWave integrates peaks and how the rtmin and rtmax are chosen. So maybe this doesn't make sense...
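To make the point above concrete, here is a tiny base-R illustration (all values made up, not xcms code) of mapping a median-derived rt range onto the scan times actually present in one sample:

```r
## Hypothetical example: 'rtime' is one sample's retention times,
## 'rtr' the rtmin/rtmax derived from the medians of other samples.
rtime <- c(10.1, 10.6, 11.2, 11.8, 12.3)
rtr   <- c(10.4, 12.0)

idx       <- which(rtime >= rtr[1] & rtime <= rtr[2])
rt_actual <- range(rtime[idx])   # range actually covered by scans: 10.6-11.8

## The actually-integrated range is narrower than the requested one:
diff(rt_actual) < (rtr[2] - rtr[1])   # TRUE
```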
Last post by metabolon1 -
johannes.rainer, I tried your suggested code. It seems that my largest files are ~150MB. Using this number to make a conservative estimate, I would need about 12GB of RAM to hold 78 raw data files in memory, which is well below the 350+ GB on the server. But as you said, there are also other processes/objects that need RAM.
CoreyG mentioned that 4 threads keeps the RAM below 16 GB on a low-spec desktop. So that is roughly 4 GB per core, which is much higher than the ~150 MB per file estimated above. But also, you mentioned that fillChromPeaks is the most memory-intensive process in your script, requiring a limit of 1 thread on a 32 GB system. I've also noticed that fillChromPeaks is the step that takes the longest.
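The back-of-envelope numbers above, written out (assuming on-disk size roughly equals in-memory size, which is optimistic since the in-memory representation can be larger):

```r
## Rough RAM estimate for holding all raw files in memory at once.
file_mb  <- 150
n_files  <- 78
total_gb <- file_mb * n_files / 1024
total_gb                 # ~11.4 GB for the raw data alone

## Per-worker footprint observed by CoreyG: ~4 GB/core on a 16 GB desktop.
workers  <- 4
workers * 4              # ~16 GB when 4 workers each hold their own data
```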
It does seem like I'll need to do some optimization on our system. Any advice on how to determine how stable the system is with a given number of cores? I don't mind starting with a large # of available cores and working down, as CoreyG suggests. However, for various reasons, I do not want to test stability by whether or not it crashes the server. Do you think that using the 'ulimit' approach to limit RAM would help to prevent the server from crashing? Can an I/O bottleneck cause a crash?
Perhaps I could work through the script step by step, optimizing the number of cores at each step before moving on to the next...
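Regarding the ulimit idea, one hedged way to try it (the limit value and the script name are placeholders, not recommendations) is to cap the R session's virtual memory before launching it, so a runaway step fails with an allocation error instead of taking down the server:

```shell
# Cap virtual memory for this shell and its children (value in kB).
# 33554432 kB = 32 GB is only an example value.
ulimit -v 33554432
R --no-save -f my_xcms_script.R   # placeholder script name
```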
Thanks for clarifying!
Regarding your concern: fillChromPeaks will always use the adjusted retention times if retention time adjustment has been performed. So the results should be the same, with or without applyAdjustedRtime.
Regarding the error: I fixed that. You can install the updated version from github:
For the developmental version you are using:
devtools::install_github("sneumann/xcms")
For the Bioconductor 3.8 release (R-3.5.x):
devtools::install_github("sneumann/xcms", ref = "RELEASE_3_8")