Metabolomics Society Forum

Scientific themes => Compound identification => Topic started by: cvdlest on June 26, 2012, 08:26:02 AM

Title: Analyzing SRM / MRM data
Post by: cvdlest on June 26, 2012, 08:26:02 AM
In our lab SRM (=MRM) analysis is frequently used to quantify multiple components by LC/MS.


To use SRM data in combination with xcms a number of problems arise, since the data-structure of SRM (mzXML) files is not suitable for the native xcms software.

In each record of a SRM mzXML file the following information is stored.

Some acquisition information
RetentionTime
PrecursorMZ
basePeakMZ  (fragment ion), and its intensity

See an example below:
<scan num="13767"
  msLevel="2"
  peaksCount="1"
  polarity="+"
  scanType="MRM"
  retentionTime="PT828.196S"
  lowMz="263.2"
  highMz="263.2"
  basePeakMz="263.2"
  basePeakIntensity="50"
  totIonCurrent="49"
  collisionEnergy="32" >
  <precursorMz precursorIntensity="0" >599.5</precursorMz>
  <peaks precision="32"
    byteOrder="network"
    pairOrder="m/z-int" >Q4OZmkJIAAA=</peaks>
  </scan>

The next SRM transition will have its own retention time, precursorMZ and basePeakMZ.
Etc.

Xcms analysis of such data gives rise to three problems:

1. These data files contain no msLevel=”1” data!
2. Xcms analysis is based on the precursorMZ, and there can be multiple SRM transitions with an identical precursorMZ value
3. Each SRM transition is measured at a certain retention time the next a little latter etc. This will produce gaps between two successive specific SRMs.
xcms expects full scans at successive retentions times. eg, if a method contains 20 SRM transitions, and a peak appears in the first SRM transition. The a signal will be detected at that time point, after that 19 time point will generated with no signal.

To overcome these problems I altered the ramp.R script in the xcms source!
The main points are:
1. Determine the cycle-time (CycleT)
CycleT <- rtdiff$values[which(rtdiff$length == max(rtdiff$length))]
# CycleT is the most occurring time difference between two equal MRM-transitions
# CycleT <- quantile(rt[mzmaxscans][2:(length(mzmaxscans))]-rt[mzmaxscans][1:(length(mzmaxscans)-1)])[3] # gives the same result
2. Index the precursorMZ to an unique mz
   mz <- floor(scanHeaders$precursorMZ[scans])+ floor(sipeaks$mz)/1000 + floor(scanHeaders$collisionEnergy[scans])/1000000
          # mz <- Q1 mass (without decimals) then at the decimal place the fragment mass, followed by the collision energy (Just to be sure it is unique).
          # Thus;  mz =  Q1.Q3CE  (I assumed that Q3 is always smaller that 1000 m/z)
3. Transform the retention times:
        rt = floor(rt/CycleT)*CycleT+(CycleT/2) 
          # Each SRM cycle must be performed on a single acquisition time
          # consequently the peaks are shifted by maximally CycleT/2 seconds
          # we have to take this for granted!
4. Sort the data by RT and MZ, and create the new correct data-structure
5. return the data


Below the code is listed. I also attached the ramp.R script file.
## begin added code
    # for some reason the "MRM" factor ended up in the precursorIntensity column in my version of xcms (it should be in the scantype column) .
    # I also added "SRM" but I don't know if this is necessary
    if ("MRM" %in% levels(scanHeaders$precursorIntensity) | "MRM" %in% levels(scanHeaders$scanType) | "SRM" %in% levels(scanHeaders$precursorIntensity) | "SRM" %in% levels(scanHeaders$scanType)) { 
   
    # I did the same as the original ramp (work around buggy RAMP indexing code)
        scansMRM <- ( scanHeaders$msLevel >= 2 & scanHeaders$seqNum > 0
              & !duplicated(scanHeaders$acquisitionNum)
              & scanHeaders$peaksCount > 0) 
        scans <- which(scansMRM)
       
        sipeaks <- rampSIPeaks(rampid, scans, scanHeaders$peaksCount[scans])
       
        # Since multiple Q1 values can be present in a MRM transition table, unique Q1 -values are created which correspond to the MRM transition
        # mz <- Q1 mass (without decimals) then at the decimal place the fragment mass, followed by the collision energy (Just to be sure it is unique).
        # Thus;  mz =  Q1.Q3CE  (I assumed that Q3 is always smaller that 1000 m/z)
        mz <- floor(scanHeaders$precursorMZ[scans])+ floor(sipeaks$mz)/1000 + floor(scanHeaders$collisionEnergy[scans])/1000000
           
        rt <- scanHeaders$retentionTime[scans] 
        datasize <- length(rt)
       
        # we have to determine what the cycletime is. For that I used the mz with the higest peak intensity. Here we can expact a good signal!
        mzmaxI <- factor(mz)[which(sipeaks$intensity==max(sipeaks$intensity))]
        mzmaxscans <- which(mz==mzmaxI) #These scan numbers contain the scans of the MRM-transition with the highest intensity
        rtdiff <- rle(sort(rt[mzmaxscans][2:(length(mzmaxscans))]-rt[mzmaxscans][1:(length(mzmaxscans)-1)])) #The time difference between two successive scans
        CycleT <- rtdiff$values[which(rtdiff$length == max(rtdiff$length))] # CycleT is the most occurring time difference between two equal MRM-transitions 
       
        # CycleT <- quantile(rt[mzmaxscans][2:(length(mzmaxscans))]-rt[mzmaxscans][1:(length(mzmaxscans)-1)])[3]  # gives the same result
 
        rtmzdRT <-data.frame(cbind(
                  rt = floor(rt/CycleT)*CycleT+(CycleT/2),  # Each MRM cycle should be should be performed on a single acquisition time point
                    # consequently the peaks are shifted by maximally CycleT/2 seconds
                    # we have to take this for granted!
                    mz = mz, 
                    int = scanHeaders$basePeakIntensity[scans],               
                    peaksCount=scanHeaders$peaksCount[scans],
                    intensity = sipeaks$intensity,
                    polarity = scanHeaders$polarity[scans],
                    dRT = 1))
        rtmzdRT <- rtmzdRT[order(rtmzdRT$rt,rtmzdRT$mz),]  #sort on rt and mz

        RLErt <-rle(rtmzdRT$rt) # how often a certain rt is occurring 
        scanindex <- as.integer(c(0,which(rtmzdRT$rt[1:datasize-1] != rtmzdRT$rt[2:datasize])))
        tic <- RLErt$values #make a tic table with the same dimensions as ...
        for (i in 1:(length(tic)))
                tic <- sum(rtmzdRT$intensity[(scanindex+1):(scanindex+RLErt$length)])
       
        #end up and return the data
        return(list(rt = RLErt$values,
                    acquisitionNum = 1:length(RLErt$values),
                    tic = tic,
                    scanindex = scanindex,
                    mz = rtmzdRT$mz,
                    peaksCount=as.integer(RLErt$length),
                    intensity =rtmzdRT$intensity,
                    polarity = rtmzdRT$polarity                   
                    ));
    }



## end added code

[attachment deleted by admin]
Title: Re: Analyzing SRM / MRM data
Post by: Ralf on June 27, 2012, 10:20:55 AM
Thank you for sharing your solution!

It should be noted that this patch for ramp.R is only compatible with XCMS versions from the Bioconductor 2.9 branch (1.30.3 (http://http://bioconductor.org/packages/2.9/bioc/html/xcms.html)).
Starting in the BioC 2.10 release, XCMS uses the mzR package for reading raw data.

Ralf
Title: Re: Analyzing SRM / MRM data
Post by: Cole Wunderlich on June 09, 2016, 08:27:41 AM
Thanks!! One of my lab mates is trying to analyze MRM data and I think they will find this very useful.

Cole