Show Posts - cvdlest

Compound identification / Analyzing SRM / MRM data

June 26, 2012, 08:26:02 AM

In our lab SRM (=MRM) analysis is frequently used to quantify multiple components by LC/MS.

To use SRM data in combination with xcms a number of problems arise, since the data-structure of SRM (mzXML) files is not suitable for the native xcms software.

In each record of a SRM mzXML file the following information is stored.

Some acquisition information
RetentionTime
PrecursorMZ
basePeakMZ (fragment ion), and its intensity

See an example below:
<scan num="13767"
msLevel="2"
peaksCount="1"
polarity="+"
scanType="MRM"
retentionTime="PT828.196S"
lowMz="263.2"
highMz="263.2"
basePeakMz="263.2"
basePeakIntensity="50"
totIonCurrent="49"
collisionEnergy="32" >
<precursorMz precursorIntensity="0" >599.5</precursorMz>
<peaks precision="32"
byteOrder="network"
pairOrder="m/z-int" >Q4OZmkJIAAA=</peaks>
</scan>

The next SRM transition will have its own retention time, precursorMZ and basePeakMZ.
Etc.

Xcms analysis of such data gives rise to three problems:

1. These data files contain no msLevel=”1” data!
2. Xcms analysis is based on the precursorMZ, and there can be multiple SRM transitions with an identical precursorMZ value
3. Each SRM transition is measured at a certain retention time the next a little latter etc. This will produce gaps between two successive specific SRMs.
xcms expects full scans at successive retentions times. eg, if a method contains 20 SRM transitions, and a peak appears in the first SRM transition. The a signal will be detected at that time point, after that 19 time point will generated with no signal.

To overcome these problems I altered the ramp.R script in the xcms source!
The main points are:
1. Determine the cycle-time (CycleT)
CycleT <- rtdiff$values[which(rtdiff$length == max(rtdiff$length))]
# CycleT is the most occurring time difference between two equal MRM-transitions
# CycleT <- quantile(rt[mzmaxscans][2:(length(mzmaxscans))]-rt[mzmaxscans][1:(length(mzmaxscans)-1)])[3] # gives the same result
2. Index the precursorMZ to an unique mz
   mz <- floor(scanHeaders$precursorMZ[scans])+ floor(sipeaks$mz)/1000 + floor(scanHeaders$collisionEnergy[scans])/1000000
# mz <- Q1 mass (without decimals) then at the decimal place the fragment mass, followed by the collision energy (Just to be sure it is unique).
# Thus; mz = Q1.Q3CE (I assumed that Q3 is always smaller that 1000 m/z)
3. Transform the retention times:
rt = floor(rt/CycleT)*CycleT+(CycleT/2)
# Each SRM cycle must be performed on a single acquisition time
# consequently the peaks are shifted by maximally CycleT/2 seconds
# we have to take this for granted!
4. Sort the data by RT and MZ, and create the new correct data-structure
5. return the data

Below the code is listed. I also attached the ramp.R script file.
## begin added code
# for some reason the "MRM" factor ended up in the precursorIntensity column in my version of xcms (it should be in the scantype column) .
# I also added "SRM" but I don't know if this is necessary
if ("MRM" %in% levels(scanHeaders$precursorIntensity) | "MRM" %in% levels(scanHeaders$scanType) | "SRM" %in% levels(scanHeaders$precursorIntensity) | "SRM" %in% levels(scanHeaders$scanType)) {

# I did the same as the original ramp (work around buggy RAMP indexing code)
scansMRM <- ( scanHeaders$msLevel >= 2 & scanHeaders$seqNum > 0
& !duplicated(scanHeaders$acquisitionNum)
& scanHeaders$peaksCount > 0)
scans <- which(scansMRM)

sipeaks <- rampSIPeaks(rampid, scans, scanHeaders$peaksCount[scans])

# Since multiple Q1 values can be present in a MRM transition table, unique Q1 -values are created which correspond to the MRM transition
# mz <- Q1 mass (without decimals) then at the decimal place the fragment mass, followed by the collision energy (Just to be sure it is unique).
# Thus; mz = Q1.Q3CE (I assumed that Q3 is always smaller that 1000 m/z)
mz <- floor(scanHeaders$precursorMZ[scans])+ floor(sipeaks$mz)/1000 + floor(scanHeaders$collisionEnergy[scans])/1000000

rt <- scanHeaders$retentionTime[scans]
datasize <- length(rt)

# we have to determine what the cycletime is. For that I used the mz with the higest peak intensity. Here we can expact a good signal!
mzmaxI <- factor(mz)[which(sipeaks$intensity==max(sipeaks$intensity))]
mzmaxscans <- which(mz==mzmaxI) #These scan numbers contain the scans of the MRM-transition with the highest intensity
rtdiff <- rle(sort(rt[mzmaxscans][2:(length(mzmaxscans))]-rt[mzmaxscans][1:(length(mzmaxscans)-1)])) #The time difference between two successive scans
CycleT <- rtdiff$values[which(rtdiff$length == max(rtdiff$length))] # CycleT is the most occurring time difference between two equal MRM-transitions

# CycleT <- quantile(rt[mzmaxscans][2:(length(mzmaxscans))]-rt[mzmaxscans][1:(length(mzmaxscans)-1)])[3] # gives the same result

rtmzdRT <-data.frame(cbind(
rt = floor(rt/CycleT)*CycleT+(CycleT/2), # Each MRM cycle should be should be performed on a single acquisition time point
                    # consequently the peaks are shifted by maximally CycleT/2 seconds
                    # we have to take this for granted!
mz = mz,
int = scanHeaders$basePeakIntensity[scans],
peaksCount=scanHeaders$peaksCount[scans],
intensity = sipeaks$intensity,
polarity = scanHeaders$polarity[scans],
dRT = 1))
rtmzdRT <- rtmzdRT[order(rtmzdRT$rt,rtmzdRT$mz),] #sort on rt and mz

RLErt <-rle(rtmzdRT$rt) # how often a certain rt is occurring
scanindex <- as.integer(c(0,which(rtmzdRT$rt[1:datasize-1] != rtmzdRT$rt[2:datasize])))
tic <- RLErt$values #make a tic table with the same dimensions as ...
for (i in 1:(length(tic)))
tic <- sum(rtmzdRT$intensity[(scanindex+1):(scanindex+RLErt$length)])

#end up and return the data
return(list(rt = RLErt$values,
acquisitionNum = 1:length(RLErt$values),
tic = tic,
scanindex = scanindex,
mz = rtmzdRT$mz,
peaksCount=as.integer(RLErt$length),
intensity =rtmzdRT$intensity,
polarity = rtmzdRT$polarity
));
}

## end added code

[attachment deleted by admin]

Metabolomics Society Forum

Topics

Topics - cvdlest

Compound identification / Analyzing SRM / MRM data