Skip to main content
Topic: Implementing custom retention time alignment algorithms (Read 2779 times) previous topic - next topic

Implementing custom retention time alignment algorithms

Hi Everyone,

We've been working on some retention time alignment algorithms, recently. The major driver for this, is that we have noticed that compounds from different classes can exhibit unique retention time drift behavior. So even though they initially elute at very close retention times, one class of compounds will begin to elute later in the chromatogram while the other class elutes earlier.

The issue we are facing, is how can we best apply these adjusted retention times to an XCMSnExp objects and still maintain all the capabilities of XCMS?

Currently, we can force the adjusted retention times into an XCMSnExp object by manually calling adjustedRtime. I know this isn't recommended practice, but is there an alternative? We don't want to use applyAdjustedRtime as we want to go back and integrate missed peaks.

For the most part, this appears to work well, except when we run fillChromPeaks. We get the following error:
Code: [Select]
> xdata<-fillChromPeaks(xdata,BPPARAM=SerialParam())
Defining peak areas for filling-in ....Error in if (idx_rt_adj > idx_pk_det) { : argument is of length zero

This appears to be produced when dropChromPeaks is called (fillChromPeaks->filterFile->chromPeaks->dropChromPeaks). Essentially, dropChromPeaks looks in processHistory to determine whether peak detection occurred before or after retention time alignment. But the retention time alignment isn't in processHistory.

I couldn't quite see how we can add entries into the processHistory. Eventually I resorted to fudging an entry to circumvent the error.
Code: [Select]
processHolder[[1]]@type<-"Retention time correction"
Is there a simpler way to accomplish this?

Would anybody care to offer some advice/suggestions for any part of this? I'm happy to have anyone's input.

Re: Implementing custom retention time alignment algorithms

Reply #1
This looks like a nasty bug in xcms. I will have a look into it, as you should not get this error when doing fillChromPeaks. Also, applyAdjustedRtime is a valid approach. But this should not affect fillChromPeaks.

I have opened an issue at the xcms github repo (

Could you just provide the info which version of R/xcms you are using?

Re: Implementing custom retention time alignment algorithms

Reply #2
Hi Johannes,
Just to be clear, I am using 'adjustedRtime' to apply the adjusted retention times (and not following it with 'applyAdjustedRtime').
Code: [Select]

If I use applyAdjustedRtime after adjustedRtime, I do not get an error with fillChromPeaks. This is likely because hasAdjustedRtime returns FALSE, so the processHistory check never gets performed (methods-XCMSnExp.R#L651, I think)
My concern with using applyAdjustedRtime, is that the data will be slightly warped and so the integration during fillChromPeaks will be slightly off. That is, unless it using the retention time when loading the raw data again? In which case the whole thing is solved  :))

Nonetheless, I am using 'R Under development (unstable) (2019-01-14 r75992)', 'xcms_3.5.1' and 'MSnbase_2.9.3'.
I compiled this version of xcms from the github page ("sneumann/xcms") to utilize the subset feature, so I'm not sure if that version number above is necessarily correct.


Re: Implementing custom retention time alignment algorithms

Reply #3
Thanks for clarifying!

Regarding your concern, the fillChromPeaks will always use the adjusted retention times if retention time adjustment has been performed. So, the results should be the same, with or without applyAdjustedRtime.

Regarding the error: I fixed that. You can install the updated version from github:

For the developmental version you are using:
Code: [Select]

For the Bioconductor 3.8 release (R-3.5.x)
Code: [Select]
devtools::install_github("sneumann/xcms", ref = "RELEASE_3_8")

Re: Implementing custom retention time alignment algorithms

Reply #4
Thanks for looking into and fixing the error - very much appreciated.

I was quite intrigued that you said fillChromPeaks always uses the adjusted retention time, so I looked a bit deeper into the code.
It seems fairly simple to allow the ability to integrate using the original rt range.

If you included another parameter on getChromPeakData to select whether switch back to the unadjusted rtrange, stored the unadjusted rtime, figured out which index of rtim is the rtmin and rtmax, then use those indexes to get the original rt for use in the calculation of res[,"into"]
Code: [Select]
.getChromPeakData <- function(object, peakArea, sample_idx,
                             mzCenterFun = "weighted.mean",
                             cn = c("mz", "rt", "into", "maxo", "sample"),
                             unadjusted=FALSE) {
rtim <- rtime(object)
if(unadjusted) rtim_adjusted <- rtime(object,adjusted=!unadjusted)
rtScans<-range(which(rtim >= rtr[1] & rtim <= rtr[2]))
if(unadjusted) rtrange<-rtim_unadjusted[rtScans] else rtrange<-rtim[rtScans]
res[i, "into"] <- sum(mtx[, 3], na.rm = TRUE) *
          ((rtrange[2] - rtrange[1]) /
             max(1, (sum(rtim >= rtr[1] & rtim <= rtr[2]) - 1)))

However, this highlighted something else in the code that felt odd to me (again, I'm making a lot of assumptions).
By using 'rtr[2] - rtr[1]' in the calculation of "into", don't we always end up overestimating the area of the peak?
rtr comes from the medians of other samples, but getChromPeakData integrates using scans found between these limits. So the rt range of where it integrates is notionally smaller than 'rtr[2] - rtr[1]'. In the example above, rtrange is indeed smaller (with unadjusted=FALSE).

Could we iterate over peakArea and calculate new rtmin and rtmax based on the actual rtime?
Code: [Select]
peakArea<-apply(peakArea,1,function(pk) {
    # Get start and end index of rtim between rt range
    rtScans<-range(which(rtim >= pk["rtmin"] & rtim <= pk["rtmax"]))
    # Convert median rt range to actual rt range
    # If the user wants unadjusted rt range give it to them, otherwise just rt range
    if(unadjusted) rtrange<-rtim_unadjusted[rtScans] else rtrange<-rtim[rtScans]
    # Save rt range in peakArea, so it can be used instead of rtr[2]-rtr[1] for res[i,'into']

I'm not sure how centWave integrates peaks and how the rtmin and rtmax are chosen. So maybe this doesn't make sense...



Re: Implementing custom retention time alignment algorithms

Reply #5
Thanks for sharing your ideas. If you would like some changes in xcms I would however appreciate if you open an issue at the xcms github repository as this enables me also to keep track of what to do (and what was done) - a pull request would actually be even better ;)

Regarding the code in xcms centWave - I did not write that code and I am veeerrry hesitant to change anything within it as this will affect all xcms users.

thanks again, jo

Re: Implementing custom retention time alignment algorithms

Reply #6
Thanks Johannes,

I'll take a look at making the changes and generating a pull request (never done one before).

No problem regarding xcms centWave. I don't think I'd be brave enough to suggest changes there.