centWave peak detection and integration question

Hello,

I've been using centWave for peak detection for a while now. A recent careful look at how it picks peaks revealed the following artifact. I was hoping someone with more knowledge of centWave could offer some input.

It looks like, for each ridge detected by the CWT, the R code (xcmsRaw.R) tries to find the scale/peak width that best describes the peak. There is an array of scales/peak widths and, for each one, the center of a hypothetical peak. For example, with min PW = 4 and max PW = 30, I'd have an array of RTs, each the center of a hypothetical peak whose width lies between 4 and 30. Then, to decide which hypothetical peak is best, the code computes the area around each center using a FIXED window whose size is the minimum peak width; in my example that is a window of 4. The hypothetical peak with the largest fixed-window area wins, and its peak width is used to construct the Mexican hat model peak for integration.
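To make the selection step above concrete, here is a minimal Python sketch of it as I understand it. This is not the xcms code: the function names, the toy trace, and the scale-to-center mapping are all made up for illustration, and the window is counted in scans rather than seconds.

```python
def fixed_window_area(intensity, center, window):
    """Sum `window` consecutive scans, starting window // 2 scans before `center`."""
    lo = max(0, center - window // 2)
    hi = min(len(intensity), lo + window)
    return sum(intensity[lo:hi])


def pick_scale_by_fixed_window(intensity, centers_by_scale, window):
    """Return the scale whose hypothetical center has the largest fixed-window area.

    centers_by_scale maps scale -> scan index of that scale's peak center.
    """
    return max(
        centers_by_scale,
        key=lambda s: fixed_window_area(intensity, centers_by_scale[s], window),
    )


# Toy trace (hypothetical): leading plateau, sharp spike, ramp down to baseline.
trace = [0, 0, 5, 5, 5, 5, 20, 12, 8, 5, 3, 1, 0, 0]
# Hypothetical centers: each scale places the peak center at a slightly different scan.
centers = {2: 7, 4: 6, 8: 5}
best = pick_scale_by_fixed_window(trace, centers, window=4)
```

In this toy case the smallest scale wins only because its center happens to sit where the 4-scan window captures the most of the spike, which is the behavior described above.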

I ran into a problem with this approach. I have a peak with a leading plateau, then a sharp spike, then a descent to baseline. The plateau lasts around 4 seconds, and the spike and ramp down take longer, around 8 seconds up to the maximum and then down to baseline. The above algorithm finds the ridge at the spike, but the winning hypothetical peak is the one with peak width = 4 seconds. So in this case, integration is done around the spike using a model peak with a width of 4 s, and most of the plateau is left out of the integration.

Using the fixed window to find the local maximum seems to defeat the purpose of having a range of peak width values for detecting peaks. There are many cases where the highest fixed-window area corresponds to the smallest peak width/scale, even though a larger peak width/scale best describes the entire peak. Is there a better way to pick the best peak width at each ridge?

Thanks.

Benjie

Re: centWave peak detection and integration question

Reply #1
Some corrections to my terminology: after CWT, centWave has a list of peaks and tries to find the best wavelet *scale* for each peak. The scale describes the shape of the wavelet fitting the signal; the highest CWT coefficient for the scale is the peak center, and scale*2 is the peak width in number of scan intervals (which can be converted to seconds). To find the best scale, centWave does a fixed-window integration around each scale's peak center, and chooses the scale with the best fixed-window integration value. The fixed window size corresponds to the minimum peak width, but in units of scan intervals.
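As a quick illustration of the unit conversions above, here is a small Python sketch. The helper names and the 0.5 s scan interval are assumptions for the example only; they are not taken from xcms.

```python
def scale_to_peak_width_seconds(scale, scan_interval_s):
    """Peak width is scale * 2 scan intervals; multiply by the interval to get seconds."""
    return 2 * scale * scan_interval_s


def min_peakwidth_to_window_scans(min_peakwidth_s, scan_interval_s):
    """Convert the minimum peak width (seconds) into a fixed window size in scan intervals."""
    return max(1, round(min_peakwidth_s / scan_interval_s))


# Assumed scan interval of 0.5 s (hypothetical instrument setting).
width_s = scale_to_peak_width_seconds(scale=4, scan_interval_s=0.5)
window_scans = min_peakwidth_to_window_scans(min_peakwidth_s=4, scan_interval_s=0.5)
```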

This approach seems erratic. In my example peak, the peak center is 209.79 at the smallest scale, moves to 208.72 as the scale increases, then moves back up to 209.5 at the largest scale. These peak centers all describe the same spike in the overall peak, but the minor shifts in center change the fixed-window integration value and favor either the first or the last scale, corresponding to the narrowest or the widest peak shape.

I propose the following changes.

1. Keep on using fixed window integration to find the scale that gives the peak center.
2. Use the scale with the highest CWT coefficient to obtain the peak width. The original CWT peak integration paper cited by the centWave paper (http://bioinformatics.oxfordjournals.or ... 9.full.pdf) suggests this approach for picking the best scale.

See https://github.com/benjiec/xcms/commit/ ... 3a6c1e12b3
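For readers who do not want to open the commit, the two-step proposal can be sketched in Python roughly as follows. This is an illustration only, not the code in the linked commit: the function name, the toy trace, the centers, and the coefficient values are all hypothetical.

```python
def propose_peak(intensity, centers_by_scale, coef_by_scale, window):
    """Return (center_scan, width_in_scans) using the two-step proposal.

    Step 1: the peak center comes from the scale with the largest fixed-window area.
    Step 2: the peak width comes from the scale with the highest CWT coefficient.
    """
    def area(center):
        lo = max(0, center - window // 2)
        return sum(intensity[lo:lo + window])

    center_scale = max(centers_by_scale, key=lambda s: area(centers_by_scale[s]))
    width_scale = max(coef_by_scale, key=coef_by_scale.get)
    # Peak width is taken as scale * 2 scan intervals, as described earlier in the thread.
    return centers_by_scale[center_scale], 2 * width_scale


# Hypothetical inputs: plateau + spike trace, per-scale centers, per-scale CWT coefficients.
trace = [0, 0, 5, 5, 5, 5, 20, 12, 8, 5, 3, 1, 0, 0]
centers = {2: 7, 4: 6, 8: 5}
coefs = {2: 30.0, 4: 41.0, 8: 55.0}
center, width = propose_peak(trace, centers, coefs, window=4)
```

The point of separating the two steps is that the center stays on the spike while the width is allowed to grow to cover the whole peak.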

Looking forward to your feedback!

Thanks,

Benjie

Re: centWave peak detection and integration question

Reply #2
Benjie,

Can you post a plot or screenshot of these peaks?

Ralf

Re: centWave peak detection and integration question

Reply #3
Hi Ralf,

Sure thing.

The attachment here shows the original peak described in my email. In this case, centWave examined CWT results with increasing scale values. The RT centers from all the scales were around 209. centWave does fixed-width integration to find the best scale, using the peak width of the lowest scale as the integration window. For this peak, this favored the lowest scale, which corresponds to the spike after the plateau. The descent performed after scale selection stopped in the middle of the plateau. On the other hand, if the scale were selected using the best coefficient from the CWT, as the original CWT paper suggests, then the scale matching the entire peak from 200 to 220 would be selected.

Using fixed-width integration to find the best scale was also inconsistent. For some peaks, the best fixed-width integration value came from integrating around the center of the smallest scale, for others around the center of the largest scale; it all depends on where each scale's center falls. CWT coefficients, I believe (please correct me if I am wrong), should report how well the signal matches the model wavelet at a particular scale; this seems to be a good measure of how well-formed the peak is.
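To show what I mean by the coefficient measuring the match, here is a small self-contained Python sketch that computes a Mexican hat CWT coefficient directly. It assumes the textbook wavelet with 1/sqrt(scale) normalization, which may differ from what xcms does internally; the function names and the synthetic Gaussian peak are made up for the example.

```python
import math


def mexican_hat(t):
    """Textbook Mexican hat (Ricker) mother wavelet."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-t * t / 2.0)


def cwt_coefficient(signal, scale, center):
    """Correlate the signal with the wavelet dilated by `scale` and centered at `center`."""
    total = sum(x * mexican_hat((i - center) / scale) for i, x in enumerate(signal))
    return total / math.sqrt(scale)  # assumed 1/sqrt(scale) normalization


# Synthetic Gaussian peak (sigma = 4 scans) centered at scan 100.
signal = [math.exp(-((i - 100) ** 2) / (2 * 4.0 ** 2)) for i in range(201)]
coefs = {s: cwt_coefficient(signal, s, 100) for s in (2, 9, 40)}
best_scale = max(coefs, key=coefs.get)
```

With this normalization, the too-narrow and too-wide scales both give smaller coefficients than the intermediate one, so the coefficient does track how well the wavelet width fits the peak.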

Note that the peak was from a clean standard run.

Thanks,

Benjie

[attachment deleted by admin]