Show Posts - benjie

XCMS / Re: Targeted (and consistent) peak detection with XCMS

May 05, 2014, 01:23:20 PM

Thanks for the pointer. I didn't know about the recent github move. That's awesome. Will make a fork and a pull request.

XCMS / Targeted (and consistent) peak detection with XCMS

May 05, 2014, 12:17:21 PM

Hello,

We have been using XCMS to automate our high-throughput analysis pipeline. We use multiple columns in various settings in the lab, and consistent peak detection using centWave has proved to be very challenging. Most of our datasets contain real peaks with very different widths, and centWave often finds partial or conjoined peaks. Furthermore, when signals are weak, centWave throws away perfectly good peaks. Peak quantification using standard curves is impossible without consistent, reliable peaks.

We have therefore made some changes to XCMS centWave to enable targeted peak detection. With these changes, we can now define targets manually, where a target is an m/z range, an RT range, and a peak width range. Targets are translated to ROIs, and we run a slightly modified centWave algorithm on these manually constructed ROIs. In practice, this has produced peak sets matching the targeted peak detection results from vendor software. However, unlike XCMS, vendor software cannot be automated and is hard to configure when the number of targets becomes very large.
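To make the target idea concrete, here is a minimal sketch in Python (not the R code in the fork). The `Target` class, the `target_to_roi` helper, and all field names are hypothetical and chosen only for illustration; the real ROI layout used by centWave may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """One manually defined target (all field names are hypothetical)."""
    mz_min: float
    mz_max: float
    rt_min: float  # seconds
    rt_max: float  # seconds
    pw_min: float  # seconds
    pw_max: float  # seconds


def target_to_roi(target, scan_times):
    """Translate a target into an ROI expressed in scan indices.

    scan_times is a sorted list of scan retention times in seconds.
    Returns None when no scan falls inside the target's RT range.
    """
    inside = [i for i, t in enumerate(scan_times)
              if target.rt_min <= t <= target.rt_max]
    if not inside:
        return None
    return {
        "mzmin": target.mz_min,
        "mzmax": target.mz_max,
        "scmin": inside[0],    # first scan inside the RT range
        "scmax": inside[-1],   # last scan inside the RT range
        "length": inside[-1] - inside[0] + 1,
        # Per-target width range, assumed to replace a global setting.
        "peakwidth": (target.pw_min, target.pw_max),
    }
```

Each ROI built this way would then be handed to the peak detection step in place of an automatically discovered ROI.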

Our updates to XCMS, along with more detailed descriptions of what changed, are now in this github repository: https://github.com/benjiec/xcms

We welcome feedback, hope these changes prove useful to some of you, and are willing to help incorporate appropriate changes into the main XCMS source tree.

Thanks,

Benjie

XCMS / Re: centWave peak detection and integration question

November 21, 2013, 10:47:19 AM

Hi Ralf,

Sure thing.

The attachment here shows the original peak described in my email. In this case, centWave examined CWT results with increasing scale values. The RT centers from all the scales were around 209. centWave does fixed-width integration to find the best scale, using the peak width of the lowest scale as the integration window. For this peak, this favored the lowest scale, which corresponds to the spike after the plateau. The descent after scale selection then stopped in the middle of the plateau. On the other hand, if the scale were selected using the best coefficient from CWT, as the original CWT paper suggests, then the scale matching the entire peak from 200 to 220 would be selected.

Using fixed-width integration to find the best scale was also inconsistent. For some peaks, the best fixed-width integration values came from integrating around the center of the smallest scale, then around the center of the largest scale; it all depends on where the center of the scale is. CWT coefficients, I believe (and please correct me if I am wrong), should report how well the signal matches the model wavelet at a particular scale; this seems to be a good measure of how well-formed the peak is.

Note that the peak was from a clean standard run.

Thanks,

Benjie

[attachment deleted by admin]

XCMS / Re: centWave peak detection and integration question

November 13, 2013, 01:55:55 PM

Some corrections to my terminology: after CWT, centWave has a list of peaks and tries to find the best wavelet *scale* for each peak. The scale describes the shape of the wavelet fitting the signal; the highest CWT coefficient for the scale is the peak center, and scale*2 is the peak width in number of scan intervals (which can be converted to seconds). To find the best scale, centWave does a fixed-window integration around each scale's peak center, and chooses the scale with the best fixed-window integration value. The fixed window size corresponds to the minimum peak width, but in units of scan intervals.

This approach seems erratic. In my example peak, the peak center is 209.79 at the smallest scale, moves to 208.72 as the scale increases, then moves back up to 209.5 at the largest scale. These peak centers all describe the same spike in the overall peak, but the minor shifts in center change the fixed-window integration value and favor either the first or the last scale, corresponding to the narrowest or the fattest peak shape.

I propose the following changes.

1. Keep on using fixed window integration to find the scale that gives the peak center.
2. Use the scale with the highest CWT coefficient to obtain the peak width -- the original CWT peak integration paper cited by the centWave paper (http://bioinformatics.oxfordjournals.or ... 9.full.pdf) suggests using this approach to pick the best scale.
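As a rough illustration of the second point, the sketch below (Python, not the actual R change in the commit) picks the width from the scale with the highest CWT coefficient at a given center. The function names are hypothetical, the 1/sqrt(scale) normalization is an assumption of this sketch, and the center is assumed to come from the first step.

```python
import math


def mexican_hat(t):
    """Mexican hat (Ricker) wavelet, normalized to unit L2 norm."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_coefficient(signal, scale, center):
    """CWT coefficient of `signal` at one scale and one scan index.

    The 1/sqrt(scale) normalization is an assumption of this sketch.
    """
    total = sum(x * mexican_hat((i - center) / scale)
                for i, x in enumerate(signal))
    return total / math.sqrt(scale)


def pick_width_scale(signal, scales, center):
    """Return (best_scale, width_in_scans).

    The best scale is the one with the highest coefficient at `center`;
    the width uses the scale * 2 relation described earlier in the thread.
    """
    best = max(scales, key=lambda s: cwt_coefficient(signal, s, center))
    return best, 2 * best
```

With this rule the width no longer depends on small shifts of the per-scale centers, only on how well each scale's wavelet matches the signal around the chosen center.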

See https://github.com/benjiec/xcms/commit/ ... 3a6c1e12b3

Looking forward to your feedback!

Thanks,

Benjie

XCMS / centWave peak detection and integration question

November 12, 2013, 02:54:37 PM

Hello,

I've been using centWave for peak detection for a while now. A recent careful look at how it picks peaks revealed the following artifact. I was hoping someone with more knowledge of centWave could offer some input.

It looks like for each ridge detected from CWT, the R code (xcmsRaw.R) tries to find the scale/peak width that describes the peak the best. Here we have an array of scale/peak widths, and the center of a hypothetical peak corresponding to each scale/peak width. For example, suppose I have min PW=4 and max PW=30; I'd have an array of RTs, each the center of a hypothetical peak with a peak width between 4 and 30. Then, to see which hypothetical peak is the best, the code computes the area around each center using a FIXED window whose size is the min peak width. So in my example that'd be a window of 4. The hypothetical peak with the best fixed-window area wins. The peak width corresponding to the winning hypothetical peak is used to construct the Mexican hat model peak for integration.
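Here is a small sketch of the selection rule as I understand it, written in Python rather than taken from xcmsRaw.R. The function names and the toy intensities are made up for illustration, and the window is expressed in scans.

```python
def fixed_window_area(intensity, center, window):
    """Sum the intensity over `window` scans around scan index `center`."""
    lo = max(0, center - window // 2)
    hi = min(len(intensity), lo + window)
    return sum(intensity[lo:hi])


def pick_by_fixed_window(intensity, candidates, min_peakwidth):
    """Pick the (scale, center) pair with the largest fixed-window area.

    candidates holds one (scale, center_index) pair per scale.
    """
    return max(
        candidates,
        key=lambda c: fixed_window_area(intensity, c[1], min_peakwidth),
    )


# Toy trace: a plateau followed by a spike, then a ramp down to baseline.
intensity = [0, 0, 5, 5, 5, 5, 9, 7, 5, 3, 1, 0, 0]
# Narrow scale centered on the spike; wide scale centered one scan earlier.
candidates = [(2, 6), (6, 5)]
winner = pick_by_fixed_window(intensity, candidates, min_peakwidth=4)
```

In this toy case the narrow scale wins only because its center sits one scan closer to the spike, even though the wide scale covers the whole peak.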

I ran into a problem with this approach. I have a peak with a leading plateau, then a sharp spike, then a descent to baseline. The plateau is around 4 seconds, and the spike and ramp down are longer, around 8 seconds to the maximum and then down to baseline. The above algorithm finds the ridge at the spike, but the winning hypothetical peak is the one with peak width = 4 seconds. So in this case, integration is done around the spike using a model peak of width 4 s, and most of the plateau is absent from the integration.

Using the fixed window to find the local maxima seems to defeat the purpose of having a range of peak width values to detect peaks with. There are many cases where the highest fixed-window area corresponds to the smallest peak width/scale, even though a larger peak width/scale best describes the entire peak. Is there a better way to pick the best peak width at each ridge?

Thanks.

Benjie

XCMS / matchedFilter terminates too early on an EIC?

September 20, 2012, 09:28:44 PM

matchedFilter finds peaks in an EIC, up to a maximum number that is a parameter of the filter function. However, it looks like as soon as it hits one peak that fails the S/N threshold, it stops looking for more peaks in that EIC. This is in line 729 of xcmsRaw.R. Should it continue to look for other peaks, ignoring the current one? I've got a case where the highest peak has low S/N, but there are other peaks in the EIC, at different RTs.
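To show the difference I mean, here is a simplified sketch in Python. It is not the code from xcmsRaw.R; the function name, the argument names, and the assumption that candidates arrive as (rt, s/n) pairs in the order the filter would visit them are all mine.

```python
def find_peaks(candidates, snthresh, max_peaks, stop_on_failure):
    """Collect retention times of candidates that pass the S/N threshold.

    candidates: list of (rt, sn) pairs in the order they are examined.
    stop_on_failure=True mimics the behavior described above;
    stop_on_failure=False skips the failing candidate and keeps looking.
    """
    accepted = []
    for rt, sn in candidates:
        if len(accepted) >= max_peaks:
            break
        if sn < snthresh:
            if stop_on_failure:
                break      # give up on the whole EIC
            continue       # ignore this candidate only
        accepted.append(rt)
    return accepted


# Hypothetical EIC: the first candidate examined has low S/N.
candidates = [(100.0, 3.0), (150.0, 12.0), (200.0, 15.0)]
current = find_peaks(candidates, snthresh=10, max_peaks=5, stop_on_failure=True)
proposed = find_peaks(candidates, snthresh=10, max_peaks=5, stop_on_failure=False)
```

With the early stop nothing is reported for this EIC, while skipping the failing candidate still picks up the two later peaks.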

Thanks.

XCMS / Difference between intf, into, maxf, and maxo

September 20, 2012, 10:22:53 AM

Hi

How exactly are intf and into computed? Similarly for maxf and maxo?

Is intf the area after binning? And into before binning?

Also, what does it mean when intf and maxf are "NA"?

Thanks.
Benjie