
Messages - conley.c

1
XCMS / Re: Cross MS-Platform Quantitation with XCMS, HOW?
Here is an elaboration of what we have been considering.

Our confusion arises mainly because we've seen two different ways of integrating the same data: to our understanding, XCMS performs a time correction for scan rate and then simply sums the peak intensities, while Xcalibur does "connect-the-dots" integration.

Our working model is that when Xcalibur records a centroid, that centroid represents the total number of ions that would have been flowing into the mass spectrometer since the last scan.  We realize that the LTQ or Orbitrap doesn't actually collect ions the entire time between scans, because it needs time to scan them, but the number reported in a .RAW file corrects for this.  If a centroid actually follows this model, then summing the ions in an XIC seems to us to be the most accurate way to quantitate a peak.

We drew up some simple examples that demonstrate that changes in scan rate alter the quantitation derived from these three ways of quantitating the data.
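As a stand-in for those examples (this is my own toy sketch with a made-up Gaussian peak, not the original figures), the snippet below shows two of the quantitation styles reacting differently when the scan rate doubles: a plain sum of centroid intensities roughly doubles, while "connect-the-dots" trapezoidal integration barely moves.

```r
# Toy XIC: the same unit-height Gaussian peak sampled at two scan rates.
peak <- function(t) exp(-(t - 5)^2 / 2)

quantitate <- function(times) {
  y <- peak(times)
  c(sum = sum(y),  # plain sum of centroid intensities
    # trapezoidal "connect the dots" integration
    trapezoid = sum(diff(times) * (head(y, -1) + tail(y, -1)) / 2))
}

slow <- quantitate(seq(0, 10, by = 1))    # 1 scan per second
fast <- quantitate(seq(0, 10, by = 0.5))  # 2 scans per second

# fast["sum"] is about twice slow["sum"];
# fast["trapezoid"] and slow["trapezoid"] are nearly identical.
```

So whether the scan rate matters depends entirely on which integration model the software assumes, which is the heart of our question.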

2
XCMS / Cross MS-Platform Quantitation with XCMS, HOW?
Hello Again,

centWave handles data from a Bruker micrOTOF-Q instrument and also from a Thermo LTQ-Orbitrap.
We are really impressed with its feature detection performance. :D
Is the way spectral information is acquired different on those machines, and how does centWave compensate for that?

For example, we believe that Thermo uses an instantaneous model of acquisition, where biological sample is missed
between the centroids, and they account for that by integrating a "connect the centroids" model (see the red area in the figure I made up).
On the other hand, there could be a different acquisition mode, right? We believe it's also possible to have an "open door" acquisition mode,
where each centroid represents the cumulative sum of ions acquired between the last scan and the current one (blue area in the figure).

And is centWave's normalization by scan rate meant to compensate for inter- or intra-sample scan rate variability? Please see my recent post under
"area under curve" for a guess at how you might do that normalization in an intra-sample case.

3
XCMS / Re: How does centWave compute area under curve?
Got busy with other things :)

When you say it is normalized by the scan rate, I would guess you are doing something like the following to adjust for variability in the scan rate across any given feature.
>scan_rate_avg = mean(diff(scan_times))  # average interval between adjacent scans
>scale_factor = (scan_times[i] - scan_times[i - 1]) / scan_rate_avg
>new_spectrum_intensity = spectrum_intensity[i] / scale_factor  # inverse relation

EXAMPLE:

Suppose the average scan rate is 1 scan per second. Then if the current scan time minus the previous scan time (t(i) - t(i - 1)) is 0.5 s, i.e. 2 scans per second, the normalization might scale the intensity by a factor of 2, because the time to accumulate an ion packet in the Orbitrap was reduced by half.

>scale_factor = (1/2) / 1 
>new_spectrum_intensity = spectrum_intensity(i) / scale_factor; # equal to spectrum_intensity * 2
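Running my guessed normalization on made-up numbers (all values hypothetical) shows the intended effect: one scan arrives after only 0.5 s and so collected half the ions, and dividing by the scale factor restores it.

```r
# Hypothetical scan times whose average spacing is exactly 1 s.
scan_times <- c(0, 1, 2, 2.5, 4)
# Raw centroid intensities for scans 2..5; the 0.5 s scan caught half the ions.
intensity  <- c(100, 100, 50, 150)

scan_rate_avg <- mean(diff(scan_times))            # 1 second per scan
scale_factor  <- diff(scan_times) / scan_rate_avg  # c(1, 1, 0.5, 1.5)
normalized    <- intensity / scale_factor          # all four come out as 100
```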

We recently queried Thermo's tech support about the LTQ-Orbitrap platform (we have not heard back yet) to learn how they report intensity. Do you have any insider edge on this?

Appreciate the expertise ;)

Chris
4
XCMS / How does centWave compute area under curve?
Good evening,

I would like to evaluate how much area under the curve centWave actually obtains on a manually curated set. I have the raw data points corresponding to each feature detected, but when I sum the intensity values of the centroids, it generally underestimates the intensity reported in the "into" and "intb" columns of the matrix returned by findPeaks.centWave(). Why are the intensities different? I read the following part of the documentation for findPeaks.centWave():


integrate: Integration method. If ‘=1’ peak limits are found through
          descent on the mexican hat filtered data, if ‘=2’ the descent
          is done on the real data. Method 2 is very accurate but prone
          to noise, while method 1 is more robust to noise but less
          exact.



I suppose that the Mexican hat filtering or the descent on the real data alters the way centWave quantitates.
I could not find an answer in the published paper.
In other words, why doesn't it operate like this?

area under feature = int(1) + int(2) + ... + int(n)
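One guess, purely an assumption on my part and not read from the xcms source: if centWave multiplies the centroid sum by the mean scan spacing in seconds to report an area (intensity × seconds rather than a bare sum), the two numbers would only agree when scans are exactly 1 s apart. A toy sketch:

```r
# Toy feature: Gaussian centroid intensities over 26 scans, 0.4 s apart.
scantime <- seq(100, 110, by = 0.4)
d <- exp(-(scantime - 105)^2 / 2) * 1e5

plain_sum <- sum(d)  # what I compute by hand

# Assumed "into"-style area: scale the sum by the mean scan spacing.
pwid      <- (max(scantime) - min(scantime)) / (length(scantime) - 1)  # 0.4 s
into_like <- pwid * plain_sum  # smaller than plain_sum when spacing < 1 s
```

If this is roughly what happens, a sub-second scan interval would make the reported area smaller than the bare sum, or vice versa; either way the discrepancy I see could just be a units difference.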

Thanks for your time,

Chris
5
XCMS / Re: access centroids from getEIC
Interesting and a very good point. I will consult my PI about that. Right now I am just trying to capture the evaluation from as many angles as possible. For example, your 7/10 cross-sample validation in the centWave paper was a nice creative way of generating a "ground truth". :D
6
XCMS / Re: access centroids from getEIC
Thanks, I am implementing that right now :) Got some weird bug :(

Quote from: "Ralf"
why you want to compare or evaluate the raw data for each feature region instead of using the feature coordinates (mz, rt) = the feature center point.

Fair question.

That's actually not a bad way of determining feature identities; I hadn't thought of it.
With a manual annotation of a feature's associated centroids, we can better measure what fraction
of a feature an algorithm finds. This way you can generate metrics of specificity, because centroids
that belong to no feature are noise and therefore true negatives. In the absence of centroid-based
criteria, we would rely simply on precision and recall, as in your paper, since it is difficult to
define a true negative on a per-feature basis. What is a non-feature (if you see what I mean)?
Both types of metrics have their own advantages and disadvantages for characterizing performance.
7
XCMS / Re: oh, parameters
Hello,

I am new as well, but I can offer some help.

Quote from: "luoluopig"
:? , bw argument in retention time correction, what bw stands for.

Did you read this part of the documentation in xcmsPreProcess.pdf? It is in the doc folder of wherever the xcms
library was installed in your version of R.

"After retention time correction, the initial peak grouping becomes invalid and is
discarded. Therefore, the resulting object needs to be regrouped. Here, we decrease the
inclusiveness of the grouping using the bw argument
(default 30 seconds)."

Quote from: "luoluopig"
>reporttab[1:4, ], what this 1:4 is?

More from the same document.

"A report showing the most statistically significant differences in analyte intensities can be
generated with the diffreport method. It will automatically generate extracted ion chro-
matograms for a given number of them."

The [1:4, ] is R syntax for operating on (in this case printing) rows 1 through 4 of a matrix; the blank after the comma selects all the columns of the matrix.
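A quick runnable illustration of that indexing, using a toy matrix (nothing to do with diffreport's actual columns):

```r
# A 6-row, 2-column toy matrix to demonstrate R's matrix indexing.
m <- matrix(1:12, nrow = 6, ncol = 2)

first_rows <- m[1:4, ]  # rows 1 through 4, all columns (a 4 x 2 matrix)
col1       <- m[, 1]    # blank before the comma: all rows, column 1
```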

Quote from: "luoluopig"
Second, if no error message, does that mean the results obtained from XCMS is relaible;


As for reliability, do not use the matchedFilter algorithm when you can use the centWave feature finder. It is selected via the method parameter:

>xcmsSet(..., method = 'centWave', ...)

and you should try to optimize the parameters.

Best of luck,
--Chris
8
XCMS / Re: access centroids from getEIC
Hi Ralf,

Thanks, I should be more precise. My goal is to evaluate centWave's performance on a manually annotated, centroided LC-MS data set.
In order to do that, I need the centroids associated with each corresponding feature that centWave determines.  That
way I can generate quantitative scores like X% sensitivity and Y% specificity, or find out the fraction of correct centroids
identified for any given feature.

I would like something like this if possible:

>(object or matrix)  = function for feature finding with centwave()

>And some getter function that returns a data structure like the following:

feature(1): { (rt, m/z) (1), ..., (rt, m/z) (k) }
.
.
.
feature(n): { (rt, m/z) (1), ..., (rt, m/z) (k) }


What has not worked.

(1)
>findPeaks.centWave()  => Returns a list of features (an m x 10 matrix), which is nice, but the features only come with summary statistics.

mz mzmin mzmax rt rtmin rtmax ....
.
.
.

(2)
(a)
>xcmsRaw(file) => Returns an xcmsRaw object, which cannot run feature finding like an xcmsSet object can.
(b)
>plotRaw(file) => Returns an (m x 3) matrix of (rt, mz, int) for the whole data set, which is nice.
I would like something like this, but with respect to features, not the whole data set.
However, this function only seems to accept xcmsRaw objects, which cannot find features.

(3)
>xcmsSet(..., method = 'centWave', ...) => Returns an xcmsSet object with feature finding functionality.
However, the documentation only shows it returning EICs for visualization.
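In the meantime I am hand-rolling the getter I want (toy data below; in real use the bounding boxes would come from findPeaks.centWave() and the raw triplets from plotRaw() or similar): take each feature's (mzmin, mzmax, rtmin, rtmax) box and pull the matching raw centroids.

```r
# Toy raw centroid matrix: (rt, mz, intensity) rows for the whole run.
raw <- cbind(rt        = c(10, 11, 12, 13, 50),
             mz        = c(300.1, 300.1, 300.2, 500.0, 300.1),
             intensity = c(1e4, 5e4, 2e4, 1e3, 9e3))

# Toy feature table in the centWave summary-statistic style (one feature).
peaks <- cbind(mzmin = 300.0, mzmax = 300.3, rtmin = 10, rtmax = 13)

# For each feature, keep the raw centroids inside its bounding box.
feature_centroids <- lapply(seq_len(nrow(peaks)), function(i) {
  p <- peaks[i, ]
  sel <- raw[, "mz"] >= p["mzmin"] & raw[, "mz"] <= p["mzmax"] &
         raw[, "rt"] >= p["rtmin"] & raw[, "rt"] <= p["rtmax"]
  raw[sel, , drop = FALSE]
})
```

Here feature 1 picks up three of the five centroids: the mz = 500 row and the rt = 50 row fall outside the box.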

Appreciate the time,

--
Chris
9
XCMS / access centroids from getEIC
Hello,

I would like access to any peak's associated centroid (rt, m/z, int) triplets.  I have searched both the forums and the pdf documentation for how to use the function getEIC(). However, it's been difficult to unlock the information within the object beyond just plotting it. I don't necessarily need the plot. I want something like the following:

eic(1): { (rt, m/z, int) (1), ..., (rt, m/z, int) (n) }
.
.
.
eic(n):  { (rt, m/z, int) (1), ..., (rt, m/z, int) (n) }

That would allow me to compare its performance to other algorithms on a centroid basis. I would prefer to avoid using the findPeaks.centWave() method, which returns (among other things)

eic(1): { mzmin, mzmax, rtmin, rtmax }.

I could use this to get all points that fall in that region, but it will sometimes return more than one centroid per scan, which would violate the ROI algorithm.
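For what it's worth, one way I could post-process that region to respect the one-centroid-per-scan constraint: within the box, keep only the most intense centroid at each scan time (made-up data below; a "closest m/z to the peak apex" rule would work just as well).

```r
# Toy centroids inside one feature's (mzmin, mzmax, rtmin, rtmax) box;
# scan time rt = 10 has two candidate centroids.
box <- cbind(rt        = c(10, 10, 11, 12),
             mz        = c(300.10, 300.25, 300.11, 300.12),
             intensity = c(4e4, 1e4, 5e4, 2e4))

# Per scan time, keep the row index of the most intense centroid.
keep <- unlist(lapply(split(seq_len(nrow(box)), box[, "rt"]),
                      function(idx) idx[which.max(box[idx, "intensity"])]))
one_per_scan <- box[sort(keep), , drop = FALSE]
```

This collapses the duplicate at rt = 10 down to its 4e4 centroid, leaving exactly one row per scan.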

Thoughts?
--Chris