Messages - chandlerjd

1
XCMS Online / Setting mzwid for high res data
I am noticing some irregularities in my EICs from XCMS Online using the Orbitrap II HPLC setting for LTQ-Velos data. There are several noisy "peaks" (just baseline noise) as well as "half peaks" (integration stops near the apex of a peak). 
I looked through the method settings but wasn't really sure where to start troubleshooting. I did notice that mzwid is set to 0.015, which corresponds to 75 ppm at 200 m/z and 15 ppm at 1000 m/z -- far wider than the mass error I would expect from the Orbitrap. So I turned this setting down to 0.005 and, based on the observed chromatography, also adjusted the max peak width to 50 sec.
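As a quick check of the arithmetic above, here is a minimal base-R sketch that converts an absolute m/z width into ppm at a given m/z and back. The helper names `da_to_ppm` and `ppm_to_da` are made up for illustration and are not part of xcms; the only inputs are the mzwid values and m/z points mentioned in the post.

```r
# Hypothetical helpers (not xcms functions): convert between an absolute
# m/z tolerance in Da and a relative tolerance in ppm at a given m/z.
da_to_ppm <- function(tol_da, mz) tol_da / mz * 1e6
ppm_to_da <- function(tol_ppm, mz) tol_ppm * mz / 1e6

mz <- c(200, 1000)

# Relative width of the default (0.015) and the reduced (0.005) mzwid
ppm_015 <- da_to_ppm(0.015, mz)
ppm_005 <- da_to_ppm(0.005, mz)

print(data.frame(mz = mz, mzwid_0.015_ppm = ppm_015, mzwid_0.005_ppm = ppm_005))

# Going the other way: the absolute width that a fixed 5 ppm window
# would correspond to at each m/z
print(ppm_to_da(5, mz))
```

Because mzwid is an absolute width, any single value is relatively wide at low m/z and relatively narrow at high m/z; the table makes that trade-off visible before picking a value.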

I'm wondering if A) I have misinterpreted what mzwid is doing and B) if there are other features the community can suggest to improve peak integration. I can live with noise peaks as those are (relatively) easy to ID and remove from the final feature table. However the "half peaks" are not as easy to spot without close EIC inspection and any help getting rid of them would be appreciated. 
2
XCMS / Re: Peak filling - an example of strange results
Thanks for your reply!
I would start by troubleshooting the group step. It can be either the m/z or the rt dimension that goes wrong. If the peaks have long tails or your parameters are bad the peaks could get cut up. Or you could have small peaks you don't see without zooming specifically.
It would be easier if you posted parameters for your processing.

Here are the parameters I used, any insight about anything I could be doing wrong would be helpful:
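# Peak detection with centWave (the signal-to-noise argument is spelled snthresh)
xset <- xcmsSet(method="centWave", ppm=2.5, peakwidth=c(10,60), snthresh=10, mzdiff=-0.001, noise=1000, prefilter=c(3,5000))
# First grouping, retention time correction, then regrouping
xset <- group(xset, method="density", bw=5, mzwid=0.015, max=100, minfrac=1)
xset2 <- retcor(xset, method="obiwarp", plottype="deviation")
xset2 <- group(xset2, method="density", bw=5, mzwid=0.015, max=100, minfrac=0.5)
# Fill in missing peaks (the function name is fillPeaks; R is case-sensitive)
xset3 <- fillPeaks(object=xset2, method="chrom")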

Another problem with Orbitrap data is that you can get so-called shoulder peaks or satellite peaks in the m/z dimension. If you are affected by that (look at the mass peak and zoom in at low intensity around the peak; do you see some very small noise-like mass peaks around the real peak?), you can use xcmsRaw.orbifilter from https://github.com/stanstrup/chemhelper to filter the raw data before analysis. But I don't think it is that. That usually creates a lot of peaks in the table with slightly different masses.

I really don't see anything I would deem a peak. Is Xcalibur adequate to look for this? Why is Orbi data prone to the shoulder peaks compared to other types of MS?

It looks strange that you say there is a peak at 140 s and the closest it finds is 130 sec...

I misspoke. It's really more like 133 sec in the sample file I looked at. So 130 is reasonable.

I am not sure what output table you are talking about. What function generates it? If you use the one that calculates statistics I think it is ordered by p-value (not sure, never use it).

I don't generate the stat info either. After xset3 is made, I generate a text file from the peakTable function with xset3.

fTab<-peakTable(object=xset3)
write.table(fTab, name, sep="\t", row.names=FALSE)
3
XCMS / Peak filling - an example of strange results
I have posted on this topic before, but hoping to push the conversation ahead by framing the subject a little differently.

I'm extracting lung tissue data from C18 with positive ESI on an Orbitrap. I often look at methionine's behavior, including how XCMS or similar software integrates it, compared to Thermo's Xcalibur software. This is because methionine is usually a "model metabolite" on our platform.

For this particular run, there are two "methionine" peaks that are 2-3 ppm off of the monoisotopic mass (plus the proton). One elutes sharply at 60 sec, the other a bit more broadly at 140 sec. Now for the interesting part: XCMS integrated this as 4 peaks. In my output table, they go in this order:

150.0588mz_130sec (npeaks=48 in 48 samples)
150.0588mz_88sec (npeaks=54 in 48 samples)
150.0587mz_106sec (npeaks=42 in 42 samples)
150.0587mz_66sec (npeaks=37 in 37 samples)

So first of all, I'm not sure why they go in this order in a column that is otherwise in ascending m/z order. Is it ordered by SNR or another metric? It can't be by m/z value, as they break the overall ascending trend in the column.

Second, which of these features are really the peaks of interest?

Finally, what am I doing wrong when I use XCMS to get twice as many peaks? Right now I am troubleshooting whether it is my setting of mzdiff=-0.001 (I am not sure where I came up with that value in the first place), and trying out -0.00005 instead.

Any help is greatly appreciated!
4
XCMS / Re: Getting peaks filled in that don't seem to be there...
It has been a while since I have replied (sorry), but I would tend to agree with you, Jan, that a higher SNR does prune some of these false features out. But this is a very fine line for discovery metabolomics. However, setting minfrac around 0.5 or so seems to kill off a large chunk of the false features. Strangely, some will still survive. I am not sure what other parameters to play with in hoping to address this. As I have time in the near future, and now that the forum is back up, I will post my full script with a model data set showing the phenomenon and see if we can make progress.

It may be relevant to point out that I really don't know the best way to determine what mzdiff ought to be on a high resolution instrument such as the HF QExactive. If mzdiff=-0.001, that's allowing two features to overlap by 10 ppm (@ 100.0000 expected m/z) and still be detected... Which seems quite liberal, but as the parameter doesn't scale with m/z (i.e., it is absolute with the units), larger masses will be detected more conservatively. Sure enough, I see *most* of the false features in the low range (85-200 m/z). Does this suggest mzdiff may be the culprit? I have not had time to test it yet, but I will try to spend a bit of time on it later and report back...
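To put numbers on how an absolute mzdiff scales across the mass range, here is a small base-R sketch. It only restates the arithmetic in the paragraph above (0.001 Da expressed in ppm at several m/z values) and makes no claim about how centWave applies mzdiff internally; the m/z grid is an arbitrary choice for illustration.

```r
# Magnitude of mzdiff = -0.001 expressed as a relative tolerance (ppm)
mzdiff_abs <- 0.001

# Arbitrary illustrative m/z values, including the 85-200 range
# where most of the false features were seen
mz <- c(85, 100, 200, 500, 1000)

overlap_ppm <- mzdiff_abs / mz * 1e6

print(data.frame(mz = mz, overlap_ppm = round(overlap_ppm, 2)))
```

The relative tolerance falls steadily as m/z rises, so a fixed absolute value is at its most permissive in exactly the low-mass region described above.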
5
Sample preparation / Re: Protecting -SH groups
I realize this topic is old, but the only way to preserve the redox state of the biological material for MS is to trap the reduced thiols with an alkylating agent. Different groups have employed NEM or IAM for this while the sample is still aqueous, or after recovering the sample from acidified conditions (low pH will lead to protonation of the overwhelming majority of thiol groups, especially free glutathione and Cys). This procedure would modify all accessible thiol groups at relatively similar rates, varying predominantly based on the sulfur pKa (it gets more complicated if you care about protein Cys -- denaturation is crucial for non-selective labeling).

I have not seen a method to do this sort of procedure built into a larger untargeted platform, but multiple methods that utilize LC-MS quantification of multiple glutathione and/or Cys species after thiol alkylation have been published within the last decade.

One consideration with any alkylating reagent will be side reactions (applies more so to IAM) and modifications of the adducted species (applies mostly to NEM). Some have also observed that IAM will degrade to iodine in storage, leading to artifactual reduction in some cases.

Lastly, I will add that I see clear evidence of artifactual oxidation of glutathione in acetonitrile preparations, but it is nowhere near as complete as it would be in aqueous conditions under the same extract handling.
6
XCMS / Getting peaks filled in that don't seem to be there...
I have had a frustrating issue with XCMS for a while now. Rather than describe it, I've attached a PPTX file. The first slide shows several peaks of the same m/z that have been listed, and the second slide shows a trace of the 10 ppm window around that m/z. As you can see, there are one or maybe two peaks in the trace, but XCMS gave me several more from out of the noise.

How can I fix this? I have HF QExactive data, positive mode, HILIC HPLC. I used centWave and obiwarp.

ppm=2.5
SNR=10
peakwidth=c(10,60)
noise=5000
prefilter=c(3,5000)
mzdiff=-0.001
bw=5

[attachment deleted by admin]
7
XCMS / Re: Extracting blank samples in a discovery batch
A colleague pointed out that many instruments accumulate ions before injection, so blanks may not be quantitatively viable (as blanks with low total ion abundance will accumulate more signal from the background noise). Maybe this means efforts to include them in the xcms extraction are foolhardy (at least, for said instruments). Has anyone here used blanks in discovery experiments?
8
Other / Re: Library construction
We do something like this internally in the lab, basically by hand with a shared set of spreadsheets.

I have also found that database MS-MS spectra at 0 V are great for predicting good matches that have yet to be documented.

Automating this process for large numbers of metabolites is not something I would know how to do.
9
XCMS / Re: Extracting blank samples in a discovery batch
Jan, thanks for your reply. First let me apologize for my naivete as I am far more familiar with multi-parameter apLCMS and xMSanalyzer. I am coming to like the performance of xcms quite a lot but still trying to figure out the nooks and crannies.

Quote from: "Jan Stanstrup"
The answer depends on the settings for the group function (?group.density). The interesting parameters are minfrac and minsamp. Remember that they are set "per group". So it depends how you have grouped your data (sample group, not feature group... Usually you have grouped your samples/files by putting it in different sub-folders if not all in one group/folder.)

I did not appreciate what group.density is doing in xcms. Let me see if I understand how to use these. (For the record, I have never worked with data by grouping into sub-folders. I usually extract an entire directory into one matrix, just like .CEL files into an Affymetrix Robust Multi-Array. If this is not the standard approach with xcms, I may need additional education.)

If minfrac is defined as 'minimum fraction of samples necessary in at least one of the sample groups for it to be a valid group', I suppose you would want to set minfrac = # of samples within the smallest group / # of total samples. So if you had 5 cases, 5 controls and 2 water blanks, would you ignore the water blanks and set minfrac to 5/12 (round down to 0.40), or include the blanks as a group and use 2/12 (round down to 0.15)? I am assuming all files are in the same sub-folder and extracted together. Or are you suggesting instead to extract samples per condition-group, so you'd extract the 5 cases, 5 controls and 2 blanks as 3 data matrices? But then how to combine them...?

As for minsamp, 'minimum number of samples necessary in at least one of the sample groups for it to be a valid group', is the purpose of this to allow xcms to define groups? In my above example, would I use 2/12 or 5/12?
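One way to reason about the two quoted definitions is to apply them literally, per sample group, to the 5 cases / 5 controls / 2 blanks example. The base-R sketch below does that. `is_valid_group` is a hypothetical helper written for this illustration, not an xcms function, and it assumes the "per group" reading from the quote: a feature is kept if at least one sample group has peaks in at least minfrac of its own samples and in at least minsamp samples.

```r
# Hypothetical helper (not part of xcms): literal, per-sample-group reading
# of the quoted minfrac/minsamp definitions.
is_valid_group <- function(peak_counts, class_sizes, minfrac = 0.5, minsamp = 1) {
  any(peak_counts >= class_sizes * minfrac & peak_counts >= minsamp)
}

# Three sample groups, as if the files were split into sub-folders
class_sizes <- c(case = 5, control = 5, blank = 2)

# Peak found in 3 of 5 cases only: 3 >= 5 * 0.5, so it is kept
kept_cases_only <- is_valid_group(c(case = 3, control = 0, blank = 0), class_sizes)

# Peak found in 2 cases and 2 controls: no single group reaches 50%
kept_spread_thin <- is_valid_group(c(case = 2, control = 2, blank = 0), class_sizes)

# Everything in one folder behaves like one group of 12 samples:
# a peak present in all 5 cases (5/12) then fails minfrac = 0.5
kept_single_group <- is_valid_group(5, 12)

print(c(cases_only = kept_cases_only,
        spread_thin = kept_spread_thin,
        single_group = kept_single_group))
```

Under this reading, minfrac is a fraction of each group's own size rather than of the total sample count, so the group sizes do not need to be folded into the value by hand; with a single folder, the threshold applies to all 12 files at once.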
10
XCMS / Best credential feature recovery from LC-MS with QExactive
I'm curious if anyone using an LC-coupled QExactive workflow has a set of XCMS parameters they are happy with for recovering features that could ultimately be credentialed (to borrow terminology from Mahieu et al*). This paper has published some optimized parameters for XCMS, but I wonder if they need to be adjusted for the QExactive and if so, how (lowering ppm window significantly is one guess). I also have to bear in mind that my experiment won't involve isotopic credentialing, so the optimization in the paper may not be appropriate for a less initially rigorous discovery platform. I am much more used to working with apLCMS than XCMS, so I apologize if I am treading on old territory with this question.

*Credentialing Features: A Platform to Benchmark and Optimize Untargeted Metabolomic Methods, NG Mahieu et al, Anal Chem 2014
11
XCMS / Extracting blank samples in a discovery batch
Assume you run a study with plasma samples and include a water sample as a blank for a discovery metabolomics experiment. Will that blank influence the m/z features detected? I have noticed that XCMS rarely outputs zeros in the resulting data matrix, and I wonder if this is because it is integrating noise in an m/z window for missing values or because it is ignoring features with missing values. In other words, I am asking what will happen if the plasma samples have a peak that passes the SNR threshold and other criteria, but the water blank doesn't have this peak. Will the feature be ignored, or will XCMS try to integrate the water sample's missing value?
12
XCMS / Re: truncated m/z in mummichog files
Is the data you collected really accurate to 7 decimal places for m/z? I believe 4 is the current standard for the highest resolution instruments (someone correct me if I'm wrong). Everything in mummichog's main reference database, metafish network, is set at 4 decimal places for m/z as well.