
Correct settings for GNPS/FBMN from MSe data

Hello Community,

I was very excited to find out yesterday that MS-DIAL can be used to process MSe data for feature based molecular networking on GNPS. However, I am a little confused about how to set the correct data processing parameters. I have read through the MS-DIAL tutorial (https://mtbinfo-team.github.io/mtbinfo.github.io/MS-DIAL/tutorial), especially chapter 8. I am also using this paper as a guide: https://doi.org/10.1016/j.foodchem.2019.05.099

First of all, here is what I'm working with:
--Waters .RAW files, converted to mzML or ABF
--Acquired using Waters Xevo G2 QTOF in MSe mode
--ESI (negative ionization)
--Function 1: low collision energy (6V); centroided
--Function 2: high collision energy ramp (20-50V); centroided
--Function 3: lockmass
--mass range: 100-1500 Da (both low and high collision energy functions)
--MS-DIAL v4.12

I'm the most unclear about which "MS method type" to use and how to set up the Experiment file. According to the tutorial (https://mtbinfo-team.github.io/mtbinfo.github.io/MS-DIAL/tutorial#section-8-1), I should be using ‘All-ions with multiple CEs’ and set up the experiment file something like:
ID   MS Type   Start m/z   End m/z   Name   CollisionEnergy   DecTarget(1:Yes, 0:No)
0   SCAN   100   1500   MS1   6   1
1   ALL   100   1500   MS2   20   1
2   ALL   100   1500   MS2   50   1

However, this does not quite make sense, because the 20V and 50V modes are just the two bounds of the ramp. The entire ramp is collected as a single data stream (function 2), not as two separate streams.

An earlier section of the tutorial (https://mtbinfo-team.github.io/mtbinfo.github.io/MS-DIAL/tutorial#section-1-4) also shows "MSE" as an option for MS Type, which makes more sense to me. However, this example only has the first 4 columns of the experiment file: (i.e. ID,   MS Type,   Start m/z,   End m/z). This would suggest that the MS method type should be set to "SWATH-MS or conventional All-ions method".

I am also unclear on how to set up the "DecTarget" part of the experiment file. My chromatograms have many closely-eluting peaks, so I think I need to do deconvolution. Would I need to set DecTarget = 1 for all of the lines in the experiment file? Or just the line corresponding to the low energy channel (i.e. function 1)?

In summary, my questions are:
1) How should I set up the experiment file correctly? Does each line in this file correspond to a data channel (e.g. function 1)?
2) Which "MS method type" should I select?
3) Given what I've described about my data and goals, are there any other data processing parameters that I should pay particular attention to? How about data file conversion parameters?

Thank you for all of your help. I'm very excited by the prospect of being able to do FBMN with our old MSe data!

Taylan


Re: Correct settings for GNPS/FBMN from MSe data

Reply #1
Dear Taylan,

As far as I understand your data, please use the settings below.
1. Capture.PNG shows an example of the startup project window for your data.
Note: "SWATH-MS or conventional All-ions method" is fine for your data (although you could in fact use either method).
The "All-ions with multiple CEs" setting is intended for methods that use three or more collision energies for all-ions acquisition, such as:
0   SCAN   100   1500   MS1   6   1
1   ALL   100   1500   MS2   20   1
2   ALL   100   1500   MS2   50   1
3   ALL   100   1500   MS2   70   1
However, your data were acquired with two CE settings, so the experiment file should contain the following information (attached as experimental_file.txt).

ID   MS type   Start mz   End mz
0   SCAN   100   1500
1   MSE   100   1500

The other columns, such as Name and CollisionEnergy, are not used with the "SWATH-MS or conventional All-ions method" option.
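
For anyone who prefers to generate the file from a script, here is a minimal R sketch that writes the two-row table above. It assumes the experiment file is tab-delimited plain text; the output location (a temporary directory) and the variable names are only illustrative.

```r
# Two acquisition channels: low-energy scan (function 1) and MSE (function 2).
experiment <- data.frame(
  id       = c(0L, 1L),
  ms_type  = c("SCAN", "MSE"),
  start_mz = c(100, 100),
  end_mz   = c(1500, 1500)
)

# Header text copied from the reply above; tab separation is an assumption.
header   <- c("ID", "MS type", "Start mz", "End mz")
out_path <- file.path(tempdir(), "experimental_file.txt")

write.table(experiment, file = out_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = header)

# Print the file back to check it by eye before loading it into MS-DIAL.
cat(readLines(out_path), sep = "\n")
```

The lockmass function (function 3) is deliberately left out, since the table in the reply lists only the two data channels.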

2. Please use the ABF file format for your analysis. (I personally have never checked the processing with mzML.)
3. I recommend optimizing (A) the minimum peak height in the Peak detection tab and (B) the MS/MS abundance cut off in the MS2Dec tab so that your data are processed quickly. In particular, check the baseline of the noise signals. In my experience with Synapt G2 MSE data, the noise baseline is around 10^5 - 10^6, so I used a value of around 10^5 for both (A) and (B). If you use the default settings (1000 for A and zero for B), processing takes many hours because the noisy spectra must also be handled.

Please let me know your update.
Thanks,

Hiroshi

Re: Correct settings for GNPS/FBMN from MSe data

Reply #2
Dear Dr. Tsugawa,

Thank you for your detailed response.

Using the base peak ion chromatograms from blank injections, it looks like the baseline for function 1 (low energy) is around 3,500 and for function 2 (high energy ramp) is around 1,500. So I am thinking about setting A = 5,000 and B = 3,000. Does this sound like a reasonable approach for determining cut-offs?
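
A minimal R sketch of this rule of thumb (take the blank baseline, add a margin, and round up). The function name `suggest_cutoff`, the 1.4 margin, the rounding step, and the example intensities are all illustrative assumptions; they are not MS-DIAL settings or measured data.

```r
# Suggest an intensity cut-off that sits above the noise baseline of a blank run.
# blank_intensities: base peak intensities read from a blank injection (hypothetical here).
# margin:            multiplicative safety factor (assumed value, tune for your data).
# round_to:          round the result up to the next multiple of this value.
suggest_cutoff <- function(blank_intensities, margin = 1.4, round_to = 1000) {
  baseline <- stats::median(blank_intensities)
  ceiling(baseline * margin / round_to) * round_to
}

# Made-up blank readings centred on the baselines quoted above.
cutoff_a <- suggest_cutoff(c(3400, 3500, 3600))  # function 1, low energy
cutoff_b <- suggest_cutoff(c(1400, 1500, 1600))  # function 2, high-energy ramp
print(c(A = cutoff_a, B = cutoff_b))
```

With these inputs the sketch gives the same A = 5,000 and B = 3,000 proposed above; a different margin would of course shift both values.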

Concerning the ABF file converter, do you recommend selecting any of the options for my data? Please see attached screenshots.

Thank you,
Taylan

Re: Correct settings for GNPS/FBMN from MSe data

Reply #3
>>So I am thinking about setting A = 5,000 and B = 3,000. Does this sound like a reasonable approach for determining cut-offs?

These settings should be fantastic.


>>Concerning the ABF file converter, do you recommend selecting any of the options for my data?
You do not have to change the settings. Please use the default settings for MSE data. :D

Hiroshi

Re: Correct settings for GNPS/FBMN from MSe data

Reply #4
Dear Dr. Tsugawa,

Thank you again. Using these settings, I was able to convert and process ~400 files in under 4 hours.

I used MS-DIAL to export the data to GNPS, and I ran FBMN. However, it appears that most of the clusters are composed of MS1 features with very similar retention times (see attached figure, nodes colored by retention time). This makes me think that
1) many of the MS1 features correspond to the same compound AND/OR
2) peaks are not being aligned correctly such that the same feature in different samples is being identified as different features.

My hunch is that #1 is the main factor. If I were using XCMS for pre-processing, I would solve this problem by running CAMERA to deconvolute the features.  Is there a way to do this in MS-DIAL? There is an "MS2Dec" tab in Analysis parameter settings, but I think what I need is to deconvolute the MS1 features. Is there a way to do this? I'm seeing the CorrDec option, but all of the settings appear to be for MS2.

Also, in analysis parameters setting, I selected the option "remove features based on blank information" along with "keep removable features and assign the tag". However, I'm not seeing any obvious columns/tags in the exported gnps table. How can I remove features that were tagged based on blank information?

Many, many thanks,
Taylan

Re: Correct settings for GNPS/FBMN from MSe data

Reply #5
Hi Taylan

To exclude the blank sample features from the data matrix in the GNPS export, you should not tick the option "Keep removable features and assign the tag" in the Alignment tab. Blank features will then be excluded from the peak alignment result.
For a CAMERA-like function, MS-DIAL has the following four annotation procedures, applied as post-curation of the peak detection and alignment results to characterize the peaks.
See also: https://www.nature.com/articles/s41592-019-0358-2
1. Chromatogram-based annotation: peaks having similar chromatographic peak shapes in the same RT area are grouped.
2. Adduct annotation: adduct ions from a metabolite are grouped.
3. MS/MS-based annotation: a precursor ion that is observed as a product ion in the MS/MS spectrum of a higher-m/z precursor is tagged as a tentative in-source fragment ion of that precursor.
4. Alignment-based annotation: the correlation of ion abundances between two alignment spots is calculated for all precursor ion pairs found within ±0.02 min. The spots are assigned the term ‘highly correlated ions’ when the correlation coefficient is greater than 0.9.
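
To make procedure 4 concrete, here is a minimal R sketch of the pairing rule as described: compare every pair of alignment spots within ±0.02 min and keep pairs whose abundance correlation across samples exceeds 0.9. This is not MS-DIAL's implementation; the function name, the use of Pearson correlation, and the toy data are assumptions.

```r
# rt:        retention time (min) of each alignment spot
# abundance: numeric matrix, one row per spot, one column per sample
highly_correlated_pairs <- function(rt, abundance, rt_tol = 0.02, min_r = 0.9) {
  stopifnot(length(rt) == nrow(abundance))
  empty <- data.frame(ID1 = integer(), ID2 = integer(), r = numeric())
  n <- length(rt)
  if (n < 2) return(empty)
  pairs <- list()
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # Only co-eluting spots are candidates.
      if (abs(rt[i] - rt[j]) <= rt_tol) {
        r <- stats::cor(abundance[i, ], abundance[j, ])  # Pearson (assumed)
        if (!is.na(r) && r > min_r) {
          pairs[[length(pairs) + 1]] <- data.frame(ID1 = i, ID2 = j, r = r)
        }
      }
    }
  }
  if (length(pairs) == 0) return(empty)
  do.call(rbind, pairs)
}

# Toy data: three co-eluting spots measured in four samples.
rt <- c(5.00, 5.01, 5.01)
abundance <- rbind(c(10, 20, 30, 40),   # spot 1
                   c(21, 39, 62, 80),   # spot 2 tracks spot 1
                   c(40, 10, 30, 20))   # spot 3 co-elutes but does not track
print(highly_correlated_pairs(rt, abundance))
```

Only spots 1 and 2 are paired here; spot 3 elutes at the same time but its abundance pattern across samples is different, so it is left alone.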

These annotations are also exported as edge files in the GNPS export, and they can be used in the GNPS IINxFBMN platform, which will be supported in the very near future. Meanwhile, could you please check these edge files to reduce the nodes that should share the same MS/MS features?
Also, MS-DIAL MS2Dec cannot provide good MS/MS spectral separation (deconvolution) if the peak tops coincide completely. In such cases, the CorrDec function should be useful: it provides deconvoluted spectra using ion abundance correlations across biological samples. Ipputa Tada, who is responsible for CorrDec, will post more information in this thread.

All the best,

Hiroshi



Re: Correct settings for GNPS/FBMN from MSe data

Reply #6
Dear Taylan,

I am the main developer of Correlation-based Deconvolution (CorrDec).
CorrDec is an MS2 deconvolution method using the intensity correlations between MS1 and MS2 among samples to obtain clean MS2 spectra.
Users can run CorrDec after data processing; please see the following topic:
http://www.metabolomics-forum.com/index.php?topic=1406.msg4146#msg4146

MS-DIAL can show CorrDec spectra and export them to MS-FINDER or as MSP format.
However, currently, CorrDec cannot deconvolute MS1 features, and CorrDec spectra cannot be exported to GNPS.
If you are interested in CorrDec, please try with default parameters.

Best,
Ipputa

Re: Correct settings for GNPS/FBMN from MSe data

Reply #7
Dear Dr. Tsugawa and Dr. Tada,

Thank you for your responses. I switched gears to some other projects. When I return to this one, I will post an update.

Best wishes,
Taylan Morcol

Re: Correct settings for GNPS/FBMN from MSe data

Reply #8
Dear all,

I'm writing with an update. I found the edgelists generated by MS-DIAL as part of the GNPS export. I combined all four of the edgelists into a single list. Then I imported the CSV into R and ran the following script using the "igraph" package to group the nodes into network components based on annotation.

Code:
library(igraph)
# Combined edge list assembled from the MS-DIAL GNPS export
dfr <- read.csv('GnpsEdge_0_20202251139_COMBINED.csv')
edgelist <- dfr[, c("ID1", "ID2")]
# Convert the IDs to character: with integer IDs, the graph constructor treats them
# as vertex indices and creates every vertex from 1 to the maximum ID,
# so artificial nodes would appear in the components output.
edges <- as.character(as.vector(t(edgelist)))
g1 <- make_graph(edges, directed = FALSE)
comp1 <- components(g1)
# One row per node with the annotation-based component it belongs to
comp.membership <- data.frame(node = as.numeric(names(comp1$membership)),
                              annotation.group = as.numeric(comp1$membership))

The idea I had initially was to select one node (at random or by some other criteria) from each of these annotation-based component groups to represent the entire group. Then in Cytoscape I would filter out all other nodes. However, when I tried this manually, I noticed that some of the components in my molecular network were being split up. Apparently this networking thing is more complex than I thought! :-)

Any thoughts on how to approach this?

For now, I'm moving forward without removing nodes based on annotation, and I'll see how it goes. I'll have to deal with this issue at some point down the line, and at least now I know the node groupings based on annotation. I'm curious to see what happens with GNPS IINxFBMN.

Thank you,
Taylan

Re: Correct settings for GNPS/FBMN from MSe data

Reply #9
Hi Taylan:

I think you should not integrate all the edge files at once. Also, I do not recommend using the following edge file for MSE data.
3. MS/MS-based annotation: a precursor ion that is observed as a product ion in the MS/MS spectrum of a higher-m/z precursor is tagged as a tentative in-source fragment ion of that precursor.

To keep it simple, I recommend integrating the following two edge files.
1. Chromatogram-based annotation: peaks having similar chromatographic peak shapes in the same RT area are grouped.
2. Adduct annotation: adduct ions from a metabolite are grouped.
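
A minimal R sketch of this suggestion: merge only the chromatogram-based and adduct edge lists, then group nodes into connected components. It uses a small base-R union-find instead of igraph so that it runs without extra packages. The two data frames are hypothetical stand-ins for the exported edge files, and the ID1/ID2 column names follow the script posted earlier in this thread.

```r
# Hypothetical stand-ins for the two exported edge files.
chrom_edges  <- data.frame(ID1 = c(1, 2),  ID2 = c(2, 3))
adduct_edges <- data.frame(ID1 = c(3, 10), ID2 = c(4, 11))

# Keep only the ID columns and drop duplicate edges.
edges <- unique(rbind(chrom_edges[, c("ID1", "ID2")],
                      adduct_edges[, c("ID1", "ID2")]))

# Assign each node to a connected component using union-find.
annotation_groups <- function(edges) {
  nodes  <- sort(unique(c(edges$ID1, edges$ID2)))
  parent <- seq_along(nodes)          # every node starts as its own root
  find_root <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }
  for (k in seq_len(nrow(edges))) {
    a <- find_root(match(edges$ID1[k], nodes))
    b <- find_root(match(edges$ID2[k], nodes))
    if (a != b) parent[max(a, b)] <- min(a, b)  # merge the two components
  }
  roots <- vapply(seq_along(nodes), find_root, integer(1))
  data.frame(node = nodes, annotation.group = match(roots, unique(roots)))
}

groups <- annotation_groups(edges)
print(groups)
```

Nodes 1 to 4 end up in one group and nodes 10 and 11 in another. For real data, replace the two stand-in data frames with the corresponding edge files read via read.csv(). This only controls which annotations define the groups; it does not by itself prevent molecular-network components from being split when nodes are filtered.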

GNPS IINxFBMN group also imports one of these files separately.
Thanks,

Hiroshi