Topics - Sergey Girel

1
MS-DIAL / Retention times for lipids in Fiehn articles and MS-DIAL
Hi all,

Should the RTs for lipids in MSDIAL-TandemMassSpectralAtlas-VS68-Pos.msp (latest version) correspond to those published in https://doi.org/10.1021/acs.analchem.7b03404?

The Fiehn article states an RT of 2.23 min for LPC 18:0, and I detect it at a similar time with his method. The MSP file above, however, lists an RT of 6.75.

If this is correct and the MSP retention times are meant to be like this, then I have two questions:

1) Does the lipidomics processing in MS-DIAL 2.xx use the Fiehn RTs?
2) How were the RTs in VS68 derived: by measurement or in silico?

Thx in advance for clues!
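
For anyone who wants to check such a discrepancy themselves, here is a minimal sketch that pulls the retention time of a named entry out of an MSP text library. It assumes the usual plain-text MSP layout with "KEY: value" header lines (NAME, PRECURSORTYPE, RETENTIONTIME) followed by peak rows; the exact field names are an assumption and may differ between library versions. The record in EXAMPLE_MSP is a toy example that only reuses the 6.75 value quoted above; it is not copied from the VS68 file.

```python
from typing import Dict, Iterator, List, Tuple


def parse_msp(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict of header fields per MSP record (blank-line separated)."""
    record: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current record.
            if record:
                yield record
                record = {}
            continue
        # Header lines start with a letter; peak rows start with a digit.
        if line[0].isalpha() and ":" in line:
            key, value = line.split(":", 1)  # split once, so "LPC 18:0" survives
            key = key.strip().upper()
            if key == "NAME" and record:
                # New record started without a blank separator line.
                yield record
                record = {}
            record[key] = value.strip()
    if record:
        yield record


def retention_times(text: str, name_fragment: str) -> List[Tuple[str, str, float]]:
    """Return (name, adduct, RT) for every record whose NAME contains the fragment."""
    hits = []
    for rec in parse_msp(text):
        name = rec.get("NAME", "")
        if name_fragment.lower() in name.lower() and "RETENTIONTIME" in rec:
            hits.append((name, rec.get("PRECURSORTYPE", ""), float(rec["RETENTIONTIME"])))
    return hits


# Toy library text (hypothetical records, field names assumed).
EXAMPLE_MSP = """NAME: LPC 18:0
PRECURSORTYPE: [M+H]+
RETENTIONTIME: 6.75
Num Peaks: 1
184.0733\t100

NAME: Toy lipid
PRECURSORTYPE: [M+H]+
RETENTIONTIME: 1.00
Num Peaks: 1
100.0\t50
"""

if __name__ == "__main__":
    for name, adduct, rt in retention_times(EXAMPLE_MSP, "LPC 18:0"):
        print(name, adduct, rt)
```

With a real library, the file contents would be read into a string first and passed to retention_times() in place of EXAMPLE_MSP.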
2
MS-DIAL / Post-identification issues 2
Hello, guys!

I've noticed something odd with post-identification. What I want is the ability to identify all possible adducts for my metabolite library. So the postIdent .txt file has a separate entry for each adduct of each metabolite, like this:

18-Hydroxycortisol	379.21152	6.22	[M+H]+		C21H30O6
18-Hydroxycortisol	401.19346	6.22	[M+Na]+		C21H30O6
18-Hydroxycortisol	417.1674	6.22	[M+K]+		C21H30O6
18-Hydroxycortisol	396.23807	6.22	[M+NH4]+		C21H30O6
18-Hydroxycortisol	361.20204	6.22	[M+H-H2O]+		C21H30O6
18-Hydroxycortisol	348.19148	6.22	[M+H-2H2O]+		C21H30O6
18-Hydroxycortisol	383.1828967	6.22	[M+Na-H2O]+		C21H30O6

The MS-DIAL output is attached. All of these masses are wrongly assigned the [M+H]+ adduct.

How can I get around this?

Thx in advance!
SG
3
MS-DIAL / Processing of big datasets
Hello, community!

I have almost managed to push a set of 385 Q-ToF measurements (around 200 GB of centroid data :-) through MS-DIAL v. 4.24. So far it has finished gap filling (which took ca. 90 h) and has been doing something in limbo for another 24 h. The app is responding, something is written to the HDD from time to time, and there is some activity in RAM, though quite little. Only one or two of the 16 cores hit 100% load from time to time. I don't know when it will be done, but I'm already excited to see the results.

In this run, one of our in-house databases was used for metabolite identification. However, we have had another idea and would like to check against a different database. Is it possible to do another post-identification run on the existing ion table without going through the whole sequence of peak picking / alignment / gap filling / finalization again?

IIRC, MS-DIAL always does this when a new alignment instance is created, and I guess it would take another week if there is no way around it. Another important question arises: say I suddenly get a power outage during the gap-filling stage. Is there any way to resume from the point where everything stopped?
4
MS-DIAL / Profile data handling by MS-DIAL (orbi)
While working on annotation methodology, I've stumbled upon an issue with data handling. For instance, we have some big names doing some good metabolomics (10.1021/acs.analchem.8b04698). The data in that work were acquired on a good instrument with good resolution and mass accuracy (I slightly doubt it was actually a stable 0.1 mDa, which is quite optimistic :-), in centroid mode, and without any justification. Or perhaps I was unable to trace the explanation back to the group's previous works; that is also possible.

Centroid data are supposed to be equal in quality to data collected in profile mode, and the only reason to use centroids should be data volume reduction. However, when processing the data we see, and can perfectly reproduce, the following: profile data produce fewer features than centroided data, whether the software used is MS-DIAL or ProgenesisQI.

Workflow is standard (for MS-DIAL):

Acquisition -> MSConvert if prof. -> ABF converter -> MS-DIAL (full scan tolerance 0.3mDa) -> ... -> Data matrix for statistics
prof. / cent.         prof.->cent.

Sample: human plasma PP; identification/annotation via in-house DB.

There is a small difference depending on whether we produce centroids on the fly with the instrument or with the Thermo MSFileReader library. They adjust the algorithms slightly with each update, and a new RawFileReader API was introduced recently. But it's negligible.

The big issue is that we get 280 features in profile mode against 350 using centroids. Manual curation reduces the numbers to 220/250, so the difference is still >10%. The devil must be somewhere in the details. I assume the ABF converter simply extracts the MS data array from .RAW files using the Thermo API. In profile mode, that should simply be data points per scan; no problem here. Then MS-DIAL performs centroiding on its own.

The big question is how MS-DIAL does the job: is it the same noise estimation + slicing algorithm used in EIC chromatogram extraction, or something else?

Otherwise, centroiding picks up all the shoulders and distorted peaks, creating too much garbage, which reduces the S/N ratio, as stated in Progenesis (with a recommendation to use profile data). I cannot say what happens inside Progenesis, as it is a proprietary black box, but more or less the same thing is observed there too.

P.S. I prefer profile data for different, but not completely unrelated, reasons (FTMS research).
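
To make the question concrete, here is a generic sketch of one very simple way to centroid a profile spectrum: contiguous points above a noise threshold are grouped, and each group is collapsed to an intensity-weighted m/z plus its apex intensity. This is only an illustration of the general technique and an assumption on my side; it is not claimed to be what MS-DIAL, the ABF converter, or Progenesis actually do. The function name, the fixed noise threshold, and the toy data are all hypothetical.

```python
from typing import List, Tuple


def centroid_profile(
    mz: List[float], intensity: List[float], noise: float
) -> List[Tuple[float, float]]:
    """Collapse each contiguous run of points above `noise` into one centroid.

    Returns a list of (intensity-weighted m/z, apex intensity).
    Assumes noise >= 0 and that mz and intensity have the same length.
    """
    peaks: List[Tuple[float, float]] = []
    start = None
    # A trailing 0.0 sentinel closes a run that reaches the end of the scan.
    for i, y in enumerate(list(intensity) + [0.0]):
        if y > noise and start is None:
            start = i  # a run above the threshold begins here
        elif y <= noise and start is not None:
            xs, ys = mz[start:i], intensity[start:i]
            total = sum(ys)
            weighted_mz = sum(x * w for x, w in zip(xs, ys)) / total
            peaks.append((weighted_mz, max(ys)))
            start = None
    return peaks


if __name__ == "__main__":
    # Toy profile scan: one three-point peak and one single-point peak.
    toy_mz = [100.000, 100.001, 100.002, 100.003, 100.004, 100.010, 100.011, 100.012]
    toy_int = [0.0, 50.0, 100.0, 50.0, 0.0, 0.0, 40.0, 0.0]
    for centroid in centroid_profile(toy_mz, toy_int, noise=10.0):
        print(centroid)
```

Note the design choice: a shoulder that never drops below the threshold is merged into its neighbour here, whereas a picker based on local maxima would emit it as a separate centroid. Which of these behaviours a given tool uses is exactly the kind of detail that could change the feature count.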
5
MS-DIAL / Post-identification issues
Hi, Hiroshi! Many thanks for MS-DIAL and all the work you guys do on it!

I have a couple of questions regarding post-identification with a tab-separated file:

1) Does the current version of MS-DIAL have a limit on the size of the post-identification tab-separated text file? We are doing metabolomics on endogenous metabolites. At some point an in-house database is used, which contains around 800 entries for a tailored identification process. When I try to load the complete list, I get a "Loading libraries..." window after pressing the "Finish" button, which disappears almost immediately, and nothing happens. When I reduced the list to around 300 entries, everything went well.

2) Can Formula/InChI be included in the post-identification library so that they are displayed in the Basic Peak Property/Compound detail tabs?

3) As far as I understand, for post-identification you need to enter the masses of the ionized species, right? So post-identification is then only possible for one peak with the selected adduct. This can be overcome by having 5-6 rows with different probable adducts for a single compound, but that again runs into the size limit problem.

Will it be possible in the future to use neutral masses in post-identification, so that the program searches through all combinations based on the selected adducts?
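
Until something like that exists, the adduct rows can be generated outside the program. Below is a minimal sketch of the idea, assuming singly charged positive adducts and standard monoisotopic element masses; the adduct table, function names, and the small element list are my own illustration, not anything taken from MS-DIAL.

```python
import re
from typing import Dict, List, Tuple

# Monoisotopic masses (u) of the elements needed for this example.
MONO: Dict[str, float] = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
    "Na": 22.9897692820,
    "K": 38.9637064864,
}
ELECTRON = 0.000548579909

# Hypothetical adduct table: name -> (formula gained, formula lost), charge +1.
ADDUCTS: Dict[str, Tuple[str, str]] = {
    "[M+H]+": ("H", ""),
    "[M+Na]+": ("Na", ""),
    "[M+K]+": ("K", ""),
    "[M+NH4]+": ("NH4", ""),
    "[M+H-H2O]+": ("H", "H2O"),
    "[M+H-2H2O]+": ("H", "H4O2"),
    "[M+Na-H2O]+": ("Na", "H2O"),
}


def formula_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C21H30O6' (no brackets)."""
    return sum(
        MONO[element] * int(count or 1)
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    )


def adduct_mz(formula: str, adduct: str) -> float:
    """m/z of a singly charged positive adduct of the neutral molecule."""
    gained, lost = ADDUCTS[adduct]
    # One electron is removed to give the +1 charge.
    return formula_mass(formula) + formula_mass(gained) - formula_mass(lost) - ELECTRON


def adduct_rows(formula: str) -> List[Tuple[str, float]]:
    """All (adduct, m/z) pairs from the table for one neutral formula."""
    return [(name, adduct_mz(formula, name)) for name in ADDUCTS]


if __name__ == "__main__":
    # [M+H]+ for C21H30O6 comes out near 379.2115.
    for name, mz in adduct_rows("C21H30O6"):
        print(f"{name}\t{mz:.5f}")
```

Each (adduct, m/z) pair can then be written out as one row of the tab-separated library, next to the compound name and RT.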

Cheers and take care,
SG