Metabolomics Society Forum

Software => MS-DIAL => Topic started by: Zhang.6517 on December 30, 2019, 08:18:24 PM

Title: Questions about lipidomics database setup
Post by: Zhang.6517 on December 30, 2019, 08:18:24 PM
Hi, I want to ask a few questions about the databases used by MS-DIAL. First, what is the coverage of the internal database for lipidomics? In other words, which database does MS-DIAL use internally for lipidomics processing?
Secondly, what can we do if we want to enrich the database we search against? We have downloaded some MSP files from the website, but we can't figure out how to load them into MS-DIAL.
Title: Re: Questions about lipidomics database setup
Post by: biswapriya on December 31, 2019, 08:53:48 AM
Hi Zhang,

I can try to answer the 2 questions:

[1] If you go to the MS-DIAL page (http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/), you can find information on the internal lipid DB, described as follows:

...
LipidBlast fork (Last edited on December 26, 2019)
Currently, MS-DIAL internally has in silico MS/MS spectra for lipid identifications. Below are the LipidBlast (fork) templates that MS-DIAL partially uses.
Binary file (lbm2) for MS-DIAL lipidomics (adjusted Oliver Fiehn lab LC-MS method): Download
Binary file (lbm2) for MS-DIAL lipidomics (adjusted Makoto Arita lab LC-MS method): Download
Binary file (lbm2) for MS-DIAL lipidomics (adjusted Kazuki Saito lab LC-MS method): Download
LipidBlast template for glycerolipids.
LipidBlast template for sphingolipids.
These libraries are also available as MSP format: Positive (32 class, 110,833 molecules, 143,342 spectra) and Negative (48 class, 154,770 molecules, 342,454 spectra).
The original LipidBlast is available from here.
The nomenclature for lipid classes in MS-DIAL lipidomics is shown at 'Lipid nomenclature in MS-DIAL lipidomics'.
...
[2] To enrich the library or add more .msp files/spectra: Hiroshi taught me how to do this using a .txt file, and I have written it up as a freely accessible protocols.io method. You can follow the steps there to generate your own library (by adding the new .msp records to the existing spectral libraries/DBs): https://www.protocols.io/view/steps-for-building-an-open-source-ei-ms-mass-spect-8txhwpn

Let me know if this is what you were looking for!

Please correct me, Hiroshi, if anything is incorrect or amiss.

Thanks again,
Biswa
Title: Re: Questions about lipidomics database setup
Post by: Hiroshi Tsugawa on January 01, 2020, 07:59:09 PM
Hi Zhang and Biswa,

Thanks, Biswa. Yes, that is true for the 'metabolomics' project of MS-DIAL, but adding user-defined spectra to a lipidomics project is a bit tricky.

The MS-DIAL lipidomics project currently uses the LBM2 binary format, a serialized form of the ASCII MSP format, for two reasons:
(1) to reduce the file size so that the library file loads quickly (as plain ASCII, the current library would exceed 1 GB), and
(2) to keep the contents private until publication.

However, MS-DIAL can also import the older "LBM" ASCII format file in a lipidomics project.
I have uploaded the publicly available LBM spectral information to the section of the following page that Biswa introduced.
http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/index.html

LipidBlast fork (Last edited on January 2, 2020)
ASCII file (lbm) for MS-DIAL lipidomics (adjusted Oliver Fiehn lab LC-MS method): Download

Please note that the LBM file does not contain the full information of LBM2; the public LBM holds roughly one third of the LBM2 content. Sorry, but this is important for keeping our priority in the lipidomics research field. (I will release the full spectra after our latest paper is published.)
BTW, the content of LBM2 is summarized at http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/index5.html.
*Note that not all of the lipid subclasses described on the above page can be used in MS-DIAL yet. I will release them when my latest paper is published.

Here is a summary of how to modify the LBM file and how to use it in a lipidomics project.
1. Remove the original LBM2 file from the MS-DIAL folder. (The program accepts only one LBM/LBM2 file in the MS-DIAL folder.)
2. Open the LBM file in a text editor such as WordPad or Notepad++. (The public LBM file is already very large, so Notepad++ may fail to open it. In that case, you have to use UNIX commands or similar.)
3. Paste your own spectral records into the LBM file, each with the "COMPOUNDCLASS: Others" field.
*If you put "Others" in the CompoundClass field, MS-DIAL imports all of your records automatically, so you can simply add MSP-formatted ASCII records to the LBM file. Records containing "COMPOUNDCLASS: Others" are annotated with the classical spectral matching algorithm (dot product, reverse dot product, and fragment presence percentage). The rest of the lipidomics project uses a hybrid scoring method that combines this classical similarity with a decision tree algorithm, which reports the suitable lipid structure information based on which characteristic fragments are present.
4. Save the file. You can then use your customized spectral records in the lipidomics project.
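
For files too large for a text editor, steps 2 to 4 can be sketched on the command line. This is only a rough illustration, not an official MS-DIAL tool: the function names and file names are hypothetical, and placing the COMPOUNDCLASS line directly after NAME is an assumption about field order that MS-DIAL may or may not care about.

```shell
#!/usr/bin/env bash
# Sketch only. File names, and placing COMPOUNDCLASS right after NAME,
# are assumptions for illustration, not documented MS-DIAL requirements.

# Print the MSP file $1 with every record tagged "COMPOUNDCLASS: Others".
tag_records_as_others() {
  awk '
    { sub(/\r$/, "") }                          # tolerate Windows line endings
    toupper($0) ~ /^COMPOUNDCLASS:/ { next }    # drop any existing class line
    { print; last = $0 }
    toupper($0) ~ /^NAME:/ { print "COMPOUNDCLASS: Others" }
    END { if (NR && last != "") print "" }      # end with a blank separator
  ' "$1"
}

# Write $3 = LBM file $2 followed by the tagged records from MSP file $1.
# The downloaded LBM itself is left unmodified.
append_to_lbm() {
  local msp=$1 lbm=$2 out=$3
  {
    # Copy the LBM, adding a blank separator if it does not end with one.
    awk '{ print; last = $0 } END { if (NR && last !~ /^\r?$/) print "" }' "$lbm"
    tag_records_as_others "$msp"
  } > "$out"
}
```

A call might look like `append_to_lbm my_spectra.msp public.lbm custom.lbm` (all three names are placeholders). The resulting custom.lbm would then be the only LBM/LBM2 file left in the MS-DIAL folder, as step 1 requires.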

BTW, you can check the contents of the public LBM file with UNIX commands such as:
For checking lipid subclasses included:  grep "COMPOUNDCLASS: " **.lbm | sort | uniq
For checking all lipid species included: grep "NAME: " **.lbm
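
If counts are more useful than plain listings, the same commands can be extended a little. The sketch below assumes one field per line starting at the line beginning (as in MSP) and one NAME line per record; the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: pass the path of your LBM file as the first argument.
# Patterns are anchored to the line start, assuming one field per line.

# Number of records per lipid subclass, largest first.
count_records_per_class() {
  grep -h "^COMPOUNDCLASS: " "$1" | tr -d '\r' | sort | uniq -c | sort -rn
}

# Total number of records, assuming one NAME line per record.
count_records() {
  grep -c "^NAME: " "$1"
}
```

Running `count_records_per_class` before and after adding your own spectra is a quick way to confirm that the new records show up under "Others".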

Thanks,

Hiroshi