Hi everyone,
Does anybody know how to convert the NIST library into an *.MSP file together with its retention index information? It would be a great help to be able to use that *.MSP library for GC-MS feature annotation in MS-DIAL. I have tried lib2nist for this purpose, but no retention index shows up when the resulting file is used in MS-DIAL. I am not sure whether the problem is the *.MSP file itself or that MS-DIAL cannot recognize the field.
We also have a library named W10N14. I am not sure exactly what it is, but I guess it is a combination of the Wiley and NIST libraries. When I convert this library into an *.MSP file, MS-DIAL does recognize the retention index information. The problem is that W10N14.MSP mixes real and estimated retention indices. For high-accuracy annotation, I think it would be better to use the retention index for a specified column type, as NIST MS Search does. However, I do not know how to prepare a library with retention index information for a specified column type. I am wondering whether it is possible to modify the *.MSP file by retrieving the retention index for the specified column type from the NIST website, but unfortunately I do not know how to program.
Any suggestions would be appreciated!
Thanks a lot.
Best regards,
Sukis
Hi Sukis,
MS-DIAL recognizes the following field names for retention index information in an MSP file:
RETENTIONINDEX: **
Retention_index: **
RI: **
(The field names are case-insensitive.)
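To check whether the problem is on the MSP side, a small script can count how many records in the file carry one of these field names. This is only a minimal Python sketch: it assumes that records in the MSP file are separated by blank lines, and the function names `split_records` and `count_ri_records` are made up for illustration.

```python
import re

# Field names listed above as recognized by MS-DIAL; matching ignores case.
RI_FIELD = re.compile(r"^\s*(RETENTIONINDEX|RETENTION_INDEX|RI)\s*:", re.IGNORECASE)


def split_records(msp_text):
    """Split MSP text into records (lists of lines).

    Assumption: records are separated by one or more blank lines.
    """
    records, current = [], []
    for line in msp_text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            records.append(current)
            current = []
    if current:
        records.append(current)
    return records


def count_ri_records(msp_text):
    """Return (records_with_a_recognized_RI_field, total_records)."""
    records = split_records(msp_text)
    with_ri = sum(
        1 for record in records if any(RI_FIELD.match(line) for line in record)
    )
    return with_ri, len(records)
```

Reading the converted library into a string and passing it to `count_ri_records` gives the number of records with a recognized field and the total. If the first number is zero, the file itself lacks a field name that MS-DIAL looks for.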
Unfortunately, lib2nist will probably convert the nist_ri library to MSP with the retention index written in a field like
Synon: UN 1661 (Related)
and MS-DIAL will not read the information from such a field. Also, as you mentioned, nist_ri and some entries of NIST14/17 carry several retention index values from several column types.
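If you can prepare your own table of retention indices for the column type you want, one way around this is to write the value into the MSP file under a field name that MS-DIAL recognizes. The following is a rough Python sketch, not a tested tool: `add_retention_index` and the `ri_by_name` dictionary (compound name to RI) are hypothetical, and it assumes each record starts with a `Name:` line whose text matches the keys of your table exactly.

```python
def add_retention_index(msp_text, ri_by_name):
    """Insert a 'RETENTIONINDEX: <value>' line after each matching 'Name:' line.

    ri_by_name is a hypothetical, user-prepared dict {compound name: RI}
    for the chosen column type. Records whose name is not in the dict
    are left unchanged.
    """
    out = []
    for line in msp_text.splitlines():
        out.append(line)
        # Split only on the first colon so names containing colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "name":
            ri = ri_by_name.get(value.strip())
            if ri is not None:
                out.append(f"RETENTIONINDEX: {ri}")
    return "\n".join(out) + "\n"
```

Note that this sketch does not remove a retention index field that is already present in a record, so it is best applied to a file that has none. Matching by compound name is also fragile; a key such as the CAS number may work better if your records contain it.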
Personally, I built a model to predict the retention index of metabolites:
https://pubs.acs.org/doi/abs/10.1021/acs.analchem.7b01010
Because I was using MeOX-TMS derivatization in GC-MS, I also made a "reactor" that converts a compound structure to its derivatized form:
http://prime.psc.riken.jp/Metabolomics_Software/MetaboloDerivatizer/index.html
I use PaDEL or CDK to generate the compound descriptors. Nowadays, methods such as xgboost, random forest, and ksvm can easily be used to build an RI prediction model, for example in R.
I hope this information is helpful for you.
Hiroshi
Thank you very much Hiroshi, I will take a look.