
InChIKey Generation

Developers,

There seems to be a bug in the InChIKey generation. I'm getting a few InChIKeys that don't match their corresponding metabolites; at least, they return 0 results on PubChem. It looks like they should have an S instead of an N as the third letter from the end. A few examples are listed below.

Name                    MS-DIAL Generated InChIKey     Correct(?) InChIKey
3-Hydroxyvaleric acid   REKYPYSUBKSCAT-UHFFFAOYNA-N    REKYPYSUBKSCAT-UHFFFAOYSA-N
Leucine                 ROHFNLRQFUQHCH-UHFFFAOYNA-N    ROHFNLRQFUQHCH-UHFFFAOYSA-N
Pyroglutamic acid       ODHCTXKNWHHXJC-UHFFFAOYNA-N    ODHCTXKNWHHXJC-UHFFFAOYSA-N
Threonine               AYFVYJQAPQTCCC-UHFFFAOYNA-N    AYFVYJQAPQTCCC-UHFFFAOYSA-N

Could this be easily patched, or is there a workaround? I guess I could use the SMILES instead of the InChIKeys.
Thanks!

 

Re: InChIKey Generation

Reply #1
I have the same problem. My workaround is to use only the first block of the InChIKey plus UHFFFAOYSA-N. Enantiomers (encoded in the second block of the InChIKey) are rarely separated on classic LC-MS anyway.
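
In case it helps, here is a minimal Python sketch of that workaround. The function name flatten_inchikey and the constants are my own for illustration, not anything from MS-DIAL. It assumes the usual InChIKey layout (14 letters, hyphen, 10 letters, hyphen, 1 letter) and deliberately throws away whatever the second block encoded, so only use it when you don't need to distinguish stereoisomers.

```python
import re

# Second block seen on standard keys without stereo/isotope information.
FLAT_STANDARD_SUFFIX = "UHFFFAOYSA-N"

# Usual InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_PATTERN = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def flatten_inchikey(key: str) -> str:
    """Keep the first block and append the flat standard suffix."""
    key = key.strip().upper()
    if not INCHIKEY_PATTERN.match(key):
        raise ValueError(f"Not a well-formed InChIKey: {key!r}")
    first_block = key.split("-")[0]
    return f"{first_block}-{FLAT_STANDARD_SUFFIX}"


if __name__ == "__main__":
    # Example key taken from the table in the first post.
    print(flatten_inchikey("REKYPYSUBKSCAT-UHFFFAOYNA-N"))
```

Keys that carry real stereo information in the second block get flattened too, so two enantiomers end up with the same key after this step.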

@Hiroshi Tsugawa
I am not sure MS-DIAL/MS-FINDER generates the InChIKeys itself. It seems some library in MS-DIAL contains keys ending in UHFFFAOYNA-N. Perhaps changing UHFFFAOYNA-N to UHFFFAOYSA-N there would resolve the confusion?
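
Until that is changed, the same replacement could be applied to an exported table or a copy of a library file. This is only a sketch under assumptions: I don't know which MS-DIAL file holds these keys or how it is laid out, so the function simply works on plain text, and the name patch_library_text is hypothetical. It only touches keys whose second block is exactly UHFFFAOYNA and leaves everything else alone.

```python
import re

# Match a 14-letter first block followed by the exact non-standard flat suffix.
NONSTANDARD_FLAT_KEY = re.compile(r"\b([A-Z]{14})-UHFFFAOYNA-N\b")


def patch_library_text(text: str) -> tuple[str, int]:
    """Return the patched text and the number of keys that were changed."""
    return NONSTANDARD_FLAT_KEY.subn(r"\1-UHFFFAOYSA-N", text)


if __name__ == "__main__":
    # Hypothetical two-column sample; a real library file may look different.
    sample = (
        "Leucine\tROHFNLRQFUQHCH-UHFFFAOYNA-N\n"
        "Threonine\tAYFVYJQAPQTCCC-UHFFFAOYNA-N\n"
    )
    patched, changed = patch_library_text(sample)
    print(patched)
    print(f"{changed} key(s) patched")
```

Returning the count makes it easy to check how many entries were affected before overwriting anything; I would run it on a copy of the file first.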