What is a highly-confident annotation?

November 26, 2022, 06:41:38 PM

Hello all,

I got some annotations with MS-DIAL but I'm not sure which ones I could have high confidence in.

I searched the literature and found that:
One paper said they confidently identified metabolites with a dot product >0.8 (Guo and Huan, 2020).
One paper said a dot product score >~500 seems sufficient to retain high confidence in metabolite identification (Barbier Saint Hilaire et al., 2020).
Another paper said total identification score >70% means an annotation with high confidence (Wasito et al., 2022).

I'm really confused about which score I should look at, and which value should be the threshold for a confident annotation. I was wondering if there are any official/accepted criteria for that.

Thank you in advance for your information!

Re: What is a highly-confident annotation?

Reply #1 – November 28, 2022, 09:36:15 PM

Hi XiaoqingW,

Excellent question, but it takes a more foolish person like me to take a stab at it.

I would say all 3 citations are valid, yet NONE of them settles the question. These are good findings all the same.

Unfortunately, I have not come across any such criteria recommended by the Metabolomics Society, and I do not think a universal cut-off will ever be possible.

Throwing a few thoughts back to you:

1. If the reference spectrum has only 2 fragments, and the query spectrum has 12, of which those 2 overlap, would you take it as a "confident match"? The mismatch could simply come from instrument-level differences.

2. If two spectra are generated on different instruments, they might differ enough to give a DP score of only 500, and that would still be acceptable, right? Same compound, same RT, but different spectra due to different mass analyzers.

3. What if the older library spectrum came from a low-resolution instrument scanning 100-600 m/z, and you are comparing it with a query spectrum from a newer high-resolution instrument with fragments from a 30-800 m/z scan range? Same compound, same MS1, but different fragments and search space.

4. If you are dealing with LC-MS/MS data where RT and RI are non-existent or non-interoperable, it's very murky out there to apply "ANY cut-off" to get "any confidence" based on MS1 and MS2 alone, especially with different setups! The LC-MS/MS community NOT leveraging RT is quite painful and unhelpful in the long run.
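To put numbers on point 1, here is a minimal Python sketch of a cosine/dot-product score between two centroided spectra. It is not MS-DIAL's implementation. The peak lists, the 0.01 Da tolerance, the unweighted intensities, and the helper names `match_peaks` and `cosine` are all made up for illustration. The "reverse" variant scores only the query peaks that match a reference peak.

```python
import math

def match_peaks(query, reference, tol=0.01):
    """Pair each reference peak with the closest unused query peak within tol (Da)."""
    pairs = []
    used = set()
    for r_mz, r_int in reference:
        best = None
        for i, (q_mz, q_int) in enumerate(query):
            if i in used:
                continue
            diff = abs(q_mz - r_mz)
            if diff <= tol and (best is None or diff < best[0]):
                best = (diff, i, q_int)
        if best is not None:
            used.add(best[1])
            pairs.append((best[2], r_int))
    return pairs

def cosine(query, reference, tol=0.01, reverse=False):
    """Cosine similarity on a 0-1 scale.

    reverse=False: unmatched query peaks count against the score.
    reverse=True:  only query peaks matched to the reference are considered.
    """
    pairs = match_peaks(query, reference, tol)
    dot = sum(q * r for q, r in pairs)
    ref_norm = math.sqrt(sum(i * i for _, i in reference))
    if reverse:
        q_norm = math.sqrt(sum(q * q for q, _ in pairs))
    else:
        q_norm = math.sqrt(sum(i * i for _, i in query))
    if ref_norm == 0 or q_norm == 0:
        return 0.0
    return dot / (ref_norm * q_norm)

# Hypothetical spectra: the reference has 2 fragments, the query has the
# same 2 plus 10 extra peaks (12 in total), as in point 1.
reference = [(85.03, 100.0), (127.04, 60.0)]
extras = [(45.0 + 11.0 * k, 50.0) for k in range(10)]
query = reference + extras

forward = cosine(query, reference)
reverse = cosine(query, reference, reverse=True)
print(round(forward, 2), round(reverse, 2))  # roughly 0.59 and 1.0
```

With these invented numbers the same pair of spectra scores about 0.59 one way and 1.0 the other, so a fixed cut-off means little unless you also know which flavour of score it was applied to.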

I feel the conventions of 500 (presumably on a 0-1000 scale), 50% cosine/DP, and >0.5 cosine/DP are all comparable, i.e. the same cut-off expressed on different scales. MS-DIAL's total score is a weighted DP, if I recall correctly, but their FAQ explains some of it.
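As a quick sanity check on the scale point, here is a tiny Python sketch that rescales each reported cut-off to a common 0-1 range. The scale maxima (1000, 100, 1) are my assumptions about how each number is reported, and `to_unit_scale` is just a hypothetical helper. It only makes sense when the underlying metric is the same. A composite score such as MS-DIAL's total score would not become comparable to a plain DP just by rescaling.

```python
def to_unit_scale(score, scale_max):
    """Rescale a similarity score to 0-1, given the (assumed) top of its scale."""
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    return score / scale_max

# Assumed scales for the three conventions mentioned above.
cutoffs = {
    "500 on 0-1000": to_unit_scale(500, 1000),
    "50% on 0-100": to_unit_scale(50, 100),
    "0.5 on 0-1": to_unit_scale(0.5, 1),
}
print(cutoffs)
```

All three land on the same value, which is the sense in which I call them comparable.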

We definitely need input from more experts!

Thanks,
Biswa