Profile data handling by MS-DIAL (orbi)
June 04, 2020, 12:32:45 AM

Working on annotation methodology, I've stumbled upon an issue with data handling. For instance, we have some big guys doing some good metabolomics (10.1021/acs.analchem.8b04698). The data in that work were acquired on a good instrument with good resolution and mass accuracy (I slightly doubt it was actually a stable 0.1 mDa, quite optimistic :-)). In centroid mode. Without any justification. Or I was unable to trace the explanation back to previous works of the group, which may also be the issue.

Centroid data are supposed to be equal in quality to data collected in profile mode; the only reason to use centroids should be data volume reduction. However, processing the data with MS-DIAL, we see and can perfectly replicate the following: profile data produce fewer features than centroided data, whether the software is MS-DIAL or Progenesis QI.

The workflow is standard (for MS-DIAL):

Acquisition [prof. / cent.] -> MSConvert if prof. [prof. -> cent.] -> ABF converter -> MS-DIAL (full scan tolerance 0.3 mDa) -> ... -> Data matrix for statistics

(Sample: human plasma PP, identification/annotation via in-house DB)

There is a small difference depending on whether we produce centroids on the fly with the instrument or with the Thermo MSFileReader library. They adjust the algorithms slightly with each update, and a new RawFileReader API was introduced recently. But it's negligible.

The big issue is that we get 280 features in profile mode against 350 using centroids. Manual curation reduces the numbers to 220/250. The difference is still >10%.

So the devil must be somewhere in the details. I assume the ABF converter simply extracts the MS data array from the .RAW files using the Thermo API. In profile mode, that should simply be the data points for each scan. No problem here. Then MS-DIAL performs centroiding on its own.
The big question is how MS-DIAL does the job: is it the same noise estimation + slicing algorithm used for chromatogram (EIC) extraction, or something else?

Otherwise, centroiding picks up all the shoulders and distorted peaks, creating too much garbage, which reduces the S/N ratio, as Progenesis states (with a recommendation to use profile data). I cannot be sure what happens in Progenesis, as it is a proprietary black box, but more or less the same thing is observed there too.

P.S. I prefer profile data for different, but not completely unrelated, reasons (FTMS research).
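
To make the shoulder/garbage concern concrete, here is a minimal sketch of a naive local-maximum centroider run on a synthetic profile scan. This is not MS-DIAL's or Progenesis' actual algorithm (I don't know what they do internally); the function name `centroid_local_maxima`, the `min_intensity` parameter, and all peak shapes and numbers are made up for illustration only.

```python
import math


def gaussian(x, center, height, sigma):
    """Gaussian peak shape used to build a synthetic profile spectrum."""
    return height * math.exp(-0.5 * ((x - center) / sigma) ** 2)


def centroid_local_maxima(mz, intensity, min_intensity=0.0):
    """Naive centroiding: one centroid per local maximum above min_intensity.

    The centroid m/z is the intensity-weighted mean of the apex and its two
    neighbours. Hypothetical illustration, not any vendor's algorithm.
    """
    centroids = []
    for i in range(1, len(mz) - 1):
        if intensity[i] < min_intensity:
            continue
        if intensity[i] > intensity[i - 1] and intensity[i] >= intensity[i + 1]:
            window = range(i - 1, i + 2)
            total = sum(intensity[j] for j in window)
            center = sum(mz[j] * intensity[j] for j in window) / total
            centroids.append((center, intensity[i]))
    return centroids


# Synthetic profile scan: one real peak plus two small satellite artefacts
# (assumed shapes; real FTMS side lobes look different).
mz = [200.0 + i * 0.0002 for i in range(101)]
intensity = [
    gaussian(x, 200.010, 1.0e6, 0.0010)     # main peak
    + gaussian(x, 200.004, 1.5e4, 0.0005)   # satellite on the low-m/z side
    + gaussian(x, 200.016, 2.0e4, 0.0005)   # satellite on the high-m/z side
    for x in mz
]

# No threshold: every local maximum becomes a centroid.
all_centroids = centroid_local_maxima(mz, intensity)

# Assumed noise/intensity cut-off at 5% of the base peak.
kept = centroid_local_maxima(mz, intensity, min_intensity=5.0e4)

print("centroids without threshold:", len(all_centroids))
print("centroids with threshold:   ", len(kept))
```

In this synthetic case the pass without a threshold reports both satellites as separate centroids next to the real peak, while the thresholded pass keeps only the main one. So the number of "features" downstream depends heavily on how (and whether) the centroider estimates noise, which is exactly what I would like to know for MS-DIAL.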