Great, I thought the .RAW format did not support IMS data. We are using a VION instrument with the UNIFI software, but so far we have not been able to convert the IMS-containing data into .RAW format. Maybe I can try again. Thanks for your tips.

Are you working with Waters IMS data? I am really interested in how to process Waters IMS data with MS-DIAL. Do you acquire your data with the UNIFI software?

First, MS-DIAL now supports direct processing of .RAW files without prior conversion to .abf. Also, do you mean that each file is ~5 GB, or is that the total size of all your files? In the former case, I would guess that you have profile-mode files. If so, one solution could be to centroid the files first, for example with MassLynx.

    I am guessing that the problem lies in the deconvolution step. As shown in the figure below, a tiny peak (45.1 m/z) near the main peak (diphenyl ether) has a poorly deconvoluted MS spectrum because it is heavily affected by diphenyl ether. It therefore has almost the same spectrum as diphenyl ether and is, of course, identified as diphenyl ether with a high matching score. In the end, both the major peak (the real diphenyl ether) and the tiny peak (45.1 m/z) are used for alignment, which produces redundant entries in the alignment table.
   For comparison, AMDIS seems to give a better deconvoluted spectrum for the tiny peak (45.1 m/z), one that is not strongly affected by diphenyl ether. For this reason, I think one way forward is to improve the deconvoluted spectrum of the tiny peak. I have tried different sigma values for the deconvolution step, but unfortunately that did not help.
   I am wondering whether it is possible to calculate a quality score for every deconvoluted spectrum. We could then apply a threshold to exclude badly deconvoluted spectra from annotation, as they are meaningless.
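
To make the idea concrete, here is a rough sketch of one possible check. It is not something MS-DIAL provides, and it is only one of many ways to score quality. It flags a small peak when a much larger peak elutes very close to it and the two deconvoluted spectra are nearly identical. The function names, the peak dictionary layout, the thresholds, and the spectra themselves are all made up for illustration.

```python
import math


def cosine_similarity(spec_a, spec_b, bin_width=1.0):
    """Cosine similarity between two centroided spectra.

    Each spectrum is a list of (mz, intensity) pairs. Peaks are grouped into
    bins of `bin_width` m/z (1.0 is an assumption for unit-resolution data).
    """
    def binned(spec):
        bins = {}
        for mz, intensity in spec:
            key = round(mz / bin_width)
            bins[key] = bins.get(key, 0.0) + intensity
        return bins

    a, b = binned(spec_a), binned(spec_b)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def flag_shadow_peaks(peaks, rt_window=0.05, sim_threshold=0.9, height_ratio=0.1):
    """Return indices of peaks that look like copies of a larger neighbour.

    All three thresholds are hypothetical defaults and would need tuning.
    """
    flagged = []
    for i, small in enumerate(peaks):
        for j, big in enumerate(peaks):
            if i == j:
                continue
            close = abs(small["rt"] - big["rt"]) <= rt_window
            much_smaller = small["height"] <= height_ratio * big["height"]
            if (close and much_smaller and
                    cosine_similarity(small["spectrum"], big["spectrum"]) >= sim_threshold):
                flagged.append(i)
                break
    return flagged


# Invented numbers, only to exercise the functions.
peaks = [
    {"rt": 10.00, "height": 10000,
     "spectrum": [(51, 30), (77, 45), (141, 40), (170, 100)]},
    {"rt": 10.02, "height": 500,
     "spectrum": [(45, 5), (51, 29), (77, 46), (141, 39), (170, 98)]},
    {"rt": 12.00, "height": 800,
     "spectrum": [(43, 20), (45, 100)]},
]
print(flag_shadow_peaks(peaks))
```

Flagged peaks could then be excluded from annotation or at least reviewed by hand. A genuinely co-eluting isomer would also be flagged, so I would treat the result as a hint, not as a final filter.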

I have now added a predicted retention index for all compounds, excluding those that cannot be predicted, but the speed is still low. I also tried setting the RI to 10000 for all missing values, and the result was the same. So I am wondering whether the retention time is required as well. Or, if I use the retention index, will the retention time be ignored?
I think you can export all reference spectra from MS-DIAL into a single folder. You can then import all these spectra into MS-FINDER and export them to a *.msp file. I suggest putting positive and negative spectra into two separate folders so that you end up with separate positive and negative libraries. Once you have new reference spectra processed by MS-DIAL, you can export them to the same folder and re-import the folder into MS-FINDER to export the latest *.msp file.
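
If the exported spectra are already MSP text files, a small script could also do the merging instead of the MS-FINDER round trip. This is only a sketch under that assumption. It also assumes that records are separated by blank lines, and the function names and folder names are hypothetical.

```python
from pathlib import Path


def split_msp_records(text):
    """Split MSP text into records, assuming blank lines separate records."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line.rstrip())
        elif current:
            records.append("\n".join(current))
            current = []
    if current:
        records.append("\n".join(current))
    return records


def merge_msp_texts(texts):
    """Join all records from several MSP texts, one blank line between records."""
    records = [rec for text in texts for rec in split_msp_records(text)]
    return "\n\n".join(records) + "\n" if records else ""


def merge_msp_folder(folder, out_path):
    """Merge every *.msp file in `folder` into `out_path`; return the file count.

    Write the output outside `folder`, otherwise a later run would merge the
    previous library into itself.
    """
    files = sorted(Path(folder).glob("*.msp"))
    merged = merge_msp_texts(f.read_text(encoding="utf-8") for f in files)
    Path(out_path).write_text(merged, encoding="utf-8")
    return len(files)
```

With this, something like `merge_msp_folder("positive", "library_pos.msp")` and `merge_msp_folder("negative", "library_neg.msp")` would give the two libraries. Unlike MS-FINDER, the script does not validate or complete any metadata.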

Another option is to use the post-identification function in MS-DIAL. For example, I prepare a *.txt file for post-identification (see below; please follow the MS-DIAL tutorial) in which I put the name, InChIKey, formula, SMILES, adduct, and corresponding m/z (this is important, as MS-DIAL uses it for identification), while leaving RT as -1. Currently, I only consider M+H, M+Na, M+K, M+NH4, 2M+H, 2M+Na, 2M+K, and 2M+NH4 for positive mode. With all this information, you can easily check which feature is likely the true one for your reference standard, and the exported spectra will contain this meta-information as well. Otherwise, you will have to add the meta-information manually for each spectrum in MS-FINDER.
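
The m/z values for these adducts can be generated with a short script. Below is a minimal sketch, assuming singly charged positive ions and the commonly used mass shifts for each cation (electron mass already subtracted). The column order of the output rows is my own assumption, so please adjust it to the layout given in the MS-DIAL tutorial. Caffeine is used only as an illustration.

```python
# Mass shift (Da) added by each cation, electron mass already subtracted.
ADDUCT_SHIFT = {
    "H": 1.007276,
    "Na": 22.989218,
    "K": 38.963158,
    "NH4": 18.033823,
}


def adduct_table(neutral_mass):
    """Map adduct names such as "[M+H]+" or "[2M+Na]+" to their m/z."""
    table = {}
    for n in (1, 2):
        prefix = "M" if n == 1 else f"{n}M"
        for ion, shift in ADDUCT_SHIFT.items():
            table[f"[{prefix}+{ion}]+"] = n * neutral_mass + shift
    return table


def post_id_rows(name, inchikey, formula, smiles, neutral_mass):
    """Build tab-separated rows, one per adduct, with RT left as -1.

    Column order (assumed): name, m/z, RT, adduct, InChIKey, formula, SMILES.
    """
    rows = []
    for adduct, mz in adduct_table(neutral_mass).items():
        rows.append("\t".join(
            [name, f"{mz:.4f}", "-1", adduct, inchikey, formula, smiles]))
    return rows


# Illustrative standard: caffeine, monoisotopic mass 194.080376 Da.
rows = post_id_rows(
    "Caffeine",
    "RYYVLZVUVIJVGH-UHFFFAOYSA-N",
    "C8H10N4O2",
    "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
    194.080376,
)
print("\n".join(rows))
```

Each standard then gets eight rows, one per adduct, so every adduct form can be matched separately during post-identification.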

You can also use the LIMA software to manage your *.msp libraries.

One more comment. I have seen some folks say that Lib2NIST cannot convert all spectra at once, but in my experience it can. What we need to do is tick Use Subset and then, in Define Subset, specify all spectra (the spectra can be checked in the NIST MS Search Program). Please see the mspcompiler vignettes for details.