I wanted to get the ball rolling on resurrecting the discussions concerning MSI. I’ve just had a paper published that may be of interest to anyone involved in standards: A Metadata description of the data in "A metabolomic comparison of urinary changes in type 2 diabetes in mouse, rat, and human." Griffin JL, Atherton HJ, Steinbeck C, Salek RM. BMC Res Notes. 2011 Jul 29;4(1):272. MetaboLights is making good progress at producing a central repository for metabolomics, but we would like some feedback about where we are going. I’m also aware of other initiatives, including a workshop at NIH in September and advances by Metabolomics Australia. I would be interested in feedback on the paper and also on whether a TC might be useful.

Great to hear about the new publication, Jules. Fully agree with you that MSI is not dead!

In fact we also very recently published an NMR metabolite library, which is held in a new MSI-compliant database at the University of Birmingham.

For those that are interested, the paper "Birmingham Metabolite Library: a publicly accessible database of 1-D 1H and 2-D 1H J-resolved NMR spectra of authentic metabolite standards (BML-NMR)", by Ludwig et al., can be found at:

We based our database on the MSI-endorsed reporting requirements for an NMR metabolomics study:

Importantly, however, a number of modifications to the originally proposed reporting requirements were needed. All the changes we made are summarised in a table in the paper.

An update on my side too.

1. We have an MSI-compliant configuration in the ISA tools;

2. MetaboLights at EBI is powered by the ISA tools;

3. We have worked to submit the terms collected by the MSI ontology working group (at the time) to OBI, a multi-domain, collaborative project developing an integrated ontology for the description of biological and clinical investigations.

Hi everyone,

I'm very glad to hear the MSI is still active.

I published a paper describing a central MSI-compliant interactive repository and raw data-processing pipeline for GC-MS metabolomics (MetabolomeExpress) midway through last year [Carroll, A.J., Badger, M.R. and Harvey Millar, A. (2010) The MetabolomeExpress Project: enabling web-based processing, analysis and transparent dissemination of GC/MS metabolomics datasets. BMC Bioinformatics, 11, 376].

Since publication, over 30 users from around the world have signed up and set up their own repositories to deposit their data sets. The database of metabolic phenotypes currently contains ~12,000 public metabolite response statistics from 22 independent experiments representing 16 different peer-reviewed publications, and provides a range of query tools including cross-study comparisons and database-driven phenocopy (pattern matching) analysis. I'm finding I really need to advertise the repository though (rather than relying on people doing Google searches for 'metabolomics database'), as it is surprising (and disappointing) how often I hear it said that there is no central repository for metabolomics. That's not exactly true, because I spent a decent-sized chunk of my life building one. Although it was initially published as a pipeline/repository for GC-MS, I have now adapted the database of processed statistics to accept relative metabolite level statistics from all analytical platforms, albeit without the same level of raw data integration as for GC-MS (this will come), and a paper describing this is on the horizon.

The point of my post here is this: I have a lot of highly annotated metabolomics data to share with other databases, and a bunch of data-mining tools that would benefit from having highly annotated data imported from other databases. Clearly, all database operators have a vested interest in co-operating on the development of a universal open exchange format for raw and processed metabolomics datasets. I, like most others it seems, really like the ISA tools and think that some kind of ISA-based exchange format is the way to go for metadata exchange. Susanna, I see that the ISA tools come with out-of-the-box *capability* for MSI-compliant annotation, particularly at the higher levels. However, unless I'm missing something (i.e. you have a set of metabolomics configurations and ontologies not supplied with the normal isa-tools downloads), I think more work needs to be done in the following areas:

- Standardising the lower-level configurations for specific assay types. For example, instead of just having "metaboliteprofiling_ms" as an assay type configuration, it would be better to have a range of more focused (not necessarily more detailed) assay configurations such as "metaboliteprofiling_1D-GC-EI-TOF-MS" and "metaboliteprofiling_1D-LC-ESI-MSMS"... inside which the range of values is already greatly constrained with more specialised fields.

- Also, there is work to be done in defining / refining the ontologies that are appropriate for each field. For example, in the default "metaboliteprofiling_ms" configuration, the field ParameterValue[instrument] points to the Proteomics Standards Initiative Mass Spectrometry Ontology. However, there doesn't appear to be a branch in that ontology with a simple list of the major types of instrument used in metabolomics (e.g. 1D-GC-EI-TOF-MS, 1D-LC-QTOF-MS etc.). In fact, there isn't even a single mention of electron impact ionisation (THE most widely used ionisation technique in MS metabolomics) in that ontology.

- For the 'ontology' fields, limit the allowed values to the children of the relevant branches of those ontologies. For example, don't make the user search/browse through the entire Mass Spectrometry ontology to find the list of detector types when there is a branch called detector types. Limit their options to that branch so they can just click on the correct one. All these links between fields and ontology branches should be part of an MSI standard ISA configuration.
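As a rough illustration of the last point, limiting a field to one ontology branch amounts to collecting the descendants of that branch and offering only those terms. The sketch below is a minimal Python example under stated assumptions: the toy ontology, its term labels, the field names and the helper functions are all hypothetical and are not taken from the PSI Mass Spectrometry Ontology or from any ISA configuration.

```python
from collections import deque

# Hypothetical toy ontology: parent term -> list of child terms.
# Labels are illustrative only, NOT copied from the PSI-MS ontology.
TOY_ONTOLOGY = {
    "mass spectrometry": ["detector type", "ionisation type"],
    "detector type": ["electron multiplier", "photomultiplier"],
    "ionisation type": ["electron ionisation", "electrospray ionisation"],
}

# Hypothetical configuration: field name -> the branch it is limited to.
FIELD_BRANCHES = {
    "Parameter Value[detector]": "detector type",
    "Parameter Value[ionisation]": "ionisation type",
}


def descendants(ontology, root):
    """Return every term below `root` in breadth-first order (root excluded)."""
    found = []
    queue = deque(ontology.get(root, []))
    while queue:
        term = queue.popleft()
        found.append(term)
        queue.extend(ontology.get(term, []))
    return found


def allowed_values(field):
    """Terms a user may pick for `field`: only the children of its branch."""
    return descendants(TOY_ONTOLOGY, FIELD_BRANCHES[field])


def is_valid(field, value):
    """True if `value` lies inside the branch configured for `field`."""
    return value in allowed_values(field)


if __name__ == "__main__":
    print(allowed_values("Parameter Value[detector]"))
    print(is_valid("Parameter Value[detector]", "electrospray ionisation"))
```

With a mapping like this, a curation tool could show a short pick-list per field instead of the whole ontology, and the same mapping could double as a validation rule on import.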

Finally, as far as I understand it, the ISA-tab format only defines study design and associated metadata. A standard exchange format for metabolomics needs to define standardised, open file formats for raw and processed data that can be referenced from within the ISA metadata. For GC-MS raw data a logical format would be *.CDF. For LC-MS/MS, mzML. For NMR maybe JCAMP (I don't know, I'm more of a mass-spectrometry person myself). What about processed peak identification results? Data matrices? Relative metabolite levels? Statistical results?
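One way to picture the raw-data side of this is a small lookup from analytical platform to expected open file format, which an importer could check against the file names referenced in the metadata. The following Python sketch assumes the tentative pairings suggested above; the platform keys, the function name and the `.jdx`/`.dx` extensions chosen for JCAMP files are assumptions made for illustration, not part of any agreed standard.

```python
from pathlib import PurePosixPath

# Assumed platform -> accepted raw-file extensions (lower case).
# The pairings follow the suggestions in the post; the NMR entry is
# especially tentative, and the keys themselves are hypothetical labels.
EXPECTED_RAW_EXTENSIONS = {
    "GC-MS": {".cdf"},
    "LC-MS/MS": {".mzml"},
    "NMR": {".jdx", ".dx"},
}


def raw_file_ok(platform, filename):
    """True if `filename` has an extension accepted for `platform`.

    Unknown platforms are rejected rather than silently accepted.
    """
    suffix = PurePosixPath(filename).suffix.lower()
    return suffix in EXPECTED_RAW_EXTENSIONS.get(platform, set())


if __name__ == "__main__":
    for platform, name in [("GC-MS", "run01.CDF"), ("LC-MS/MS", "run01.raw")]:
        print(platform, name, raw_file_ok(platform, name))
```

A check on file extensions is obviously only a first filter. Processed results (peak identifications, data matrices, statistics) would need agreed formats of their own, which is the open question raised above.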

I would very much like to cooperate with other stakeholders in bringing out a *widely accepted* complete exchange format so that complete datasets including metadata, raw and processed data from any of the major instruments or organisms can be transferred between compliant databases in a truly plug-and-play manner. It's a fair bit of work, so having a variety of specialists sharing the load would be a great thing!




As part of making MetabolomeExpress I designed a simple yet extensible tab-delimited metadata format based around the old ArMet schema and the recommendations papers of the MSI and built template (validation schema) variants for each of the major model organisms and experiment types including:


Having an independent validation template for each research area allowed me to specify which ontology/vocabulary must be used in each field. For example, gene references in 'mouse_invivo' or 'mouse_invitro' must be one of the official mouse gene marker symbols as per the Mouse Genome Informatics (MGI) website (stored in a table on the MetabolomeExpress server). I've attached the 'mouse_invivo' validation template as an example. The codes define the range of valid values for each field in the format and are all explained in the Appendix of the MetabolomeExpress manual.
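To make the idea of a per-area validation template concrete, here is a minimal Python sketch of checking tab-delimited "field, value" metadata against a template that names the allowed vocabulary for each field. It is not the attached 'mouse_invivo' template and does not use the MetabolomeExpress codes: the field names, the template layout and the three-symbol gene list (a stand-in for the server-side MGI table) are all hypothetical.

```python
import csv
import io

# Stand-in for the server-side table of official mouse gene symbols.
MOUSE_GENE_SYMBOLS = {"Trp53", "Pparg", "Lepr"}

# Hypothetical template: field -> set of allowed values, or None for free text.
MOUSE_TEMPLATE = {
    "organism": {"Mus musculus"},
    "gene": MOUSE_GENE_SYMBOLS,
    "tissue": None,
}


def validate(metadata_text, template):
    """Return a list of (line_number, message) for every invalid line."""
    errors = []
    reader = csv.reader(io.StringIO(metadata_text), delimiter="\t")
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # ignore blank lines
        if len(row) != 2:
            errors.append((line_no, "expected exactly two tab-separated columns"))
            continue
        field, value = row
        if field not in template:
            errors.append((line_no, f"unknown field {field!r}"))
            continue
        allowed = template[field]
        if allowed is not None and value not in allowed:
            errors.append((line_no, f"{value!r} is not allowed for {field!r}"))
    return errors


if __name__ == "__main__":
    sample = "organism\tMus musculus\ngene\tTP53\ntissue\tliver\n"
    for line_no, message in validate(sample, MOUSE_TEMPLATE):
        print(f"line {line_no}: {message}")
```

Keeping one template per research area means the same parser can serve every organism or experiment type. Only the vocabulary tables change.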

Metabomeeting is coming up at the end of the month. Is there anything we want feedback on for the work around MSI? I'm happy to promote Adam's work and ISA-TAB but what would we like feedback on?

Chris Taylor from the EBI has been in touch and he's produced a reference tool for omic data that covers the standards for reporting requirements. You can find it at:

I've just taken a look and what strikes me is how much more detail we've included in CIMR (MSI in old acronyms) compared with MIAME, and how it's going to be impossible to expect anyone to go through the whole CIMR list. I think we (the metabolomic community) need to work on that rather urgently. I like the mass spectrometry description - that's really useful to have in mind if we're going to produce something joined up across approaches. Any thoughts?

Jules - I concur. The CIMR list is intimidating, particularly for those of us not in academia. The CIMR list appears to be an idealized list, which is fine, but we should also designate a subset as the minimum essential (required?) list, which should be much shorter.

