Genome scale network analysis / Re: Avoiding connection on side compounds
I just saw your answer; I had problems with my mailbox :/
Thank you so much for replying!
I wish I had seen this paper before finishing my PhD thesis...
I totally agree on the first two methods, including their pros and cons, but I still don't fully understand how chemical similarity between compounds can be used on a large scale.
I'm clearly missing something about how structural similarity comparison identifies side compounds... In the example in your paper (Figure 3), there is a comparison of structural similarity between the compounds involved in the glucokinase reaction:
GLC + ATP -> GLC-6-P + ADP + H+
Similarity scores (as from the paper):
GLC - GLC-6-P = 0.85
GLC - ADP = 0.22
But if I do an automated analysis, I will compare the similarity of all compounds involved in the reaction, which will give me these additional scores:
ATP - GLC-6-P = 0.30
ATP - ADP = 0.90
So that would mean that GLC is transformed into GLC-6-P and ATP into ADP.
But how can I use this information (in a global metabolic network, where I don't have predefined pathways or target metabolites) to say that I want to keep only the connection between GLC and GLC-6-P and not the connection between ATP and ADP?
The similarity between ATP and ADP is higher than that between GLC and GLC-6-P, so weighting on this can lead to huge mistakes, no?
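To make the problem concrete, here is a minimal sketch of the automated comparison I have in mind. It uses only the four scores quoted above; H+ is left out because I have no score for it. The dictionary layout and the two helper functions are my own assumptions for illustration, not the method from the paper.

```python
# Pairwise similarity scores quoted above: (substrate, product) -> score.
# H+ is omitted because no score is given for it.
SCORES = {
    ("GLC", "GLC-6-P"): 0.85,
    ("GLC", "ADP"): 0.22,
    ("ATP", "GLC-6-P"): 0.30,
    ("ATP", "ADP"): 0.90,
}


def best_product_per_substrate(scores):
    """Map each substrate to its most similar product (hypothetical helper)."""
    best = {}
    for (substrate, product), score in scores.items():
        if substrate not in best or score > best[substrate][1]:
            best[substrate] = (product, score)
    return best


def rank_pairs(scores):
    """Return (pair, score) items sorted by decreasing similarity."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Pair every substrate with its closest product.
    for substrate, (product, score) in best_product_per_substrate(SCORES).items():
        print(f"{substrate} -> {product} ({score:.2f})")

    # Rank all candidate connections by similarity, as a weighting would.
    for (substrate, product), score in rank_pairs(SCORES):
        print(f"{score:.2f}  {substrate} - {product}")
```

Both transformations come out as valid pairs, and ATP - ADP is ranked first, so the score on its own does not tell me which of the two connections is the side-compound one.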
(By the way, still in this paper, there is a sentence I really don't get: "As shown in Figure 3A, atom mapping shows that no atoms are exchanged between glucose and ATP during the glucokinase reaction". But... they exchange the whole phosphate group, no?)
I am thinking more and more about AI: a human with even a little knowledge of biochemistry is able to recognize the "main" and "secondary" compounds in most reactions. Do you think it's worth creating an AI that could learn from human expertise? Or do you think it's an excessively complicated way to solve this problem?