Dear all,
I'm writing with an update. I found the edge lists generated by MS-DIAL as part of the GNPS export and combined all four of them into a single list. I then imported the CSV into R and ran the following script, which uses the "igraph" package to group the nodes into network components based on annotation:
library(igraph)
dfr <- read.csv('GnpsEdge_0_20202251139_COMBINED.csv') # combined edge list from GNPS
edgelist <- dfr[,c("ID1", "ID2")]
# I use "as.character" below because, if the vector holds integers instead of
# characters, igraph treats the values as vertex indices and the components
# output includes ALL integers from 1 to the max value in edges.
# Nodes that are not in the edge list would thus be created artificially.
edges <- as.character(as.vector(t(edgelist)))
g1 <- make_graph(edges, directed=FALSE) # make_graph() is the current name for graph()
comp1 <- components(g1)
# one row per node, with the ID of the annotation-based component it belongs to
comp.membership <- data.frame(node=as.numeric(names(comp1$membership)),
                              annotation.group=as.numeric(comp1$membership))
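To illustrate the point in the comment about "as.character", here is a minimal sketch with a made-up edge vector (the IDs 2, 5 and 9 are hypothetical and not taken from the real export). It assumes a version of igraph that provides make_graph() and vcount().

```r
library(igraph)

# Hypothetical flat edge vector: two edges, 2-5 and 5-9
toy <- c(2, 5, 5, 9)

# Numeric input: the values are read as vertex indices,
# so every index up to the largest ID becomes a vertex
g.int <- make_graph(toy, directed = FALSE)

# Character input: the values are read as vertex names,
# so only the IDs that actually occur become vertices
g.chr <- make_graph(as.character(toy), directed = FALSE)

n.int <- vcount(g.int)
n.chr <- vcount(g.chr)
```

Comparing n.int and n.chr shows how many extra, isolated nodes the numeric version introduces; each of those would otherwise show up as its own one-node component.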
My initial idea was to select one node (at random or by some other criterion) from each of these annotation-based component groups to represent the entire group, and then to filter out all the other nodes in Cytoscape. However, when I tried this manually, I noticed that some of the components in my molecular network were being split up. Apparently this networking thing is more complex than I thought! :-)
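A toy sketch of the splitting problem, using a made-up three-node molecular network (the feature IDs and group assignments are hypothetical). It assumes that a removed node can be the only link between its neighbours, and it shows merging the nodes of a group with igraph's contract() as one possible alternative to deleting them. Whether that suits the real network is an open question.

```r
library(igraph)

# Hypothetical molecular network: a chain 1 - 2 - 3
mol.net <- graph_from_data_frame(
  data.frame(ID1 = c("1", "2"), ID2 = c("2", "3")),
  directed = FALSE
)

# Hypothetical annotation groups: features 1 and 2 belong to the same group
annotation.group <- c("1" = 1, "2" = 1, "3" = 2)

# Option A: keep one representative per group and delete the rest.
# Feature 2 is dropped, but it was the only link between 1 and 3.
representatives <- c("1", "3")
pruned <- delete_vertices(mol.net, setdiff(V(mol.net)$name, representatives))

# Option B: merge all nodes of a group into one node instead of deleting them.
# The edges of the merged nodes are kept, so connectivity is preserved.
mapping <- as.integer(annotation.group[V(mol.net)$name])
merged <- contract(mol.net, mapping, vertex.attr.comb = list(name = "first"))
merged <- simplify(merged)  # remove self-loops and duplicate edges

n.before <- count_components(mol.net)
n.pruned <- count_components(pruned)
n.merged <- count_components(merged)
```

In this toy case, deleting feature 2 leaves features 1 and 3 unconnected, whereas the contracted graph stays in one piece.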
Any thoughts on how to approach this?
For now, I'm moving forward without removing nodes based on annotation, and I'll see how it goes. I'll have to deal with this issue at some point down the line, but at least I now know the node groupings based on annotation. I'm curious to see what happens with GNPS IINxFBMN.
Thank you,
Taylan