This got several views, so I got around to working it out, and here is my solution.
an@derivativeIons is a list with one item per peak in the xcmsSet peak table. Each item holds the rule-based annotations for that peak as an adduct, NL, etc. Each annotation links an ion in the dataset (e.g. M+H) to a corresponding neutral mass (M). I was hoping to retrieve these relationships, specifically the M+H and M+Na that both predict the same mass (M), rather than each isolated peak's annotation as stored in an@derivativeIons.
In case anyone else wants to do this, here is a simple search for these relationships.
First we collect all the annotations and their predicted neutral masses:
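an = xsAnnotate  # placeholder: `an` should be your annotated xsAnnotate object
# Walk every pseudospectrum group (psg), every peak in it, and every annotation of that peak
all_masses = do.call("rbind", lapply(seq_along(an@pspectra), function(x) {
  neutral_masses = do.call("rbind", lapply(an@pspectra[[x]], function(y) {
    do.call("rbind", lapply(an@derivativeIons[[y]], function(z) {
      # one row per annotation: predicted neutral mass, rule id, peak number, psg index
      cbind(neutral = z$mass, rule = z$rule_id, peaknum = y, psg = x)
    }))
  }))
}))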
This looks like:
> head(all_masses)
      neutral rule peaknum psg
mass 401.3136    7     234   1
mass 670.6027    1     236   1
mass 401.3136    1     248   1
mass 670.6027   37     259   2
mass 498.3887   12     274   2
mass 451.3561    7     281   3
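For anyone who wants to try the flattening step without a real dataset, here is a minimal sketch on toy data. The `pspectra` and `derivativeIons` lists below are made-up stand-ins that only mimic the structure I am assuming for the xsAnnotate slots (a list of peak indices per psg, and a list of annotations per peak, each with `mass` and `rule_id`); the masses and rule ids are invented.

```r
# Toy stand-ins for an@pspectra and an@derivativeIons (all values are made up)
pspectra <- list(c(1, 2), c(3))          # psg 1 holds peaks 1 and 2, psg 2 holds peak 3
derivativeIons <- list(
  list(list(mass = 401.3136, rule_id = 7)),    # peak 1: one annotation
  list(list(mass = 401.3136, rule_id = 1),     # peak 2: two competing annotations
       list(mass = 670.6027, rule_id = 1)),
  list()                                       # peak 3: no annotation
)

# Same nested lapply/rbind pattern as above, applied to the toy lists
all_masses_toy <- do.call("rbind", lapply(seq_along(pspectra), function(x) {
  do.call("rbind", lapply(pspectra[[x]], function(y) {
    do.call("rbind", lapply(derivativeIons[[y]], function(z) {
      cbind(neutral = z$mass, rule = z$rule_id, peaknum = y, psg = x)
    }))
  }))
}))

print(all_masses_toy)
```

Peaks with no annotation simply contribute no rows, because rbind over an empty list returns NULL and NULLs are dropped when the pieces are bound together.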
This is then easy to search however you like. For example, to find all the M-H and M+Cl annotations that predict the same parent mass (and are in the same psg):
withinppm = function(m1, m2, ppm=20) { abs(m1-m2)/m1*1E6 < ppm }  # use the ppm argument rather than a hard-coded 20
hcl_pairs = do.call("rbind", lapply(which(all_masses[,"rule"] == 1), function(x) {
  mh = all_masses[x, , drop=FALSE]
  a_pair = all_masses[
    all_masses[,"psg"] == mh[,"psg"] &
      withinppm(all_masses[,"neutral"], mh[,"neutral"]) &
      all_masses[,"rule"] == 7,
    , drop=FALSE]
  if (nrow(a_pair) < 1) { return(NULL) }
  cbind(neutral = mh[,"neutral"], psg = mh[,"psg"], mh_peaknum = mh[,"peaknum"], mcl_peaknum = a_pair[,"peaknum"])
}))
This looks like:
> head(hcl_pairs)
 neutral psg mh_peaknum mcl_peaknum
401.3136   1        248         234
882.6437   3       9228        9193
856.5924   4       1946        1982
886.5545   4       1981        1984
713.4113   7       9483        9397
443.3272   8       3144        2666
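If the row-by-row search gets slow, one possible alternative is to join the two rule subsets on psg with base R's merge() and then filter by ppm. This is only a sketch on a made-up table laid out like all_masses; the rule ids 1 and 7 are taken from the search above, and the `_mh` / `_mcl` column suffixes are names I chose for illustration.

```r
withinppm <- function(m1, m2, ppm = 20) abs(m1 - m2) / m1 * 1e6 < ppm

# Toy table with the same columns as all_masses (all values are made up)
toy <- data.frame(
  neutral = c(401.3136, 670.6027, 401.3140, 498.3887),
  rule    = c(7, 1, 1, 7),
  peaknum = c(234, 236, 248, 274),
  psg     = c(1, 1, 1, 2)
)

mh  <- toy[toy$rule == 1, ]   # rule 1 annotations
mcl <- toy[toy$rule == 7, ]   # rule 7 annotations

# All rule-1 / rule-7 combinations that share a psg ...
pairs <- merge(mh, mcl, by = "psg", suffixes = c("_mh", "_mcl"))
# ... keeping only those whose predicted neutral masses agree within the tolerance
pairs <- pairs[withinppm(pairs$neutral_mcl, pairs$neutral_mh), ]

print(pairs[, c("psg", "neutral_mh", "peaknum_mh", "peaknum_mcl")])
```

On a real result you would first convert the matrix with as.data.frame(all_masses). The join builds every within-psg combination before filtering, so it trades memory for fewer R-level loops.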
Disclaimer: This isn't an optimized way to search, but it's functional and fast enough on my laptop. I'm also not an R guru, so suggestions are welcome.
Nate