I have worked quite extensively with parsing the XML directly in R, on mzML files rather than mzXML files, but the principle is the same.
It is a very versatile way to get a lot of additional data out of the files.
Here is a snippet I used. It reads the instrument configuration sections from an mzML file (mzXML files are generally simpler: open one as a text file and you can easily orient yourself in the structure):
library(XML)
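# Parse an mzML file into an internal (C-level) XML document, as needed for XPath queries
openXML.mzML <- function(filename)
{
  mzml <- xmlTreeParse(filename, asText=FALSE, useInternalNodes=TRUE,
                       fullNamespaceInfo=TRUE, useDotNames=TRUE)
  return(mzml)
}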
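# Return a character matrix with one row per instrument configuration
getConfigs.mzML <- function(mzml)
{
  # The prefix "m" is bound to the mzML namespace; the path assumes an
  # indexed mzML file, i.e. <mzML> wrapped in an <indexedmzML> root element
  instrumentConfigs <- getNodeSet(mzml,
    "/m:indexedmzML/m:mzML/m:instrumentConfigurationList/m:instrumentConfiguration",
    c(m="http://psi.hupo.org/ms/mzml"))
  configs <- t(sapply(instrumentConfigs, function(ic){
    id <- xmlAttrs(ic)[["id"]]
    # Relative XPath, evaluated with the current configuration node as context
    analyzer <- getNodeSet(ic, "m:componentList/m:analyzer/m:cvParam",
                           c(m="http://psi.hupo.org/ms/mzml"))
    # Only the first cvParam of the analyzer is used
    analyzerName <- xmlAttrs(analyzer[[1]])[["name"]]
    analyzerMSO <- xmlAttrs(analyzer[[1]])[["accession"]]
    return(c(id, analyzerName, analyzerMSO))
  }))
  rownames(configs) <- configs[,1]
  colnames(configs) <- c("ID", "name", "ontology")
  return(configs)
}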
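To try the namespace-prefixed XPath approach without a real file, here is a minimal sketch on an inline fragment. The fragment is made up for illustration and is not a complete or valid mzML file; the variable names (mzmlText, doc, ns, ics) are my own. It also uses "//" so that the query does not depend on whether the file has the indexedmzML wrapper, which is a variation on the snippet above rather than what I originally used.

```r
library(XML)

# Made-up mzML fragment for illustration only (assumption: not a full mzML document)
mzmlText <- '<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <instrumentConfigurationList count="1">
    <instrumentConfiguration id="IC1">
      <componentList count="1">
        <analyzer order="1">
          <cvParam cvRef="MS" accession="MS:1000484" name="orbitrap"/>
        </analyzer>
      </componentList>
    </instrumentConfiguration>
  </instrumentConfigurationList>
</mzML>'

# Parse from a string instead of a file
doc <- xmlParse(mzmlText, asText = TRUE)

# Bind the prefix "m" to the mzML default namespace; XPath needs a prefix
# even though the document itself declares the namespace without one
ns <- c(m = "http://psi.hupo.org/ms/mzml")

# "//" matches the element at any depth, with or without an indexedmzML root
ics <- getNodeSet(doc, "//m:instrumentConfiguration", ns)

# Read the id attribute of each configuration
ids <- sapply(ics, xmlGetAttr, "id")

# Read the name of the first analyzer cvParam of each configuration
analyzerNames <- sapply(ics, function(ic) {
  cv <- getNodeSet(ic, "m:componentList/m:analyzer/m:cvParam", ns)
  xmlGetAttr(cv[[1]], "name")
})
```

With a real file, the two functions above would be combined as getConfigs.mzML(openXML.mzML("yourfile.mzML")), where the file name is a placeholder.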