Hi, I am having trouble working with a large data set (about 200 samples). In particular, I cannot create the xcmsSet:
xset <- xcmsSet(data.set, method="centWave",
polarity="positive", ppm=10, snthr=15,
peakwidth=c(4,15))
After about 2 hours of running, I get several errors such as:
Detecting mass traces at 10 ppm ...
% finished: 0 10 Error in .local(object, ...) :
m/z sort assumption violated ! (scan 376, p 63, current 100.9567 (I=1708.65), last 843.6795)
or
17_GS34_A.mzdata: Error in rampSIPeaks(rampid, scans, scanHeaders$peaksCount[scans]) :
unexpected end of peak list
Calls: xcmsSet -> xcmsRaw -> rampRawData -> rampSIPeaks -> .Call
I have no idea what the problem is. Do you have any suggestions?
I also tried to use a loop to create an xcmsSet object:
for(i in 1:3){
xset[i] <- xcmsSet(data.set[i], method="centWave",
polarity="positive", ## prefilter=c(3,5000),
ppm=10, snthr=1500, peakwidth=c(4,15))
foo <- c(xset[i])
}
but the R console says:
Error in xset <- xcmsSet(data.set, method = "centWave", polarity = "positive", :
object of type 'S4' is not subsettable
Best
Ricca,
For the m/z sort violation you can try running the code below. Change type to what you need, e.g. .mzData, .mzML or .mzXML.
For the second problem I'm not sure. It sounds as if your mzXML/mzData files are corrupt. You could try this code:
rampid<-xcms:::rampOpen("MyFile.mzXML")
rampid
rampHead<-xcms:::rampScanHeaders(rampid)
head(rampHead)
raw<-xcms:::rampRawData(rampid)
Let us know how this goes.
## code for mz sort violation
AllCDFs<-list.files(recursive=TRUE, pattern="cdf", ignore.case=TRUE, full.names=TRUE)
apply(AllCDFs, 1, CheckCDFfile, type=".cdf")
## define this function before running the two lines above
CheckCDFfile<-function(file, type=".cdf"){
  cat("\n")
  cat(paste("Loading File:", file, sep=""))
  xr<-xcmsRaw(file, profstep=0)
  for(i in seq_along(xr@scanindex)){
    scan<-getScan(xr, scan=i)
    ## an unsorted m/z column is what triggers the sort violation
    if(is.unsorted(scan[,"mz"])){
      cat(" x ")
      ## write a fixed copy and keep the original under a .OLD name
      newfile<-sub(type, "-Fixed.mzdata", file, fixed=TRUE)
      write.mzdata(xr, newfile)
      file.copy(file, sub(type, ".OLD", file, ignore.case=TRUE))
      unlink(file)
      return(1)
    }
  }
  ## every scan was sorted
  cat(" O ")
  return(0)
}
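To see what the check inside the loop is doing without loading any files, here is a minimal sketch in base R. The matrix `scan` holds made-up values and only mimics the two-column (mz, intensity) shape that getScan returns. The reordering step at the end is an illustration of one way to repair a scan, not a description of what write.mzdata does internally.

```r
# Toy scan: made-up values, shaped like the matrix returned by getScan()
scan <- cbind(mz        = c(100.95, 843.67, 250.10, 400.20),
              intensity = c(1708.65, 90.00, 310.50, 75.25))

# TRUE when the m/z column is not in increasing order
needs_fix <- is.unsorted(scan[, "mz"])

# Reorder the rows by m/z; each intensity stays attached to its m/z value
sorted_scan <- scan[order(scan[, "mz"]), , drop = FALSE]
still_unsorted <- is.unsorted(sorted_scan[, "mz"])
```

Ordering whole rows, rather than sorting the mz column alone, keeps the m/z and intensity pairs together.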
I tried the first code:
rampid<-xcms:::rampOpen("17_GS34_A.mzdata")
rampid
[1] -1
rampHead<-xcms:::rampScanHeaders(rampid)
Error in xcms:::rampScanHeaders(rampid) : invalid rampid
I am not sure how to use the second piece of code, because it is written for .cdf files while I have .mzdata files.
Ricca,
Just change the cdf to what you need. It'll work for any data type as long as the file can be read into xcms. The function stays the same, but the call becomes:
AllCDFs<-list.files(recursive=TRUE, pattern="mzdata", ignore.case=TRUE, full.names=TRUE)
apply(AllCDFs, 1, CheckCDFfile, type=".mzdata")
The rampid is odd. :? From memory it should be 0 or higher. You're loading the file that had the problem and not some other one? I would have a look with something else like OpenMS or mzViewer (http://www.bioinformatics.bbsrc.ac.uk/projects/mzviewer/) just to check the file loads.
Paul
Paul,
I followed your instructions, but apply gives me an error:
Error in apply(AllCDFs, 1, CheckCDFfile, type = ".mzdata") : dim(X) must have a positive length
Yes, I used mzMine and everything works.
I really have no idea... :?:
Ricca
Oops!
I used apply when I should have used sapply. So the code should be:
AllCDFs<-list.files(recursive=TRUE, pattern="cdf", ignore.case=TRUE, full.names=TRUE)
sapply(AllCDFs, CheckCDFfile)
Should work now. Sorry, Friday afternoon brain.
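For anyone wondering why apply failed while sapply works, here is a small base R sketch. The file names are made up and nchar stands in for the checking function; the point is only that list.files returns a plain character vector, which has no dimensions for apply to iterate over.

```r
# Made-up file names standing in for the result of list.files()
files <- c("a.mzdata", "b.mzdata", "c.mzdata")

# A character vector has no dim attribute, which is what apply() needs
has_dim <- !is.null(dim(files))

# apply() therefore stops with an error; capture the message instead of failing
apply_msg <- tryCatch(apply(files, 1, nchar),
                      error = function(e) conditionMessage(e))

# sapply() walks over the elements and names the result by file
lengths_by_file <- sapply(files, nchar)
```

Extra arguments such as type=".mzdata" can still be passed after the function name in sapply, just as with apply.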
Paul,
Now it works fine. All the files return 0, so there are no corrupted files... but the problem still remains...
Do you think it is possible that the problem is the PC? I work on a quad-core workstation with 7 GB RAM running Ubuntu 10.04.
Ricca,
What converter did you use? For the 2nd error message it sounds like the converter didn't convert the files correctly!
I would, if possible, remove these files and process without them. It's also worth trying another converter. The m/z sort violation should be solved now, right?
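One way to find the files to leave out is to try loading each one and keep only those that load without an error. This is just a sketch: `readable_files` and `fake_loader` are names made up for illustration, and the loader is passed in as an argument so the example runs without xcms. With real data you would presumably pass xcmsRaw as the loader, which I have not tested here.

```r
# Return the files that `loader` can read without raising an error.
# `loader` is a placeholder: with xcms one might pass xcmsRaw here.
readable_files <- function(files, loader) {
  ok <- vapply(files, function(f) {
    tryCatch({ loader(f); TRUE }, error = function(e) FALSE)
  }, logical(1))
  files[ok]
}

# Stand-in loader for demonstration only: fails on one made-up file name
fake_loader <- function(f) {
  if (f == "bad.mzdata") stop("unexpected end of peak list")
  invisible(f)
}

good <- readable_files(c("a.mzdata", "bad.mzdata", "c.mzdata"), fake_loader)
```

The surviving vector can then be handed to xcmsSet in a single call.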
Paul
Paul,
Maybe I have found the problem. I ran the R code on another Linux machine and so far I have not had any problems... I hope the problem is only in the computer.
I'm sorry I made you waste your time, and I would like to thank you for your help and your suggestions.
Best regards
Riccardo
P.S. I have a further question. I have no experience with S4 programming, and I see that the xcms library is written using S4 objects. Where can I find information about S4 programming? For example, how can I write a simple loop like this with S4 objects?
for(i in 1:3){
xset[i] <- xcmsSet(data.set[i], method="centWave",
polarity="positive", ## prefilter=c(3,5000),
ppm=10, snthr=1500, peakwidth=c(4,15))
foo <- c(xset[i])
}
Hey,
No problem, just happy we found the problem :)
In the code you wrote, what is 'data.set'? The xcmsSet method takes the files as its first argument and they don't need to be subsetted. One xcmsSet object holds multiple files, whereas an xcmsRaw object holds a single file.
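On the loop itself: the "not subsettable" error comes from using [ ] on an S4 object whose class does not define a method for it. If you do want one object per file, the usual pattern is to collect them in a list and fill it with [[i]]. A minimal sketch, using a made-up class `Result` and made-up file names rather than xcms itself:

```r
library(methods)

# Hypothetical S4 class used only for illustration
setClass("Result", representation(file = "character", npeaks = "numeric"))

files <- c("a.mzdata", "b.mzdata", "c.mzdata")  # made-up file names

# Indexing a single S4 object with [ fails unless the class defines a [ method
one <- new("Result", file = files[1], npeaks = 10)
subset_msg <- tryCatch(one[1], error = function(e) conditionMessage(e))

# Collect one object per file in a list, filled with [[i]]
results <- vector("list", length(files))
for (i in seq_along(files)) {
  results[[i]] <- new("Result", file = files[i], npeaks = 10 * i)
}

peak_counts <- sapply(results, function(r) r@npeaks)
```

With xcms the list is usually unnecessary, since a single xcmsSet call on the whole vector of files already covers all of them.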
The nice thing with xcms is that there are a lot of methods to extract the data you need. However, you can also access the slot directly. Here is some code for both:
library(xcms)
library(faahko)
gxs<-group(faahko) ## I'll just use the example dataset to skip the xcmsSet method
pairs<-groups(gxs)
head(pairs) ## will give the mass/rt pairs
val<-groupval(gxs, "medret", "into") ## this returns the intensity values for each feature from each file
This code uses the accessor methods, which really is the way it should be done. However, sometimes it's just easier to use the slots themselves. Note, though, that the groupval method is the best way to get the intensity values for the features: there is no slot for them, because they are derived from the peaks and groupidx slots (see below).
names(attributes(gxs))## these are all of the slots so..
head(gxs@groups)
head(gxs@peaks)
class(gxs@groupidx)
length(gxs@groupidx)
gxs@groupidx[[1]]
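If the difference between a slot and an accessor method is still unclear, this toy example may help. The class `PeakTable` and the generic `peakIntensities` are invented for the illustration and are not part of xcms; the values are made up.

```r
library(methods)

# Hypothetical class, not part of xcms
setClass("PeakTable", representation(peaks = "matrix", label = "character"))

pt <- new("PeakTable",
          peaks = cbind(mz = c(100.1, 200.2), into = c(5000, 7500)),
          label = "demo")

# Accessor: a generic plus a method, so callers need not know the slot layout
setGeneric("peakIntensities", function(object) standardGeneric("peakIntensities"))
setMethod("peakIntensities", "PeakTable", function(object) object@peaks[, "into"])

slots <- slotNames("PeakTable")        # names of all slots
direct <- pt@peaks[, "into"]           # direct slot access with @
via_method <- peakIntensities(pt)      # same values through the accessor
```

Both routes return the same numbers here; the accessor keeps working if the class author later changes how the data is stored, which is why the methods are preferred.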
Good general guides to S4 classes are:
https://www.rmetrics.org/files/Meielisalp2009/Presentations/Chalabi1.pdf
http://www.r-bloggers.com/resources-for-s4-classes-and-methods/