Metabolomics Society Forum

Software => Other => Topic started by: metaphase on December 11, 2012, 03:41:17 AM

Title: Binning software
Post by: metaphase on December 11, 2012, 03:41:17 AM
Hello to everyone!

What software/R package or script are you using to bin (bucket) data from LC-MS?

Currently I am using in-house software that bins multiple txt files (one per sample) into a single txt file that I can run statistical analyses on. It works well when binning the m/z values to 1 Da, but binning to 0.1 Da takes ages.

Is there any good software that can do the same thing but faster?

Thanks in advance!
Title: Re: Binning software
Post by: sneumann on December 11, 2012, 07:41:11 AM
Hi,

If you can use R, just try XCMS; the profile matrix will do that.
You will find it in xcmsRaw@env$profile. Check the documentation
for profMethod and profStep.

Yours,
Steffen
Title: Re: Binning software
Post by: metaphase on December 11, 2012, 08:07:32 AM
I read that you can import NetCDF or mzXML file formats...do the txt files qualify? How would I go about importing all of the data into one data frame?

I tried this script:

list.files()
filenames <- list.files()
filenames
negmet <- data.frame(do.call("rbind",lapply(filenames,read.delim,header=FALSE)))
viewData(negmet)

But it just stacks all of the samples into 2 columns, so it isn't really useful.
Sorry for asking such basic questions. I have tried to figure this out on my own for several days and just can't find a good solution.
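
The rbind call above stacks every file on top of the others, so the information about which row came from which sample is lost. A minimal sketch of one way around this, assuming each txt file is tab-separated with two columns (m/z and intensity) and no header. The demo files, the directory name, the column names and the read_sample helper are all made up for illustration:

```r
# Hypothetical demo input: two small tab-separated files (m/z, intensity),
# written to a temporary directory so the sketch runs on its own.
demo_dir <- file.path(tempdir(), "binning_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(c(100.02, 100.07, 250.31), c(10, 20, 5)),
            file.path(demo_dir, "sample_A.txt"),
            sep = "\t", row.names = FALSE, col.names = FALSE)
write.table(data.frame(c(100.04, 250.36, 400.12), c(7, 3, 12)),
            file.path(demo_dir, "sample_B.txt"),
            sep = "\t", row.names = FALSE, col.names = FALSE)

# List only the txt files, with full paths so they can be read from anywhere
filenames <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Read one file and tag every row with the sample it came from
read_sample <- function(f) {
  d <- read.delim(f, header = FALSE, col.names = c("mz", "intensity"))
  d$sample <- sub("\\.txt$", "", basename(f))
  d
}

# Stack all samples into one long data frame: mz, intensity, sample
long <- do.call(rbind, lapply(filenames, read_sample))
long
```

With a sample column in place, the rows can later be binned and spread into one column per sample instead of ending up in two anonymous columns.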
Title: Re: Binning software
Post by: sneumann on December 11, 2012, 10:04:54 AM
Hi,

Quote from: "metaphase"
I read that you can import NetCDF or mzXML file formats...do the txt files qualify?

No, txt usually does not qualify; raw data makes more sense for the profile matrix.

Steffen
Title: Re: Binning software
Post by: Carsten on December 11, 2012, 10:30:46 AM
Quote from: "metaphase"
Hello to everyone!

What software/R package or script are you using to bin (bucket) data from LC-MS?

Feeding an xcmsRaw with a txt file could be a painful approach. If you only want to bin values, have a look at the R function cut.
Here is a small example that you could use as a starting point:

Code:
#Generate some data
x <- rnorm(1000, 0, 10)
#Sort the data
x.sort <- sort(x)
#Take a quick look (example output; values differ between runs)
head(x.sort)
#[1] -31.44805 -29.43685 -28.45767 -28.18506 -26.27450 -25.92044
#Generate bin boundaries with a bin width of 0.1 over the whole data range
bin <- seq(min(x) - 0.1, max(x) + 0.1, by = 0.1)
#Take a look at the bin boundaries
head(bin)
#[1] -31.54805 -31.44805 -31.34805 -31.24805 -31.14805 -31.04805
#So the first bin runs from -31.54805 to -31.44805
#Bin the sorted values; labels=FALSE returns integer bin indices
binned.x <- cut(x.sort, breaks = bin, labels = FALSE)
head(binned.x)
#[1]  1 22 31 34 53 57
#As a result you get bin indices, so the first value of x.sort goes into the bin from -31.54805 to -31.44805
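
Extending the cut idea to several samples might look like the sketch below. It assumes the data are already in one long data frame with a sample, an m/z and an intensity column (the column names and the toy values are made up), and that intensities falling into the same bin should be summed, which is only one of several possible choices:

```r
# Hypothetical long-format data: one row per measured (sample, m/z, intensity)
long <- data.frame(
  sample    = c("A", "A", "A", "B", "B", "B"),
  mz        = c(100.02, 100.07, 250.31, 100.04, 250.36, 400.12),
  intensity = c(10, 20, 5, 7, 3, 12)
)

width <- 0.1  # bin width in Da

# One common set of bin boundaries for all samples, so bins line up
breaks <- seq(floor(min(long$mz)), ceiling(max(long$mz)), by = width)

# Integer bin index per row; include.lowest keeps a value sitting
# exactly on the first boundary from becoming NA
long$bin <- cut(long$mz, breaks = breaks, labels = FALSE,
                include.lowest = TRUE)

# Sum intensities per bin and sample: rows are bins, columns are samples
binned <- tapply(long$intensity,
                 list(bin = long$bin, sample = long$sample), sum)

# Bin/sample combinations that never occur come back as NA; treat them as 0
binned[is.na(binned)] <- 0

# Label each row with the lower boundary of its bin
rownames(binned) <- format(breaks[as.integer(rownames(binned))], nsmall = 1)
binned
```

Only bins that contain at least one value show up as rows, so the table stays small even with a narrow bin width. Both cut and tapply are vectorised, so this should not slow down much when going from 1 Da to 0.1 Da, although that is untested on real LC-MS data.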

Cheers,
Carsten
Title: Re: Binning software
Post by: metaphase on December 14, 2012, 12:55:37 AM
Thanks Carsten!

I will try it out! :)