Metabolomics Society Forum

Software => R => XCMS => Topic started by: Metabolomics_R on September 28, 2015, 09:09:42 AM

Title: load a CDF file in R
Post by: Metabolomics_R on September 28, 2015, 09:09:42 AM
I am using your R package xcms to import CDF files (GC/MS data).
Following page 74 of the manual, I tried both of the commands below and got the error that follows them:


# path to the data
xr <- xcmsRaw("C:/path to the folder /m.cdf", profstep = 1, profmethod = "bin", profparam = list(), includeMSn = FALSE, mslevel = NULL, scanrange = NULL)

xr <- xcmsRaw("C:/path to the folder /m.cdf")

Program: C:\Program Files\RStudio\bin\x64\rsession.exe
File: posixio.c, Line 325

Expression: pxp->bf_offset <= offset && offset < pxp->bf_offset + (off_t) pxp->bf_extent

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
I have tried everything I could think of without finding a solution, which is why I decided to contact you.
If you have any idea how I can fix it, please let me know.
Title: Re: load a CDF file in R
Post by: Jan Stanstrup on September 28, 2015, 01:35:11 PM
How large is this file?
From a quick search, it seems this might happen because something is running in 32-bit mode. I don't know the internals well enough to guess where the problem is or how to fix it.


https://code.zmaw.de/boards/4/topics/468
http://www.aps.anl.gov/epics/tech-talk/2011/msg01231.php


EDIT: One idea: if your files are larger than 2 GB because they are in profile mode, you could probably reduce the size by converting them to centroid mode with msconvert from ProteoWizard. You could try that anyway to see whether it is a CDF-specific issue.
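
A minimal sketch of how the two suspicions above could be checked from R before loading anything: whether a file is larger than what a 32-bit file offset can address (2 GiB), and whether the R session itself is a 64-bit build. The helper name check_cdf_size is hypothetical, and whether the posixio.c assertion really comes from such a limit remains a guess. The demo uses a small temporary file so that it runs anywhere.

```r
# Hypothetical helper: report file sizes and flag files above the 2 GiB
# limit that a signed 32-bit file offset can address.
check_cdf_size <- function(paths) {
  size_bytes <- file.info(paths)$size   # numeric, so sizes above 2 GiB are fine
  data.frame(
    file = basename(paths),
    size_gib = round(size_bytes / 1024^3, 2),
    over_2gib = size_bytes > 2^31 - 1,
    stringsAsFactors = FALSE
  )
}

# Pointers are 8 bytes on a 64-bit build of R.
r_is_64bit <- .Machine$sizeof.pointer == 8

# Demo on a 1 KiB temporary file standing in for a real CDF file.
demo_file <- tempfile(fileext = ".cdf")
writeBin(raw(1024), demo_file)
report <- check_cdf_size(demo_file)

print(report)
cat("64-bit R session:", r_is_64bit, "\n")
```

On real data one would pass the vector of CDF paths instead of demo_file; any file flagged in over_2gib would be a candidate for the msconvert route mentioned above.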
Title: Re: load a CDF file in R
Post by: sneumann on September 28, 2015, 02:57:06 PM
Hi Mohammad,

the file you sent seems to be fine on my Ubuntu Linux box.
What are your operating system and R version?
Can you run R from the command line, without RStudio
around it?

Yours,
Steffen


Code:
> library(xcms)
> xr <- xcmsRaw("m.cdf")
> xr
An "xcmsRaw" object with 1029 mass spectra

Time range: 1199.8-1500 seconds (20-25 minutes)
Mass range: 28.8909-501.0926 m/z
Intensity range: 0-4153340

MSn data on  0  mass(es)
with  0  MSn spectra
Profile method: bin
Profile step: 1 m/z (473 grid points from 29 to 501 m/z)

Memory usage: 24.5 MB
Title: Re: load a CDF file in R
Post by: Metabolomics_R on September 29, 2015, 12:38:25 AM
Thanks Steffen,
No need to do anything. I simply used R on its own; I checked on both Unix and Windows and it works on both. The file was also checked (e.g. with mzR) and it seems to be all right.
Title: Re: load a CDF file in R
Post by: Metabolomics_R on September 29, 2015, 08:49:36 AM
The problem is still there! This time I took a few data files and tried to load them all: the small ones came in without any problem, but the others could not be loaded.
The same error appeared.
I tried different platforms, with and without RStudio, and the same error was there each time.
Any solution?
Title: Re: load a CDF file in R
Post by: sneumann on September 29, 2015, 09:57:32 AM
Hi,
if your installation works in principle, there is little I can think of.
Is this LECO GCxGC data?

If your file has 6,479,713 bytes, that's only 6 MB, so not huge at all.

For your other question: once you have the xcmsRaw object, you find the
raw data as a matrix in xr@env$profile, provided you have set profstep=1,
where 1 is the resolution of the matrix in Da.

Yours,
Steffen
Title: Re: load a CDF file in R
Post by: Metabolomics_R on September 29, 2015, 01:01:59 PM
Hello,

No, it is GC-TOF, not GCxGC, and it is about 7 GB because I ran a long sequence.
I sent you an example by email.
About the data matrix: it gives me a matrix, but I have no idea what is what, i.e. which dimension is retention time and which is m/z.

Best,
Title: Re: load a CDF file in R
Post by: sneumann on September 29, 2015, 03:39:00 PM
Hi,

the file you sent loads fine over here, so I suspect the problem lies in your installation.
If the smaller files load fine, I suspect RAM issues. How much memory do you have?

Code:
> library(xcms)
Loading required package: mzR
Loading required package: Rcpp
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from ‘package:stats’:

    xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
    duplicated, eval, evalq, Filter, Find, get, intersect, is.unsorted,
    lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int,
    pmin, pmin.int, Position, rank, rbind, Reduce, rep.int, rownames,
    sapply, setdiff, sort, table, tapply, union, unique, unlist

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Attaching package: ‘xcms’

The following object is masked from ‘package:Biobase’:

    phenoData, phenoData<-

> xr <- xcmsRaw("data2.cdf")
> xr
An "xcmsRaw" object with 13257 mass spectra

Time range: 360-4204.2 seconds (6-70.1 minutes)
Mass range: 14.9984-519.9868 m/z
Intensity range: 0-1384450

MSn data on  0  mass(es)
with  0  MSn spectra
Profile method: bin
Profile step: 1 m/z (506 grid points from 15 to 520 m/z)

Memory usage: 5110 MB
> sessionInfo()
R version 3.0.0 Patched (2013-04-04 r62494)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8      LC_NUMERIC=C             
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8   
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8 
 [7] LC_PAPER=C                LC_NAME=C               
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C     

attached base packages:
[1] parallel  stats    graphics  grDevices utils    datasets  methods 
[8] base   

other attached packages:
[1] xcms_1.43.1        Biobase_2.22.0    BiocGenerics_0.8.0 mzR_2.1.10       
[5] Rcpp_0.11.2     

loaded via a namespace (and not attached):
[1] codetools_0.2-14 zlibbioc_1.8.0 

The matrix that you get has the dimensions:
Code:
> dim(xr@env$profile)
[1]  506 13257

So the 13257 columns correspond to the scans, and the 506 rows to the grid points from 15 to 520 m/z.
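
A minimal sketch of how the two axes of such a matrix could be labelled, so that m/z and retention time are both visible. The helper label_profile and the toy numbers are hypothetical; the toy matrix only mimics the layout described here (rows are m/z grid points, columns are scans). For a real xcmsRaw object, the m/z grid should be available from profMz(xr) and the retention times in seconds from xr@scantime, but that part is left as a comment because it is not run here.

```r
# Hypothetical helper: attach m/z and retention-time labels to a profile
# matrix laid out with rows = m/z grid points and columns = scans.
label_profile <- function(profile, mz_grid, scan_time) {
  stopifnot(nrow(profile) == length(mz_grid),
            ncol(profile) == length(scan_time))
  dimnames(profile) <- list(mz = mz_grid, rt = scan_time)
  profile
}

# Toy stand-in for xr@env$profile: 6 m/z grid points (15..20) by 4 scans.
mz_grid <- 15:20
scan_time <- c(360.0, 360.3, 360.6, 360.9)   # seconds, made-up values
toy <- matrix(seq_len(6 * 4), nrow = 6, ncol = 4)

labelled <- label_profile(toy, mz_grid, scan_time)

# One row is the intensity trace of a single m/z bin over time.
trace_mz18 <- labelled["18", ]
# One column is the binned spectrum of a single scan.
spectrum_scan2 <- labelled[, 2]

print(labelled)

# With a real xcmsRaw object the call would presumably be (assumption):
#   label_profile(xr@env$profile, profMz(xr), xr@scantime)
```

Reading along a row then gives intensity against time for one m/z bin, and reading down a column gives one binned spectrum.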

Yours,
Steffen
Title: Re: load a CDF file in R
Post by: Metabolomics_R on September 30, 2015, 12:31:56 AM
Hello,

I see. On one system I have about 8 GB of RAM; do you think it has something to do with that?
I will check on a better system and see whether I still have the problem.
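
As a rough back-of-the-envelope check of the RAM question, the sketch below uses only numbers reported earlier in the thread: the 506 x 13257 profile matrix and the "Memory usage: 5110 MB" line printed for the loaded object. It assumes 8-byte doubles and treats 8 GB as 8 * 1024 MB; the helper name dense_matrix_mib is hypothetical.

```r
# Rough size of a dense numeric (double) matrix: rows * cols * 8 bytes.
dense_matrix_mib <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 1024^2
}

# Dimensions reported in the thread: 506 m/z grid points x 13257 scans.
profile_mib <- dense_matrix_mib(506, 13257)

# Whole-object footprint printed in the thread, compared with 8 GB of RAM.
reported_mb <- 5110
ram_mb <- 8 * 1024
fraction_of_ram <- reported_mb / ram_mb

cat(sprintf("profile matrix alone: about %.1f MiB\n", profile_mib))
cat(sprintf("loaded object: about %.0f%% of 8 GB\n", 100 * fraction_of_ram))
```

Under these assumptions a single copy of the loaded object would already take well over half of an 8 GB machine, before the operating system, RStudio, and any temporary copies made during loading are counted, so memory pressure looks plausible though not proven.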
Thanks for the dim output. So the matrix only holds the m/z values per scan; I was confused because I was mainly looking for the time rather than the m/z.

Thanks,