Show Posts - krista

This section allows you to view all Messages made by this member. Note that you can only see Messages made in areas you currently have access to.

Messages - krista

XCMS / Re: Gaussian shape peak filtering

November 05, 2018, 05:13:22 PM

Excellent. I am glad this was such an easy fix.

Johannes - thanks for taking the time to go through and find the error, and for tweaking my code. I have learned a few more short cuts in looking at how you changed the function.

With this, I am not ready to move into XCMS > 3. I look forward to seeing future developments in the program.

I really appreciate all the help.

Cheers,
Krista

XCMS / Re: Gaussian shape peak filtering

October 31, 2018, 10:39:54 AM

Hi Jan,
Sorry...sample code would be helpful. In the section below where I source 'peakShape_XCMS3.r' that is the code in the chunk from my first post (if I could figure out how to post an attachment, I would just attach it...).

Code: [Select]

library(xcms)
library(faahKO)
faahko_3_files <- c(system.file('cdf/KO/ko15.CDF', package = "faahKO"),
                    system.file('cdf/KO/ko16.CDF', package = "faahKO"),
                    system.file('cdf/KO/ko18.CDF', package = "faahKO"))

## Do a quick and dirty preprocessing
od <- readMSData(faahko_3_files, mode = "onDisk")
od <- findChromPeaks(od, param = CentWaveParam(peakwidth = c(30, 80),
                                               noise = 1000))
source("peakShape_XCMS3.r")
xdata <- peakShape_XCMS3(od,cor.val = 0.9,useNoise = 1000)

Then, if I see what the output from od is :

Quote

> od 
MSn experiment data ("XCMSnExp")
Object size in memory: 1.05 Mb
- - - Spectra data - - -
MS level(s): 1
Number of spectra: 3834
MSn retention times: 41:41 - 74:60 minutes
- - - Processing information - - -
Data loaded [Wed Oct 31 12:31:44 2018]
MSnbase version: 2.6.4
- - - Meta data - - -
phenoData
rowNames: ko15.CDF ko16.CDF ko18.CDF
varLabels: sampleNames
varMetadata: labelDescription
Loaded from:
[1] ko15.CDF... [3] ko18.CDF
Use 'fileNames(.)' to see all files.
protocolData: none
featureData
featureNames: F1.S0001 F1.S0002 ... F3.S1278 (3834 total)
fvarLabels: fileIdx spIdx ... spectrum (27 total)
fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
- - - xcms preprocessing - - -
Chromatographic peak detection:
method: centWave
1205 peaks identified in 3 samples.
On average 402 chromatographic peaks per sample.

Compare to the output from xdata which is:

Quote

> xdata
MSn experiment data ("XCMSnExp")
Object size in memory: 1.04 Mb
- - - Spectra data - - -
MS level(s): 1
Number of spectra: 3834
MSn retention times: 41:41 - 74:60 minutes
- - - Processing information - - -
Data loaded [Wed Oct 31 12:31:44 2018]
MSnbase version: 2.6.4
- - - Meta data - - -
phenoData
rowNames: ko15.CDF ko16.CDF ko18.CDF
varLabels: sampleNames
varMetadata: labelDescription
Loaded from:
[1] ko15.CDF... [3] ko18.CDF
Use 'fileNames(.)' to see all files.
protocolData: none
featureData
featureNames: F1.S0001 F1.S0002 ... F3.S1278 (3834 total)
fvarLabels: fileIdx spIdx ... spectrum (27 total)
fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
- - - xcms preprocessing - - -
Chromatographic peak detection:
Error in ph[[1]] : subscript out of bounds

Hence the details on the chromatographic peak detection seem to be lost.

XCMS / Gaussian shape peak filtering

October 31, 2018, 08:30:26 AM

Hi,

A while back, Tony Larson posted code on the old Google groups forum. The code went through each mzRT feature and checked for a Gaussian fit for each peak. I have found it is stricter that the 'fitgauss=TRUE' in centWave and using XCMS < 3.0, I implement the code after I do the peakPicking. I am trying to implement this in XCMS > 3.0, but have run into one final hiccup when I try to reinsert my results back into the XCMSnExp object.

The code follows. I have left the 'old code' in the file for now, but commented out. I have managed to figure out how to get the data I want out of the mzML files, but I am stuck at the end (comment says 'STUCK HERE'). It seems like I should be able to use chromPeaks <- to set my answer. It will allow me to do this, but then I lose all the chromatography information that is needed in downstream processing.

Any thoughts? This is the final step keeping me from switching to XCMS > 3.0, so I hope to get it working.

Cheers,
Krista

Code: [Select]

#xdata <- peakShape_XCMS3(my_data,cor.val=0.9, useNoise = setNoise)

#and then proceed with retention time correction and correspondence/grouping
# Decreasing cor.val will allow more non-gaussian peaks through the filter

#original file version from the Google Groups for xcms from Tony Larson
#KL corrected this version 8/23/2011 for XCMS < 3
#KL updating to using XCMS3, 10/30/2018

#original peakShape function to remove non-gaussian peaks from an xcmsSet
#code originally had cor.val = 0.9; 0.5 is too low (not doing enough pruning)
#object is updated to be an XCMSnExp class

peakShape_XCMS3 <- function(object, cor.val=0.9, useNoise = setNoise)
{
require(xcms)

#files <- object@filepaths #old code
files <- fileNames(object)
  
#peakmat <- object@peaks #old code
peakmat <- chromPeaks(object) #extract everything

peakmat.new <- matrix(-1,1,ncol(peakmat)) 
colnames(peakmat.new) <- colnames(peakmat)

for(f in 1:length(files))
        {
        #xraw <- xcmsRaw(files[f], profstep=0) #old code
        raw_data <- readMSData(files[f],msLevel = 1, mode = "onDisk") #use 'onDisk' to make the next step faster
        sub.peakmat <- peakmat[which(peakmat[,"sample"]==f),,drop=F]
        corr <- numeric()
        for (p in 1:nrow(sub.peakmat))
                {
                #old code
                #tempEIC <-
                    #as.integer(rawEIC(xraw,mzrange=c(sub.peakmat[p,"mzmin"]-0.001,sub.peakmat[p,"mzmax"]+0.001))$intensity)
                #minrt.scan <- which.min(abs(xraw@scantime-sub.peakmat[p,"rtmin"]))[1]
                #maxrt.scan <- which.min(abs(xraw@scantime-sub.peakmat[p,"rtmax"]))[1]
                #eics <- tempEIC[minrt.scan:maxrt.scan]
          
                mzRange = c(sub.peakmat[p,"mzmin"]-0.001,sub.peakmat[p,"mzmax"]+0.001)
                subsetOnMZ <- filterMz(raw_data, mz = mzRange)
                
                #now set the Rt range...use sub.peakmat min and max RT
                rtRange = c(sub.peakmat[p,"rtmin"],sub.peakmat[p,"rtmax"])
                subsetOnMZandRT <- filterRt(subsetOnMZ, rt = rtRange)
 
                eics <- intensity(subsetOnMZandRT) #get the intensity values
                eics[sapply(eics, function(x) length(x)==0)] <- 0 #if empty in a scan, convert to 0
                eics <- as.integer(unlist(eics)) #use as.double for Lumos

                #filter out features that are less than the noise level I have already set...
                setThreshold <- which(eics < useNoise)
                eics <- eics[-setThreshold]
                rm(setThreshold)

                #remove any NA (easier bc downstream leaving it in causes issues)
                eics <- eics[!is.na(eics)]
                
                getIdx <- which(eics == min(eics))[1] #if multiple values, just need the first match
                #set min to 0 and normalise
                eics <- eics-eics[getIdx]
                
                if(max(eics,na.rm=TRUE)>0)
                        {
                        eics <- eics/max(eics, na.rm=TRUE)
                        }
                #fit gauss and let failures to fit through as corr=1
                fit <- try(nls(y ~ SSgauss(x, mu, sigma, h), 
                               data.frame(x = 1:length(eics), y = eics)),silent=T)
                
                if(class(fit) == "try-error")
                        {
                        corr[p] <- 1
                        } else {
                        #calculate correlation of eics against gaussian fit
                        if (length(which(!is.na(eics - fitted(fit)))) > 4 &&
                            length(!is.na(unique(eics)))>4 && 
                            length(!is.na(unique(fitted(fit))))>4)
                                {
                                cor <- NULL
                                options(show.error.messages = FALSE)
                                cor <- try(cor.test(eics,fitted(fit),method="pearson",use="complete"))
                                options(show.error.messages = TRUE)
                                if (!is.null(cor))
                                        {
                                        if(cor$p.value <= 0.05) {
                                          corr[p] <- cor$estimate 
                                        } else {
                                          corr[p] <- 0 }
                                        } 
                                  else corr[p] <- 0
                                } else corr[p] <- 0
                        }
                } #this ends to the 'p' loop (going through one mzRT feature at a time)
        
        filt.peakmat <- sub.peakmat[which(corr >= cor.val),]
        peakmat.new <- rbind(peakmat.new, filt.peakmat)
        n.rmpeaks <- nrow(sub.peakmat)-nrow(filt.peakmat)
        cat("Peakshape evaluation: sample ", 
            basename(files[f]),"
            ",n.rmpeaks,"/",nrow(sub.peakmat)," peaks removed","\n")
        
        if (.Platform$OS.type == "windows") flush.console()
        
        
        } #this ends the 'f' loop (going through one file at a time())

peakmat.new <- peakmat.new[-1,] #all but the first row that is all -1
object.new <- object #copy to a new object

#object.new@peaks <- peakmat.new #old code
chromPeaks(object.new) <- peakmat.new #STUCK HERE...why doesn't this work?

return(object.new) 
} #this ends the function itself

XCMS / Re: replacement for getEIC in XCMS3

October 25, 2018, 09:52:29 AM

Excellent, I will keep an eye out for the new version of XCMS3.
Thanks for all your help.
Krista

XCMS / Re: replacement for getEIC in XCMS3

October 24, 2018, 02:52:02 PM

Ack..I just realized I only posted half of the code in my previous post. The first part uses Jo's function as follows:

Code: [Select]

featureChromatograms <- function(x, expandRt = 0, aggregationFun = "max",
                                 ...) {
    if (!hasFeatures(x))
        stop("No feature definitions present. Please run first 'groupChromPeaks'")
    pks <- chromPeaks(x)
    mat <- do.call(rbind, lapply(featureDefinitions(x)$peakidx, function(z) {
        pks_current <- pks[z, , drop = FALSE]
        c(range(pks_current[, c("rtmin", "rtmax")]),
          range(pks_current[, c("mzmin", "mzmax")]))
    }))
    colnames(mat) <- c("rtmin", "rtmax", "mzmin", "mzmax")
    chromatogram(x, rt = mat[, 1:2], mz = mat[, 3:4],
                 aggregationFun = aggregationFun, ...)
}


#now use the code from Jo:
#extract the EICs from the grouped XCMSnExp object:
chrs <- featureChromatograms(xgF)

## Extract also XICs without adjusted retention times
chrs_raw <- featureChromatograms(xgF, adjustedRtime = FALSE)

Then, I use the code I posted earlier today to do the actual plotting (see above). I hope others can take advantage of this.
Krista

XCMS / Re: replacement for getEIC in XCMS3

October 24, 2018, 06:30:35 AM

Jo,
This is perfect. Exactly what I needed. The plots are helpful, and the code to loop over each row in chrs generates the PDF I want. For others, here's what I have:

Code: [Select]

library(xcms)
library(RColorBrewer)

#setup the name of the PDF file that I want:
fName_peaksPicked <- 'yourFilenamehere.pdf'

#preset the colors, use ncol to figure out how many samples I have (and hence # of colors needed)nColors <- ncol(chrs)
useColors <- brewer.pal(nColors,'YlGnBu')

cairo_pdf(file = fName_peaksPicked,onefile=TRUE)
for (i in 1:nrow(chrs)){
    par(mfrow = c(1, 2))
    plot(chrs[i, ], col = useColors)
    plot(chrs_raw[i, ], col = useColors)
    
    # You could also use the highlightChromPeaks function to indicate the identified 
    #peaks in the individual XICs:
    if (TRUE) {
      plot(chrs[i, ], col = useColors)
      highlightChromPeaks(xgF, rt = range(lapply(chrs[i, ], rtime)), 
                        mz = range(lapply(chrs[i, ], mz)),
                        border = useColors)
    }
 }
dev.off()

Thanks again!
Krista

XCMS / Re: replacement for getEIC in XCMS3

October 24, 2018, 05:52:38 AM

Hi Jo,

Thanks for the code. This really helps me out. I will take a look at it and see how things work. Either way I will make sure to post a reply later so that we can see how this works.

Cheers,
Krista

XCMS / Re: replacement for getEIC in XCMS3

October 24, 2018, 05:49:55 AM

Hello,

CoreyG: I have swapped between the XCMSnExp and xcmsSet using the as function. I used that to have CAMERA work for isotopes and adducts. It 'sort-of' works for the code that I posted, although I get an error message during the plotting that I cannot quite sort out:

Error in legend(rtrange[i, 2], max(maxint), legtext, col = unique(col[sampidx]), : 
 'legend' is of length 0

I am not sure where I would set the 'legend' parameters, so I am not sure how to get around this.

Jo: thanks for thinking about how to implement this in XCMS3. I used to use the PDF generated during the centWave peakPicking parameters, but at some point (even prior to XCMS3) that went away. Either is fine with me. I just like to see all/most of my features to see how I am doing beyond a pre-defined set of mz/RT values.

Thanks in advance. I look forward to moving forward with the transition to XCMS3.

Cheers,
Krista

XCMS / replacement for getEIC in XCMS3

October 22, 2018, 02:04:23 PM

I am in the process of updating to use XCMS3. So far, I like XCMS3 and most of the transition to XCMS3 has been easy in terms of how to set peak picking, retention time correction, and defining correspondence across a set of files (the old 'grouping'). However, one thing I miss is the ability to plot all of our mzRT features into a PDF that we could use to assess how well our parameters were working. I find multiple examples to extract a pre-determined mz / RT combination in the vignettes provided with XCMS3. Therefore, if I know the mz and retention time for a single peak, I can easily plot it. However, what I cannot figure out is out to get all the peaks across my set of samples.

I used to use the code listed below. In the example, xgF is the data after peak picking, group, retention time correction, and fillPeaks.

Any thoughts?
Cheers,
Krista

Quote

xs.fill <- xgF
#
xeic.raw <- getEIC(xs.fill, rt = "raw", groupidx= 1:nrow(xs.fill@groups))
xeic.corrected <- getEIC(xs.fill, rt = "corrected", groupidx= 1:nrow(xs.fill@groups))

#colorList should be as long as the number of possible levels in the dataset...
colorList <- c("deepskyblue","magenta","forestgreen","darkorchid","firebrick", "gold")

#plot first feature from diffreport (not sorted after pval!!)
cairo_pdf(file = fName_peaksPicked,onefile=TRUE)
for (i in 1:nrow(xs.fill@groups)){
#for (i in 1:5){

par(mfrow= c(2,1))
#rt row data
plot(xeic.raw, xs.fill,groupidx=i,col=colorList)
#rt corrected data
plot(xeic.corrected, xs.fill,groupidx=i,col=colorList)

}
dev.off()

My sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C 
[5] LC_TIME=English_United States.1252 

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base 

other attached packages:
[1] CAMERA_1.36.0 xcms_3.2.0 MSnbase_2.6.4 ProtGenerics_1.12.0 mzR_2.14.0 
[6] Rcpp_0.12.19 BiocParallel_1.14.2 Biobase_2.40.0 BiocGenerics_0.26.0

loaded via a namespace (and not attached):
 [1] lattice_0.20-35 snow_0.4-3 digest_0.6.18 foreach_1.4.4 
 [5] plyr_1.8.4 backports_1.1.2 acepack_1.4.1 mzID_1.18.0 
 [9] stats4_3.5.1 ggplot2_3.0.0 BiocInstaller_1.30.0 pillar_1.3.0 
[13] zlibbioc_1.26.0 rlang_0.2.2 lazyeval_0.2.1 rstudioapi_0.8 
[17] data.table_1.11.8 S4Vectors_0.18.3 rpart_4.1-13 Matrix_1.2-14 
[21] checkmate_1.8.5 preprocessCore_1.42.0 splines_3.5.1 stringr_1.3.1 
[25] foreign_0.8-71 htmlwidgets_1.3 igraph_1.2.2 munsell_0.5.0 
[29] compiler_3.5.1 pkgconfig_2.0.2 base64enc_0.1-3 multtest_2.36.0 
[33] pcaMethods_1.72.0 htmltools_0.3.6 nnet_7.3-12 tibble_1.4.2 
[37] gridExtra_2.3 htmlTable_1.12 RANN_2.6 Hmisc_4.1-1 
[41] IRanges_2.14.12 codetools_0.2-15 XML_3.98-1.16 crayon_1.3.4 
[45] MASS_7.3-50 grid_3.5.1 MassSpecWavelet_1.46.0 RBGL_1.56.0 
[49] gtable_0.2.0 affy_1.58.0 magrittr_1.5 scales_1.0.0 
[53] graph_1.58.2 stringi_1.2.4 impute_1.54.0 affyio_1.50.0 
[57] doParallel_1.0.14 limma_3.36.5 latticeExtra_0.6-28 Formula_1.2-3 
[61] RColorBrewer_1.1-2 iterators_1.0.10 tools_3.5.1 survival_2.42-6 
[65] yaml_2.2.0 colorspace_1.3-2 cluster_2.0.7-1 vsn_3.48.1 
[69] MALDIquant_1.18 knitr_1.20

Job opportunities / Postdoctoral Investigator in Environmental Metabolomics

November 30, 2017, 09:11:02 AM

We invite applications for one or more postdoctoral researchers in the area of environmental metabolomics and/or environmental analytical chemistry. The Kujawinski lab uses analytical chemistry to explore the intersection of microbial biology and chemistry within marine systems. Potential projects include: use of metabolomics to understand the physiology of model marine organisms, comparison of laboratory and field data to infer the contribution of model organisms to marine organic matter cycling, and others. The successful applicant will be trained in all aspects of environmental sample analysis, including sample processing, mass spectrometry, and data analysis. A PhD in chemical oceanography, biogeochemistry, analytical chemistry or a related field is required. Experience with mass spectrometry and downstream data analysis as well as knowledge of programming tools such as MATLAB, Python, and R would be advantageous. Candidates with interests in linking chemical and biological datasets are especially encouraged to apply. The position is available for one year, renewable for a maximum of two years. Review of applications will begin Feb 1, 2018 and continue until one position is filled. Inquiries should be addressed to ekujawinski@whoi.edu.

Job opportunities / Graduate Research Assistants in Microbial Metabolomics

December 12, 2016, 12:16:53 PM

We invite applications for multiple graduate research assistants in the area of environmental microbial metabolomics and/or environmental analytical chemistry.
Background: Graduate students in the Kujawinski lab at the Woods Hole Oceanographic Institution (Woods Hole, Massachusetts, USA) are enrolled primarily in the MIT/WHOI Joint Program in Applied Ocean Sciences and Engineering. The research group uses analytical chemistry to explore the intersection of microbial biology and chemistry within Earth’s systems, with a specific focus on the oceans. We have access to sophisticated mass spectrometry tools as well as computational resources and develop new tools through which to interrogate the chemical composition of microbes and their environments. We collaborate with biologists in oceanography, soil science and human systems in order to place our work in the broadest context.
My philosophy: Graduate students in my research group are full colleagues in our intellectual enterprise (see projects listed on this website). Students are expected to develop their own research questions and to participate in the research group activities to the fullest. Thus, I support independent creative individuals who are eager to answer sophisticated science questions at the cutting edge of environmental chemistry and microbiology. To the best of my ability, I provide financial support for laboratory research, field activities and networking opportunities. I encourage my students to explore non-traditional career opportunities and to seek out additional training resources, as their time and interests allow, with the goal of training well-rounded scientists for their chosen careers.
Who succeeds: The lab is based on chemistry and so most of our members have training in the chemical arts. However, we welcome chemically-minded individuals with a physical science degree. We develop data analysis pipelines, when needed, and so students are encouraged to learn computer programming in MATLAB, R and/or Python. In my experience, students who have taken at least a year off after their undergraduate studies are more prepared for graduate school, so I give preference to individuals who have experience beyond their undergraduate degree. That said, excellent applications from graduating seniors are always considered.

For additional information about graduate school at WHOI, click here. Questions should be addressed to Elizabeth Kujawinski (ekjuawinski@whoi.edu).

Job opportunities / Postdoctoral Investigator in Environmental Metabolomics

December 12, 2016, 12:12:12 PM

We invite applications for a postdoctoral researcher in the area of environmental metabolomics and/or environmental analytical chemistry. The Kujawinski lab at the Woods Hole Oceanographic Institution (Woods Hole, Massachusetts, USA) uses analytical chemistry to explore the intersection of microbial biology and chemistry within marine systems. The funded project involves complex mixture analysis by qualitative and quantitative mass spectrometry, with a focus on development of new tools for analysis of metabolites dissolved in seawater. The successful applicant will be trained in all aspects of environmental sample analysis, including sample processing, mass spectrometry, and data analysis. A PhD in chemical oceanography, biogeochemistry, analytical chemistry or a related field is required. Experience with mass spectrometry and downstream data analysis as well as knowledge of programming tools such as MATLAB, Python, and R would be advantageous. Candidates with interests in linking chemical and biological datasets are especially encouraged to apply. The position is available for one year, renewable for a maximum of two years. Inquiries should be addressed to ekujawinski@whoi.edu. Please apply online through the Human Resources Department.

XCMS / findmzROI: link information here with output from findPeaks.centWave

August 19, 2016, 02:53:10 PM

Hi folks,

I would like to compare the results from findmzROI with the output from findPeaks.centWave.

In my dataset, I know we have peaks that are not being resolved with our LC method. I would like to know how many different m/z values this is. Here, by not resolved, I mean I have many isomers that span multiple minutes in our LC run. I am most of the way through this thanks to some of the code and responses on this forum. However, I have found some undocumented output from findPeaks.centWave and am hoping that someone knows a little more about the behind the scenes steps to help me understand the output. Here's what I have done:

1. Find the list of ROIs using findmzROI as described by HP Benton (available here)
2. Run findPeaks.Centwave using my parameters of choice and verbose.columns = TRUE.

After step 1, I have a list of features (mz, mzmin,mzmax, scmin, scmax, length, intensity).
After step 2, I have everything described in the help, and a few additional columns: lmin, lmax, sample.

What I would like to do is to directly tie my results from step 1 with the shorter list in step #2. However, I don't see any obvious index/marker that does this. I could search with some narrow m/z and retention time window, but I would rather know which ROI corresponds to which feature without setting some bounds on search parameters.

I hoped that 'f' (described as the 'region number of m/z ROI where peak was localised') would be helpful, but I am not sure how to use this information.

Thanks in advance for thoughts on this.

From sessionInfo()
R version 3.0.3 (2014-03-06)
XCMS_1.38.0

XCMS / Re: [QUESTION] Package that would filter the "bad gaussian curves" kept by XCMS

August 17, 2016, 08:01:57 AM

I do set fitgauss to TRUE when I do the centwave step. However, I was finding that I still had peaks that were not as high quality as I hoped. The peakShape code allows me to be stricter in requiring peaks to meet a Gaussian fit. The code also has the benefit that I can set how strict I want to be by setting the correlation value higher or lower.

XCMS / Re: [QUESTION] Package that would filter the "bad gaussian curves" kept by XCMS

August 16, 2016, 11:01:45 AM

Pre-2011, I found a script on the XCMS Google groups forum that was written by Tony Larson. I updated and now use it regularly to filter out peaks that don't meet Gaussian criteria. I run this script immediately after the peak picking in XCMS.

I hope this helps,
Krista

The script is as follows:

Code: [Select]

#original file version from the Google Groups for xcms from Tony Larson
#Krista Longnecker updated this August 2011 

#peakShape function to remove non-gaussian peaks from an xcmsSet
#code originally had cor.val = 0.9; 0.5 is too low (not doing enough pruning)
peakShape <- function(object, cor.val=0.9)
{
require(xcms)

files <- object@filepaths
peakmat <- object@peaks
peakmat.new <- matrix(-1,1,ncol(peakmat))
colnames(peakmat.new) <- colnames(peakmat)
for(f in 1:length(files))
        {
        xraw <- xcmsRaw(files[f], profstep=0)
        sub.peakmat <- peakmat[which(peakmat[,"sample"]==f),,drop=F]
        corr <- numeric()
        for (p in 1:nrow(sub.peakmat))
                {
                #extract using rawEIC method +/1 0.01 m/z to give smoother traces
                tempEIC <-
as.integer(rawEIC(xraw,mzrange=c(sub.peakmat[p,"mzmin"]-0.001,sub.peakmat[p,"mzmax"]+0.001))$intensity)
                minrt.scan <- which.min(abs(xraw@scantime-sub.peakmat[p,"rtmin"]))[1]
                maxrt.scan <- which.min(abs(xraw@scantime-sub.peakmat[p,"rtmax"]))[1]
                eics <- tempEIC[minrt.scan:maxrt.scan]
                #set min to 0 and normalise
                eics <- eics-min(eics)
                if(max(eics)>0)
                        {
                        eics <- eics/max(eics)
                        }
                #fit gauss and let failures to fit through as corr=1
                fit <- try(nls(y ~ SSgauss(x, mu, sigma, h), data.frame(x =
1:length(eics), y = eics)),silent=T)
                if(class(fit) == "try-error")
                        {
                        corr[p] <- 1
                        } else
                        {
                        #calculate correlation of eics against gaussian fit
                        if(length(which(!is.na(eics-fitted(fit)))) > 4 &&
length(!is.na(unique(eics)))>4 && length(!is.na(unique(fitted(fit))))>4)
                                {
                                cor <- NULL
                                options(show.error.messages = FALSE)
                                cor <- try(cor.test(eics,fitted(fit),method="pearson",use="complete"))
                                options(show.error.messages = TRUE)
                                if (!is.null(cor))
                                        {
                                        if(cor$p.value <= 0.05) corr[p] <- cor$estimate else corr[p] <- 0
                                        } else corr[p] <- 0
                                } else corr[p] <- 0
                        }
                }
        filt.peakmat <- sub.peakmat[which(corr >= cor.val),]
        peakmat.new <- rbind(peakmat.new, filt.peakmat)
        n.rmpeaks <- nrow(sub.peakmat)-nrow(filt.peakmat)
        cat("Peakshape evaluation: sample ", sampnames(object)[f],"
",n.rmpeaks,"/",nrow(sub.peakmat)," peaks removed","\n")
        if (.Platform$OS.type == "windows") flush.console()
        }

peakmat.new <- peakmat.new[-1,]

object.new <- object
object.new@peaks <- peakmat.new
return(object.new) 
}