
[QUESTION] Package that would filter the "bad gaussian curves" kept by XCMS

Hello everyone,

This is my first message to this forum, so first of all I'd like to thank the administrators, who have already helped me by creating it. I have found a lot of answers to my questions here!

Now, I am hoping that some of you might be able to answer this question:

Like many of you, we typically process MS data with XCMS and perform deconvolution with the help of CAMERA. As you already know, one challenge of metabolomic assays is dealing with the huge number of observed variables.

Therefore, I'd like to know whether an XCMS-dependent R package already exists that filters out the "bad gaussians" kept by XCMS and processed in CAMERA. It is quite time-consuming to check every single EIC, and there often appear to be "wrong matches" of adducts/fragments in CAMERA due to peak shape.

I might be dreaming but that would be great!

I hope I was clear enough, and thank you for your time,

Antoine Escourrou


Re: [QUESTION] Package that would filter the "bad gaussian curves" kept by XCMS

Reply #2
Thank you Jordi for this quick answer.

IPO is a package that we already use too, and it is indeed part of the answer to my question, since it automates the optimization of XCMS parameters based on the data. This makes the data processing more reliable, but there is still no function to filter the "bad gaussians" that remain even after XCMS and IPO processing.

So my question remains, despite your useful answer!

Thank you again,

Antoine

Re: [QUESTION] Package that would filter the "bad gaussian curves" kept by XCMS

Reply #3
Dear Antoine,

You could also look at XCMS Online. In the mobile app we have released a quick EIC sorter named Hot or Not. This allows you to quickly annotate the good and bad EICs in the dataset. The annotations can then be used as a filter in the table view.

Cheers, Paul
~~
H. Paul Benton
Scripps Research Institute
If you have an error with XCMS Online please send me the JOBID and submit an error via the XCMS Online contact page

Re: [QUESTION] Package that would filter the "bad gaussian curves" kept by XCMS

Reply #4
Thank you Paul for this answer! I did not know at all that there was a mobile app for XCMS Online. This piece of information will be useful for me and surely for others as well!

This is really user-friendly and allows me to save time during re-processing, which is what I was ultimately looking for, so thank you again!

Re: [QUESTION] Package that would filter the "bad gaussian curves" kept by XCMS

Reply #5
Before 2011, I found a script on the XCMS Google Groups forum that was written by Tony Larson. I updated it and now use it regularly to filter out peaks that don't meet Gaussian criteria. I run this script immediately after peak picking in XCMS.

I hope this helps,
Krista

The script is as follows:


Code:
#original file version from the Google Groups for xcms from Tony Larson
#Krista Longnecker updated this August 2011

#peakShape function to remove non-gaussian peaks from an xcmsSet
#code originally had cor.val = 0.9; 0.5 is too low (not doing enough pruning)
peakShape <- function(object, cor.val=0.9)
{
require(xcms)

files <- object@filepaths
peakmat <- object@peaks
peakmat.new <- matrix(-1,1,ncol(peakmat))
colnames(peakmat.new) <- colnames(peakmat)
for(f in 1:length(files))
        {
        xraw <- xcmsRaw(files[f], profstep=0)
        sub.peakmat <- peakmat[which(peakmat[,"sample"]==f),,drop=FALSE]
        corr <- numeric()
        for (p in seq_len(nrow(sub.peakmat)))
                {
                #extract using rawEIC method +/- 0.001 m/z to give smoother traces
                tempEIC <- as.integer(rawEIC(xraw,mzrange=c(sub.peakmat[p,"mzmin"]-0.001,sub.peakmat[p,"mzmax"]+0.001))$intensity)
                minrt.scan <- which.min(abs(xraw@scantime-sub.peakmat[p,"rtmin"]))[1]
                maxrt.scan <- which.min(abs(xraw@scantime-sub.peakmat[p,"rtmax"]))[1]
                eics <- tempEIC[minrt.scan:maxrt.scan]
                #set min to 0 and normalise
                eics <- eics-min(eics)
                if(max(eics)>0)
                        {
                        eics <- eics/max(eics)
                        }
                #fit gauss and let failures to fit through as corr=1
                fit <- try(nls(y ~ SSgauss(x, mu, sigma, h), data.frame(x = 1:length(eics), y = eics)),silent=TRUE)
                if(inherits(fit, "try-error"))
                        {
                        corr[p] <- 1
                        } else
                        {
                        #calculate correlation of eics against gaussian fit
                        if(length(which(!is.na(eics-fitted(fit)))) > 4 &&
sum(!is.na(unique(eics)))>4 && sum(!is.na(unique(fitted(fit))))>4)
                                {
                                #a failed cor.test returns a "try-error" object, which is scored as 0
                                ct <- try(cor.test(eics,fitted(fit),method="pearson"),silent=TRUE)
                                if (!inherits(ct, "try-error") && !is.na(ct$p.value))
                                        {
                                        if(ct$p.value <= 0.05) corr[p] <- ct$estimate else corr[p] <- 0
                                        } else corr[p] <- 0
                                } else corr[p] <- 0
                        }
                }
        filt.peakmat <- sub.peakmat[which(corr >= cor.val),,drop=FALSE]
        peakmat.new <- rbind(peakmat.new, filt.peakmat)
        n.rmpeaks <- nrow(sub.peakmat)-nrow(filt.peakmat)
        cat("Peakshape evaluation: sample ", sampnames(object)[f]," ",n.rmpeaks,"/",nrow(sub.peakmat)," peaks removed","\n")
        if (.Platform$OS.type == "windows") flush.console()
        }

peakmat.new <- peakmat.new[-1,,drop=FALSE]

object.new <- object
object.new@peaks <- peakmat.new
return(object.new)
}
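
For anyone who wants to see the core test in isolation, here is a minimal sketch in base R, assuming a synthetic trace stands in for a real EIC. The helper name gauss_cor and the explicit Gaussian formula with rough start values are illustrative stand-ins only; the script above uses the SSgauss self-starting model from xcms on real raw data instead. Note also that this sketch returns NA when the fit fails, whereas the script above lets such peaks through with corr=1.

```r
# Hypothetical helper (not part of xcms): score how Gaussian a trace looks.
gauss_cor <- function(y) {
  y <- y - min(y)                     # set the baseline to 0
  if (max(y) > 0) y <- y / max(y)     # normalise to a maximum of 1
  x <- seq_along(y)
  # Explicit Gaussian model; start values are rough guesses taken from the data
  fit <- try(nls(y ~ h * exp(-(x - mu)^2 / (2 * sigma^2)),
                 start = list(mu = which.max(y), sigma = length(y) / 6, h = 1)),
             silent = TRUE)
  if (inherits(fit, "try-error")) return(NA_real_)   # fit failed
  # Pearson correlation between the trace and its fitted Gaussian
  suppressWarnings(cor(y, fitted(fit), method = "pearson"))
}

set.seed(1)
x <- 1:41
good <- 1000 * exp(-(x - 21)^2 / (2 * 5^2)) + rnorm(41, sd = 20)  # noisy Gaussian
bad  <- runif(41, 0, 1000)                                         # shapeless noise

r_good <- gauss_cor(good)
r_bad  <- gauss_cor(bad)
print(c(good = r_good, bad = r_bad))
```

A peak would then be kept when its score is at or above the chosen cor.val, which is the knob that controls how strict the filter is.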


Re: [QUESTION] Package that would filter the "bad gaussian curves" kept by XCMS

Reply #6
Interesting. But centWave already outputs "egauss" (the RMSE of the Gaussian fit). Couldn't you just filter by that? As far as I can tell from the code, it does basically the same thing, just using correlation instead of RMSE.
Blog: stanstrup.github.io
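
A sketch of what such a filter might look like, assuming the peak matrix carries an egauss column (as centWave reports when fitgauss is TRUE). The matrix below is made-up data and the 0.1 cut-off is an arbitrary placeholder, not a recommended value; with a real xcmsSet the matrix would be the object's peaks slot.

```r
# Made-up peak matrix standing in for xset@peaks (values are illustrative only)
peaks <- cbind(mz     = c(150.05, 200.10, 250.15, 300.20),
               rt     = c(60, 120, 180, 240),
               egauss = c(0.02, 0.35, NA, 0.08))

egauss.max <- 0.1   # hypothetical threshold; tune it on your own data

# Keep peaks whose Gaussian-fit RMSE is at or below the threshold.
# Peaks without a fit (NA) are dropped here, which is a design choice.
keep <- !is.na(peaks[, "egauss"]) & peaks[, "egauss"] <= egauss.max
peaks.filt <- peaks[keep, , drop = FALSE]
print(peaks.filt)
```

Whether to drop or keep peaks with a missing egauss value is worth deciding deliberately, since the peakShape script takes the opposite approach and keeps peaks whose fit failed.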

 

Re: [QUESTION] Package that would filter the "bad gaussian curves" kept by XCMS

Reply #7
I do set fitgauss to TRUE when I run the centWave step. However, I was finding that I still had peaks that were not as high quality as I had hoped. The peakShape code allows me to be stricter in requiring peaks to meet a Gaussian fit. The code also has the benefit that I can choose how strict I want to be by raising or lowering the correlation value.