Metabolomics Society Forum

Title: PLS with xcms in R
Post by: hpbenton on August 10, 2011, 07:05:49 PM
A quick example using the pls package. A good description is in R News: PLS (http://cran.r-project.org/doc/Rnews/Rnews_2006-3.pdf), and the package itself is here: PLS package (http://cran.r-project.org/web/packages/pls/)

First, load the required packages and the example faahKO dataset. We'll create a peak list using the xcms pipeline.
Code: [Select]
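library(xcms)
library(faahKO)
library(pls)
## locate the example CDF files shipped with the faahKO package
cdfpath <- system.file("cdf", package = "faahKO")
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)
## peak detection, grouping across samples, then filling in missing peaks
faahko <- xcmsSet(cdffiles)
faahko <- group(faahko)
faahko <- fillPeaks(faahko)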
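## groupval() defaults to value = "index" (row indices into the peak table),
## so request the integrated intensities ("into") explicitly
values <- groupval(faahko, value = "into")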
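
groupval() returns one row per peak group and one column per sample, whereas the regression step needs samples in rows. Here is a minimal sketch of that orientation, using a small made-up matrix in place of the real groupval() output (toy_values, toy_X and all the numbers are purely illustrative, not faahKO data):

```r
# Toy stand-in for groupval() output: 4 peak groups (rows) x 3 samples (columns).
# All names and numbers are invented for illustration.
toy_values <- matrix(
  c(10.5, 11.2,  9.8,
    200,  180,   220,
    3.1,  2.9,   3.4,
    55,   60,    58),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), paste0("sample", 1:3))
)

dim(toy_values)   # peak groups x samples

# Transpose so that each row is a sample and each column a feature
toy_X <- t(toy_values)
dim(toy_X)        # samples x peak groups
```

This is why the transposed matrix, t(values), is what goes into the data frame in the next step.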

Now that the intensity values from the peak list are stored in values, we can use the pls package. We will regress the sample class on the metabolite intensities (Met). Of course, the class could be replaced with any other variable.
Code: [Select]
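## numeric class codes (1/2) serve as the response
val <- data.frame(as.numeric(sampclass(faahko)))
## store the transposed intensities (samples x peak groups) as one matrix column
val[, 2] <- round(t(values), 4)
colnames(val) <- c("class", "Met")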

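## fit a 3-component PLS model with leave-one-out cross-validation
faahko.pls <- plsr(class ~ Met, ncomp = 3, data = val, validation = "LOO")
## cross-validated prediction error against the number of components
plot(RMSEP(faahko.pls), legendpos = "topright")
dev.new()
## predicted versus measured class for the 3-component model
plot(faahko.pls, ncomp = 3, asp = 1, line = TRUE)
dev.new()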
## to draw the score and loading plots manually

scoreplot(faahko.pls, comps = 1:3, identify = FALSE, type = "p", col=rep(c("red", "blue"), c(6,6)), pch=16 )
loadingplot(faahko.pls, comps = 1:3, identify = FALSE, type= "p")
## use identify = TRUE if you want to identify your loadings interactively
## the selected points will be printed to the R console
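
If you want to try the regression step without running xcms first, here is a hedged, self-contained sketch on simulated data. Everything in it (grp, X, toy, toy.pls, the 20 features and the shift added to the first three) is made up for illustration. It also builds the data frame with I(), which is an alternative to the column assignment used above and keeps the whole matrix as the single column Met:

```r
library(pls)

set.seed(1)                              # reproducible toy data
n <- 12                                  # number of samples, as in faahKO
p <- 20                                  # made-up number of features
grp <- rep(c(1, 2), each = n / 2)        # numeric class codes
X <- matrix(rnorm(n * p), nrow = n)      # samples x features, pure noise
X[, 1:3] <- X[, 1:3] + 2 * (grp - 1)     # shift three features in class 2

# I() keeps the matrix as one column named "Met", so the formula
# class ~ Met uses every feature as a predictor
toy <- data.frame(class = grp, Met = I(X))

toy.pls <- plsr(class ~ Met, ncomp = 3, data = toy, validation = "LOO")
dim(scores(toy.pls))                     # samples x components
```

The same scoreplot() and loadingplot() calls can then be pointed at toy.pls to get a feel for the plots before moving to real data.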



Finally, to get numerical results from the cross-validation and other useful information, print the summary of the fitted object.
Code: [Select]
summary(faahko.pls)
Data:    X dimension: 12 407
         Y dimension: 12 1
Fit method: kernelpls
Number of components considered: 3

VALIDATION: RMSEP
Cross-validated using 12 leave-one-out segments.
       (Intercept)  1 comps  2 comps  3 comps
CV          0.5455   0.2885   0.1602   0.1468
adjCV       0.5455   0.2875   0.1561   0.1421

TRAINING: % variance explained
       1 comps  2 comps  3 comps
X        60.65    72.91    78.83
class    77.10    97.13    99.20
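
The figures in the summary can also be pulled out as numbers rather than read off the printout. Below is a hedged sketch on simulated data so that it runs without xcms or faahKO; the toy data and the names toy.pls, cv_rmsep and best_ncomp are made up, while RMSEP() and explvar() come from the pls package. With the real model you would pass faahko.pls instead of toy.pls.

```r
library(pls)

# Simulated stand-in for the fitted model (illustrative data only)
set.seed(1)
n <- 12
grp <- rep(c(1, 2), each = n / 2)
X <- matrix(rnorm(n * 20), nrow = n)
X[, 1:3] <- X[, 1:3] + 2 * (grp - 1)
toy <- data.frame(class = grp, Met = I(X))
toy.pls <- plsr(class ~ Met, ncomp = 3, data = toy, validation = "LOO")

# Cross-validated RMSEP: the first entry is the intercept-only model,
# followed by one entry per number of components
cv_rmsep <- drop(RMSEP(toy.pls, estimate = "CV")$val)
cv_rmsep

# Number of components with the lowest cross-validated error
# (subtract 1 because the first entry is the intercept-only model)
best_ncomp <- which.min(cv_rmsep) - 1

# Percentage of X variance explained by each component (not cumulative);
# the summary printout reports the cumulative sum of these values
explvar(toy.pls)
cumsum(explvar(toy.pls))
```

Picking the component count by the minimum CV error is only one simple choice; with 12 samples the leave-one-out estimate is noisy, so treat best_ncomp as a rough guide.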