Skip to main content
Topic: PCA & MDS examples (Read 7126 times) previous topic - next topic

PCA & MDS examples

The following code, below is an example using the FaahKO package for Principal Component Analysis (PCA) and Multi-Dimensional Scaling (MDS).

First load the required packages and then use xcms to get the filled peak data.
Code: [Select]
library(xcms)
library(pcaMethods)
cdfpath <- system.file("cdf", package = "faahKO")
cdffiles <- list.files(cdfpath, recursive = TRUE,full=T)
xset <- xcmsSet(cdffiles)
xsg <- group(xset)
xsg <- retcor(xsg)
xsg <- group(xsg,bw=10)
xsg <- fillPeaks(xsg)

Next we'll use some xcms commands to get the peak intensities out of the xcmsSet object and normalise the data. You can write you're own normalisation function or do mean centring etc.
Code: [Select]
## Get peak intensity values
values <- groupval(xsg, value="into")

## Numerical matrix with with samples in rows and variables as columns
data <- t(values)

## Normalize each mass signal to max=1
for (r in 1:ncol(data))
    data[,r] <- data[,r] / max(data[,r])

Finally we have something that we can put into the pca algorithm and plot.
Code: [Select]
##  PCA Example
pca.result <- pca(data, nPcs = 3, scale="none", cv="q2")

## Get the estimated principal axes (loadings)
loadings <- pca.result@loadings

## Get the estimated scores
scores <- pca.result@scores

## Now plot the scores
plotPcs(pca.result, type = "scores", col=as.integer(sampclass(xsg)) + 1)
dev.new()
## look to see if the model is valid
plot(pca.result)

Then have a look with the validity of the model using a Q2 values.
Loadings will let you know which metabolite is effecting the model and which metabolites are causing the differentiation.

Code: [Select]
plot(loadings[,1], loadings[,2], pch=16, cex=0.5)
text(loadings[,1], loadings[,2], rownames(values), col="red", cex=0.5)

For the MDS example we'll use the same data as before.
Code: [Select]
## MDS Example
library(MASS)

## MDS
data.dist <- dist(data)
mds <- isoMDS(data.dist)
plot(mds$points, type = "n")
text(mds$points, labels = rownames(data),col=as.integer(sampclass(xsg)) + 1)

To read up on the topic a good starting place is:
MDS - StatSoft
PCA - StatSoft
        PCA wikipedia

[attachment deleted by admin]