Topic: Dealing with outliers...

  • rebeccaweed
Dealing with outliers...
I was checking my data quality before running PCA/PLS, but I am confused by the different articles I have read on this. Basically, they say that Hotelling's T² identifies severe outliers and DModX identifies moderate outliers.

What I am seeing in my data:
          - My QC samples all cluster at the center of the PCA plot.
          - Some samples lie outside the Hotelling's T² ellipse.
          - Depending on how many PCA components I include in the DModX plot, different samples exceed the D-Crit value.
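
To make the last observation concrete, here is a rough sketch of how the two statistics relate to the number of components. It is not the exact calculation my software performs: the function name `pca_outlier_stats` is made up for illustration, the data are only mean-centered (no unit-variance scaling), and the residual distance is left without the pooled-residual normalization that a DModX plot would normally apply.

```python
import numpy as np

def pca_outlier_stats(X, n_components):
    """Return (T2, residual distance) per sample for a PCA with n_components.

    Hypothetical helper for illustration; assumes n_components is smaller
    than the number of variables and that the data are only mean-centered.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    A = n_components
    Xc = X - X.mean(axis=0)                      # mean-center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :A] * s[:A]                         # scores for the first A components
    P = Vt[:A].T                                 # loadings for the first A components
    score_var = s[:A] ** 2 / (n - 1)             # variance of each score vector
    t2 = np.sum(T ** 2 / score_var, axis=1)      # Hotelling's T2: distance inside the model
    E = Xc - T @ P.T                             # part of X the model does not explain
    resid = np.sqrt(np.sum(E ** 2, axis=1) / (k - A))  # unnormalized distance to the model
    return t2, resid

# Synthetic data only, to show that the "worst" sample can change with A.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 6))
for A in (1, 2, 3):
    t2, resid = pca_outlier_stats(X_demo, A)
    print(A, "largest T2:", int(np.argmax(t2)), "largest residual:", int(np.argmax(resid)))
```

Each added component moves variation out of the residual and into the scores, so a sample can drop out of the residual plot and show up in T² instead. That would explain why the flagged samples change with the number of components.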

My questions:
          - How should I decide how many PCA components to include in the DModX plot?
          - Should I remove all outliers flagged by the DModX plot and the Hotelling's T² plot from my data set so they do not skew my PCA/PLS?
          - Is it possible that the outliers are of interest and that I should keep them in further analysis?

I really appreciate any and all advice!!!!
  • Washington State University Institute of Biological Sciences