
Dealing with outliers...

I was checking my data quality before running PCA/PLS, but I am confused by the different articles I have read on the subject. Basically, they say that Hotelling's T2 identifies severe outliers and DModX identifies moderate outliers.

What I am seeing in my data:
          - My QCs all cluster at the center of the PCA scores plot.
          - I have some samples located outside the Hotelling's T2 ellipse.
          - Depending on how many PCA components I include in the DModX plot, different samples exceed the D-Crit value.
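
To make the last observation concrete, here is a minimal sketch in Python with NumPy, not tied to any particular software package, of how the two statistics could be computed for a chosen number of components A. The function name `pca_outlier_stats`, the autoscaling step, and the synthetic data are all assumptions for illustration. The residual distance is a simplified, unnormalized stand-in for DModX: it leaves out the correction factors and the D-Crit limit that commercial packages apply.

```python
import numpy as np

def pca_outlier_stats(X, n_components):
    """Return (T2, residual distance) per sample for a PCA model with n_components.

    Hypothetical helper: the residual distance is a simplified DModX-like
    quantity, not the exact value reported by any specific package.
    """
    # Autoscale: mean-center and scale each variable to unit variance (assumed preprocessing).
    Xc = X - X.mean(axis=0)
    Xs = Xc / Xc.std(axis=0, ddof=1)

    # PCA via SVD; rows of Vt are the loading vectors.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    P = Vt[:n_components].T          # loadings, shape (k, A)
    T = Xs @ P                       # scores, shape (n, A)

    # Hotelling's T2: squared scores weighted by the variance of each component.
    t2 = np.sum(T**2 / T.var(axis=0, ddof=1), axis=1)

    # Residual distance: per-sample residual standard deviation after A components.
    E = Xs - T @ P.T
    k = Xs.shape[1]
    dmodx = np.sqrt(np.sum(E**2, axis=1) / (k - n_components))
    return t2, dmodx

# Synthetic data (assumption): 40 samples, 10 variables, 2 underlying factors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 10)) + 0.3 * rng.normal(size=(40, 10))
X[0] += 3.0 * rng.normal(size=10)    # sample 0 is perturbed away from the factor structure

# Refit with different numbers of components and see which samples stand out.
for a in (1, 2, 3, 4):
    t2, dmodx = pca_outlier_stats(X, a)
    top = np.argsort(dmodx)[::-1][:3]
    print(f"A={a}: largest residual distances at samples {top.tolist()}, "
          f"max T2 at sample {int(np.argmax(t2))}")
```

Because the residuals are whatever the first A components fail to explain, the ranking of samples by residual distance can change each time A changes, which matches what I see in the DModX plot.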

My questions:
          - How should I decide how many PCA components to include in the DModX plot?
          - Should I remove all outliers detected by the DModX plot and the Hotelling's T2 plot from my data set to prevent them from skewing my PCA/PLS?
          - Is it possible that the outliers are of interest, in which case I should keep them in for further analysis?

I really appreciate any and all advice!