Approach II: Finding Outliers in Subspaces
- Extending conventional outlier detection: the outliers found are hard to interpret
- Find outliers in much lower dimensional subspaces: easy to interpret why and to what extent the object is an outlier
- E.g., find outlier customers in a certain subspace: average transaction amount far above average and purchase frequency far below average
- Ex. A grid-based subspace outlier detection method
- Project data onto various subspaces to find an area whose density is much lower than average
- Discretize the data into a grid: partition each dimension into φ equi-depth (why?) ranges, so each range holds a fraction 1/φ of the objects
- Search for regions that are significantly sparse
- Consider a k-dimensional cube C formed by k ranges on k dimensions, in a data set of n objects
- If objects are independently distributed, the expected number of objects falling into a k-dimensional region is \( (1/\varphi)^{k} n = f^{k} n \), where f = 1/φ; the standard deviation is
\[ \sqrt{f^{k}(1-f^{k})n} \]
- The sparsity coefficient of cube C:
\[ S(C)=\frac{n(C)-f^{k}n}{\sqrt{f^{k}(1-f^{k})n}} \]
- If S(C) < 0, C contains fewer objects than expected
- The more negative, the sparser C is and the more likely the objects in C are outliers in the subspace
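
The grid-based method above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method's reference implementation. The function names (`equi_depth_bins`, `sparsity_coefficient`, `sparsest_cubes`) are hypothetical, the equi-depth split is done by rank, and every k-dimensional subspace is searched by brute force, which is only feasible for small data. Empty cubes are skipped because they contain no objects to report as outliers.

```python
import math
from collections import Counter
from itertools import combinations


def equi_depth_bins(values, phi):
    """Rank-based equi-depth split: return a bin index in 0..phi-1 per value,
    so that each bin holds roughly len(values) / phi objects."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * phi // len(values)
    return bins


def sparsity_coefficient(count, n, phi, k):
    """S(C) = (n(C) - f^k n) / sqrt(f^k (1 - f^k) n), with f = 1/phi."""
    f_k = (1.0 / phi) ** k
    expected = f_k * n
    std = math.sqrt(f_k * (1.0 - f_k) * n)
    return (count - expected) / std


def sparsest_cubes(data, phi, k, top=5):
    """Brute-force search over all k-dimensional subspaces (assumption of
    this sketch). Returns up to `top` non-empty cubes, most negative S first,
    as tuples (S, dims, cell, count)."""
    n = len(data)
    d = len(data[0])
    # Discretize every dimension once into phi equi-depth ranges.
    binned = [equi_depth_bins([row[j] for row in data], phi) for j in range(d)]
    results = []
    for dims in combinations(range(d), k):
        # Count the objects in each occupied grid cell of this subspace.
        counts = Counter(tuple(binned[j][i] for j in dims) for i in range(n))
        for cell, count in counts.items():
            s = sparsity_coefficient(count, n, phi, k)
            results.append((s, dims, cell, count))
    results.sort(key=lambda r: r[0])
    return results[:top]
```

Calling `sparsest_cubes(data, phi=2, k=2)` on a list of numeric rows returns the occupied cubes with the most negative sparsity coefficients. The objects inside those cubes are the outlier candidates, and the `dims` and `cell` fields indicate in which subspace and in which ranges they are unusual.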