Current Slide

Small screen detected. You are viewing the mobile version of SlideWiki. If you wish to edit slides you will need to use a larger device.

Distance-Based Outlier Detection: A Grid-Based Method

  • Why efficiency is still a concern? When the complete set of objects cannot be held into main memory, cost I/O swapping
  • The major cost: (1) each object tests against the whole data set, why not only its close neighbor? (2) check objects one by one, why not group by group?
  • Grid-based method (CELL): Data space is partitioned into a multi-D grid. Each cell is a hyper cube with diagonal length r/2

  • Pruning using the level-1 & level 2 cell properties:
    • For any possible point x in cell C and any possible point y in a level-1 cell, dist(x,y) ≤ r
    • For any possible point x in cell C and any point y such that dist(x,y) ≥ r, y is in a level-2 cell
  • Thus we only need to check the objects that cannot be pruned, and even for such an object o, only need to compute the distance between o and the objects in the level-2 cells (since beyond level-2, the distance from o is more than r)


Speaker notes:

Content Tools

Sources

There are currently no sources for this slide.