Clustering-Based Outlier Detection (1 & 2): Not Belonging to Any Cluster, or Far from the Closest One

  • An object is an outlier if (1) it does not belong to any cluster, (2) there is a large distance between the object and its closest cluster, or (3) it belongs to a small or sparse cluster
  • Case 1: Does not belong to any cluster
    • Identify animals that are not part of a flock, using a density-based clustering method such as DBSCAN
  • Case 2: Far from its closest cluster
    • Using k-means, partition the data points into clusters
    • For each object o, assign an outlier score based on its distance from its closest center c_o
      • If dist(o, c_o) / avg_dist(c_o) is large, o is likely an outlier, where avg_dist(c_o) is the average distance from c_o to the objects assigned to it
  • Example (intrusion detection): Consider the similarity between data points and the clusters in a training data set
  • Use a training set to find patterns of “normal” data, e.g., frequent itemsets in each segment, and cluster similar connections into groups
  • Compare new data points with the mined clusters; outliers are possible attacks
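
Case 1 can be illustrated with a minimal sketch of DBSCAN in plain Python. The animal positions, the eps radius, and the min_pts value are assumptions chosen for illustration; they are not from the slide. Points that DBSCAN leaves as noise (label -1) belong to no cluster and are reported as outliers.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster_id  # i is a core point: start a new cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # noise reachable from a core point is a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(j_neighbors)
        cluster_id += 1
    return labels

# Hypothetical positions: five animals in a flock and one stray.
animals = [(0.0, 0.0), (0.5, 0.2), (0.3, 0.6), (0.8, 0.7), (0.2, 0.9), (5.0, 5.0)]
labels = dbscan(animals, eps=1.0, min_pts=3)  # eps and min_pts are illustrative choices
outliers = [p for p, label in zip(animals, labels) if label == -1]
print(outliers)  # [(5.0, 5.0)]
```

The result depends on eps and min_pts: a larger eps or a smaller min_pts absorbs more points into clusters and reports fewer outliers.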

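
The Case 2 score, dist(o, c_o) / avg_dist(c_o), can be sketched the same way. This is a rough illustration under stated assumptions: the sample points, the seeding of k-means with the first k points, and the threshold of 2.0 are all hypothetical choices, not part of the slide.

```python
import math

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm. Seeding with the first k points is an
    assumption made for reproducibility, not a recommended initialization."""
    centers = list(points[:k])
    for _ in range(iters):
        assign = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centers

def avg_dists(points, centers):
    """avg_dist(c): mean distance from center c to the points assigned to it."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for p in points:
        c = nearest(p, centers)
        sums[c] += math.dist(p, centers[c])
        counts[c] += 1
    return [s / n if n else 0.0 for s, n in zip(sums, counts)]

def outlier_score(o, centers, avg):
    """dist(o, c_o) / avg_dist(c_o), where c_o is the center closest to o."""
    c = nearest(o, centers)
    return math.dist(o, centers[c]) / avg[c] if avg[c] > 0 else 0.0

# Hypothetical data: two tight groups plus one point far from its closest center.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (1, 1),
          (10, 11), (11, 10), (11, 11), (4, 4)]
centers = kmeans(points, k=2)
avg = avg_dists(points, centers)

THRESHOLD = 2.0  # hypothetical cutoff for "large"
flagged = [p for p in points if outlier_score(p, centers, avg) > THRESHOLD]
print(flagged)  # [(4, 4)]
```

The same outlier_score function can be applied to points that were not in the training set, which matches the intrusion detection setting: the centers and average distances come from "normal" training data, and a new connection with a large score is a possible attack.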
