
Measure the Quality of Clustering

  • Dissimilarity/Similarity metric
    • Similarity is expressed in terms of a distance function d(i, j), which is typically a metric
    • The definitions of distance functions usually differ considerably for interval-scaled, boolean, categorical, ordinal, ratio-scaled, and vector variables
    • Weights should be associated with different variables based on applications and data semantics
  • Quality of clustering:
    • There is usually a separate “quality” function that measures the “goodness” of a cluster.
    • It is hard to define “similar enough” or “good enough”
      • The answer is typically highly subjective
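
A minimal sketch of the two ideas above: a weighted distance d(i, j) over variables of different types, and a separate "quality" function built on it. This is an illustration, not a method from the slide. The per-variable rules follow the general style of Gower's coefficient (absolute difference for interval-scaled values assumed to be pre-scaled to [0, 1], simple matching for categorical values), and the names weighted_distance and mean_within_cluster_distance are hypothetical.

```python
from itertools import combinations


def weighted_distance(a, b, kinds, weights):
    """Weighted dissimilarity d(i, j) between two records of mixed type.

    kinds[k] is "interval" (values assumed pre-scaled to [0, 1]) or
    "categorical"; weights[k] is the application-chosen weight of variable k.
    """
    total = 0.0
    for x, y, kind, w in zip(a, b, kinds, weights):
        if kind == "interval":
            diff = abs(x - y)                 # numeric gap in [0, 1]
        elif kind == "categorical":
            diff = 0.0 if x == y else 1.0     # simple matching
        else:
            raise ValueError(f"unsupported variable kind: {kind}")
        total += w * diff
    return total / sum(weights)               # weighted average over variables


def mean_within_cluster_distance(records, labels, dist):
    """One possible quality function: average distance between records
    that share a cluster label (lower means tighter clusters)."""
    pair_distances = [
        dist(records[i], records[j])
        for i, j in combinations(range(len(records)), 2)
        if labels[i] == labels[j]
    ]
    if not pair_distances:                    # no cluster has two members
        return 0.0
    return sum(pair_distances) / len(pair_distances)


kinds = ("interval", "categorical")
a, b = (0.25, "red"), (0.75, "red")

# Same pair of records, different weights, different distance.
print(weighted_distance(a, b, kinds, (1, 1)))   # 0.25
print(weighted_distance(a, b, kinds, (3, 1)))   # 0.375

records = [(0.0, "red"), (0.1, "red"), (0.9, "blue"), (1.0, "blue")]


def dist(p, q):
    return weighted_distance(p, q, kinds, (1, 1))


tight = mean_within_cluster_distance(records, [0, 0, 1, 1], dist)
loose = mean_within_cluster_distance(records, [0, 1, 0, 1], dist)
print(tight < loose)                            # True
```

Changing the weights changes which records count as close, and the quality score only ranks clusterings relative to the chosen distance. Any threshold for "good enough" remains a judgment call, as the slide notes.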
