Measuring Clustering Quality

  • Two methods: extrinsic vs. intrinsic
  • Extrinsic: supervised, i.e., the ground truth is available
    • Compare a clustering against the ground truth using a clustering quality measure
    • Ex. BCubed precision and recall metrics
  • Intrinsic: unsupervised, i.e., the ground truth is unavailable
    • Evaluate the quality of a clustering by how well separated the clusters are and how compact each cluster is
    • Ex. Silhouette coefficient
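
The two example measures above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation. The function names `bcubed` and `silhouette` and the toy data are hypothetical. The BCubed variant shown counts each object as sharing a cluster and a class with itself (some textbook definitions exclude the object itself), and the silhouette sketch assumes Euclidean distance and assigns a score of 0 to objects in singleton clusters.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def bcubed(pred, truth):
    """Extrinsic: BCubed (precision, recall), averaged over all objects.

    Assumption: each object is counted as sharing a cluster and a
    class with itself, so no denominator is ever zero.
    """
    n = len(pred)
    cluster_size = Counter(pred)          # objects per predicted cluster
    class_size = Counter(truth)           # objects per ground-truth class
    both = Counter(zip(pred, truth))      # objects per (cluster, class) pair
    precision = sum(both[c, t] / cluster_size[c] for c, t in zip(pred, truth)) / n
    recall = sum(both[c, t] / class_size[t] for c, t in zip(pred, truth)) / n
    return precision, recall


def silhouette(points, labels):
    """Intrinsic: mean silhouette coefficient over all objects."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # assumed convention: singleton clusters score 0
        # a: compactness, mean distance to the other members of p's cluster
        # (dist(p, p) == 0, so dividing by len(own) - 1 excludes p itself)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: separation, mean distance to the nearest other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: five labeled objects, four 2-D points.
truth = ["x", "x", "x", "y", "y"]
pred = [0, 0, 1, 1, 1]
print(bcubed(pred, truth))

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(silhouette(points, [0, 0, 1, 1]))  # compact, well-separated clusters
print(silhouette(points, [0, 1, 0, 1]))  # clusters that cut across the groups
```

In this sketch a silhouette value near 1 indicates compact, well-separated clusters, while a negative value suggests that objects sit closer to another cluster than to their own. BCubed precision and recall both equal 1 only when the clustering matches the ground truth exactly.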

