Clustering High-Dimensional Data

  •  Clustering high-dimensional data (How high is high-D in clustering?)
    • Many applications: text documents, DNA micro-array data
    • Major challenges: 
      • Many irrelevant dimensions may mask clusters
      • Distance measures become meaningless because all points tend to be nearly equidistant
      • Clusters may exist only in some subspaces
  • Methods
    • Subspace clustering: Search for clusters that exist in subspaces of the given high-dimensional data space
      • CLIQUE, ProClus, and bi-clustering approaches
    • Dimensionality reduction approaches: Construct a much lower-dimensional space and search for clusters there (new dimensions may be constructed by combining some dimensions of the original data)
      • Dimensionality reduction methods and spectral clustering
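
The equidistance challenge listed above can be illustrated with a small experiment. The following is a minimal sketch, not part of the original slide: it assumes points drawn uniformly from the unit hypercube and Euclidean distance, and the function name `relative_contrast` and its parameters are hypothetical. It measures how far apart the nearest and farthest neighbours of a query point are, relative to the nearest one; this value tends to shrink as the number of dimensions grows.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points random points in [0, 1]^dim.

    Assumption for illustration: data is uniform in the unit hypercube.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # Print the contrast for increasing dimensionality.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

A small contrast means the nearest and farthest points are almost the same distance away, which is why distance-based clustering degrades in the full space and why subspace or dimensionality reduction methods are used instead.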
