Assessing Clustering Tendency

  • Assess if non-random structure exists in the data by measuring the probability that the data is generated by a uniform data distribution
  • Test spatial randomness with a statistical test: the Hopkins Statistic
    • Given a dataset D regarded as a sample of a random variable o, determine how far away o is from being uniformly distributed in the data space
    • Generate n points, p1, …, pn, uniformly at random in the data space. For each pi, find its nearest neighbor in D: xi = min{dist(pi, v)} where v in D
    • Sample n points, q1, …, qn, uniformly from D. For each qi, find its nearest neighbor in D – {qi}: yi = min{dist (qi, v)} where v in D and v ≠ qi
    • Calculate the Hopkins Statistic:

\[H=\frac{\sum_{i=1}^{n}y_{i}}{\sum_{i=1}^{n}x_{i}+\sum_{i=1}^{n}y_{i}}\]

    • If D is uniformly distributed, ∑ xi and ∑ yi will be close to each other and H is close to 0.5. If D is clustered, ∑ yi is much smaller than ∑ xi in expectation, so H is close to 0
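
The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation. It assumes that the data space is the axis-aligned bounding box of D, that distances are Euclidean, and that a brute-force nearest-neighbor search is acceptable. The names hopkins_statistic and nearest_distance and the two toy datasets are hypothetical. It follows the formula as written here, with ∑ yi in the numerator; some references put ∑ xi in the numerator instead, which reverses the interpretation.

```python
import math
import random


def nearest_distance(point, data, skip_index=None):
    """Distance from `point` to its nearest neighbor in `data`.

    If `skip_index` is given, data[skip_index] is ignored; this is used
    when `point` is itself a member of `data`.
    """
    return min(
        math.dist(point, v)
        for j, v in enumerate(data)
        if j != skip_index
    )


def hopkins_statistic(data, n, rng):
    """Return H = sum(y) / (sum(x) + sum(y)) for a list of points."""
    dims = len(data[0])
    # Assumption: the data space is the bounding box of the dataset.
    lows = [min(p[d] for p in data) for d in range(dims)]
    highs = [max(p[d] for p in data) for d in range(dims)]

    # x_i: distance from a uniformly generated point p_i
    # to its nearest neighbor in D.
    x_sum = 0.0
    for _ in range(n):
        p = [rng.uniform(lows[d], highs[d]) for d in range(dims)]
        x_sum += nearest_distance(p, data)

    # y_i: distance from a sampled data point q_i
    # to its nearest neighbor in D - {q_i}.
    y_sum = 0.0
    for i in rng.sample(range(len(data)), n):
        y_sum += nearest_distance(data[i], data, skip_index=i)

    return y_sum / (x_sum + y_sum)


rng = random.Random(0)

# Toy dataset 1: points spread uniformly over the unit square.
uniform_data = [(rng.random(), rng.random()) for _ in range(300)]

# Toy dataset 2: three tight Gaussian blobs.
centers = [(0.2, 0.2), (0.8, 0.3), (0.5, 0.8)]
clustered_data = [
    (rng.gauss(cx, 0.02), rng.gauss(cy, 0.02))
    for cx, cy in centers
    for _ in range(100)
]

h_uniform = hopkins_statistic(uniform_data, 30, rng)
h_clustered = hopkins_statistic(clustered_data, 30, rng)
print(f"uniform:   H = {h_uniform:.2f}")
print(f"clustered: H = {h_clustered:.2f}")
```

On data like these, the uniform set should give a value near 0.5 and the clustered set a value well below 0.5. For larger datasets, the brute-force search would normally be replaced by a spatial index such as a k-d tree.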
