### Assessing Clustering Tendency

- Assess if non-random structure exists in the data by measuring the probability that the data is generated by a uniform data distribution
- Test spatial randomness with a statistical test: the Hopkins statistic
- Given a dataset D regarded as a sample of a random variable o, determine how far away o is from being uniformly distributed in the data space
- Sample *n* points, *p1, …, pn*, uniformly from the data space of D. For each *pi*, find its nearest neighbor in D: *xi* = *min{dist(pi, v)}* where *v* in D
- Sample *n* points, *q1, …, qn*, uniformly from D. For each *qi*, find its nearest neighbor in D – {*qi*}: *yi* = *min{dist(qi, v)}* where *v* in D and *v* ≠ *qi*
- Calculate the Hopkins Statistic:

\[H=\frac{\sum_{i=1}^{n}y_{i}}{\sum_{i=1}^{n}x_{i}+\sum_{i=1}^{n}y_{i}}\]

- If D is uniformly distributed, ∑ xi and ∑ yi will be close to each other and H is close to 0.5. If D is clustered, ∑ yi will be much smaller than ∑ xi and H is close to 0
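
The procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `nearest_distance` and `hopkins_statistic` are hypothetical, distances are Euclidean, and the "data space" is taken to be the axis-aligned bounding box of D. The points *pi* are placed uniformly in that box rather than drawn from D itself, since a point drawn from D would be its own nearest neighbor at distance 0.

```python
import math
import random


def nearest_distance(point, data, skip_index=None):
    """Distance from `point` to its nearest neighbor in `data` (brute force)."""
    return min(
        math.dist(point, v)
        for j, v in enumerate(data)
        if j != skip_index
    )


def hopkins_statistic(data, n, rng):
    """Return H = sum(y) / (sum(x) + sum(y)) for a list of coordinate tuples."""
    dims = range(len(data[0]))
    # Assumption: the data space is the bounding box of the data set.
    lows = [min(v[d] for v in data) for d in dims]
    highs = [max(v[d] for v in data) for d in dims]

    # x_i: distance from a uniformly placed point p_i to its nearest data point.
    x = [
        nearest_distance([rng.uniform(lows[d], highs[d]) for d in dims], data)
        for _ in range(n)
    ]

    # y_i: distance from a sampled data point q_i to its nearest other data point.
    sampled = rng.sample(range(len(data)), n)
    y = [nearest_distance(data[i], data, skip_index=i) for i in sampled]

    return sum(y) / (sum(x) + sum(y))


rng = random.Random(0)

# Synthetic data: 400 uniform points, and 400 points in two tight clusters.
uniform_data = [(rng.random(), rng.random()) for _ in range(400)]
centers = [(0.2, 0.2), (0.8, 0.8)]
clustered_data = [
    (rng.gauss(cx, 0.02), rng.gauss(cy, 0.02))
    for cx, cy in centers
    for _ in range(200)
]

h_uniform = hopkins_statistic(uniform_data, 40, rng)
h_clustered = hopkins_statistic(clustered_data, 40, rng)
print(f"uniform:   H = {h_uniform:.2f}")
print(f"clustered: H = {h_clustered:.2f}")
```

With ∑ yi in the numerator, the uniform set should give a value near 0.5, while the clustered set should fall well below it because its within-cluster distances yi are small compared with the xi. The brute-force search costs O(n·|D|) distance computations; for larger data sets a spatial index such as a k-d tree would be the usual replacement.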
