Bi-Clustering (II): δ-pCluster
- Enumerating all bi-clusters (δ-pClusters) [H. Wang, et al., Clustering by pattern similarity in large data sets. SIGMOD’02]
- A submatrix I x J is a bi-cluster with (perfect) coherent values iff \(e_{i1j1} - e_{i2j1} = e_{i1j2} - e_{i2j2}\) for all rows i1, i2 in I and all columns j1, j2 in J. Accordingly, for any 2 x 2 submatrix of I x J, define the p-score as
\[\text{p-score} \begin{pmatrix} e_{i1j1} & e_{i1j2} \\ e_{i2j1} & e_{i2j2} \end{pmatrix} = |(e_{i1j1}-e_{i2j1})-(e_{i1j2}-e_{i2j2})|\]
- A submatrix I x J is a δ-pCluster (pattern-based cluster) if the p-score of every 2 x 2 submatrix of I x J is at most δ, where δ ≥ 0 is a threshold specifying a user's tolerance of noise against a perfect bi-cluster
- The p-score controls the noise on every element in a bi-cluster, while the mean squared residue captures the average noise
- Monotonicity: If I x J is a δ-pCluster, every x x y (x, y ≥ 2) submatrix of I x J is also a δ-pCluster.
- A δ-pCluster is maximal if no row or column can be added to it such that the result is still a δ-pCluster. By monotonicity, every other δ-pCluster is a submatrix of a maximal one, so we only need to compute the maximal δ-pClusters.
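
The definitions above can be checked by brute force on a small matrix. The following is a minimal sketch, not the enumeration algorithm of Wang et al.; the function names `p_score` and `is_delta_pcluster` and the example matrix `e` are illustrative assumptions. It evaluates the p-score of every 2 x 2 submatrix, so it is only practical for small I and J.

```python
from itertools import combinations

def p_score(e, i1, i2, j1, j2):
    """p-score of the 2 x 2 submatrix on rows (i1, i2) and columns (j1, j2)."""
    return abs((e[i1][j1] - e[i2][j1]) - (e[i1][j2] - e[i2][j2]))

def is_delta_pcluster(e, rows, cols, delta):
    """True if every 2 x 2 submatrix of rows x cols has p-score <= delta."""
    return all(
        p_score(e, i1, i2, j1, j2) <= delta
        for i1, i2 in combinations(rows, 2)
        for j1, j2 in combinations(cols, 2)
    )

# Hypothetical data: rows 0 and 1 differ by a constant shift (perfectly
# coherent); row 2 deviates by 1 in the last column.
e = [
    [1, 3, 6],
    [2, 4, 7],
    [5, 7, 11],
]

all_rows, all_cols = [0, 1, 2], [0, 1, 2]
print(is_delta_pcluster(e, all_rows, all_cols, delta=0))  # False
print(is_delta_pcluster(e, all_rows, all_cols, delta=1))  # True
print(is_delta_pcluster(e, [0, 1], all_cols, delta=0))    # True
```

With δ = 1 the whole matrix is a δ-pCluster, so by monotonicity every submatrix with at least two rows and two columns passes the same check.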