Representing Features

  • Similarity between tuples t1 and t2 w.r.t. categorical feature f
    • Cosine similarity between vectors f(t1) and f(t2)

\[\mathrm{sim}_f(t_1,t_2)=\frac{\sum_{k=1}^{L} f(t_1).P_k \cdot f(t_2).P_k}{\sqrt{\sum_{k=1}^{L} f(t_1).P_k^{2}} \cdot \sqrt{\sum_{k=1}^{L} f(t_2).P_k^{2}}}\]
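A minimal sketch of the formula above, assuming each f(t) is stored as a plain list of L numbers (the P_k entries). The function name `sim_f`, the zero-vector convention, and the sample vectors are illustrative assumptions, not part of the slide.

```python
import math

def sim_f(p1, p2):
    """Cosine similarity between two vectors f(t1) and f(t2) of length L."""
    if len(p1) != len(p2):
        raise ValueError("both vectors must have the same length L")
    # Numerator: sum over k of f(t1).P_k * f(t2).P_k
    dot = sum(a * b for a, b in zip(p1, p2))
    # Denominator: product of the two Euclidean norms
    norm1 = math.sqrt(sum(a * a for a in p1))
    norm2 = math.sqrt(sum(b * b for b in p2))
    if norm1 == 0.0 or norm2 == 0.0:
        # Assumption: an all-zero vector is treated as similar to nothing.
        return 0.0
    return dot / (norm1 * norm2)

# Hypothetical feature with L = 3 values
f_t1 = [1.0, 0.0, 0.0]
f_t2 = [0.5, 0.5, 0.0]
print(sim_f(f_t1, f_t1))
print(sim_f(f_t1, f_t2))
```

Identical vectors give a similarity of 1, and vectors that share no non-zero component give 0.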

  • The most important information carried by a feature f is how f groups tuples into clusters
  • f is represented by the similarities it induces between every pair of tuples
  • The horizontal axes are the tuple indices, and the vertical axis is the similarity
  • This representation can be viewed as a vector of N x N dimensions, where N is the number of tuples
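The pairwise representation described above can be sketched as follows, assuming N tuples whose feature vectors are stored as plain lists. The helper names `cosine` and `feature_as_vector`, the row-major ordering, and the sample data are hypothetical and chosen only for illustration.

```python
import math

def cosine(p1, p2):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p1, p2))
    norms = math.sqrt(sum(a * a for a in p1)) * math.sqrt(sum(b * b for b in p2))
    return dot / norms if norms else 0.0

def feature_as_vector(distributions):
    """Flatten all pairwise similarities into one list of length N * N (row-major)."""
    return [cosine(pi, pj) for pi in distributions for pj in distributions]

# Hypothetical data: N = 3 tuples, feature with L = 2 values.
# Tuples 0 and 2 take the same value, so the feature groups them together.
tuples = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
v = feature_as_vector(tuples)
print(len(v))  # N * N = 9 entries
print(v)
```

Entry i * N + j of the resulting list holds the similarity between tuples i and j, so two features can be compared by comparing their N x N vectors.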

