Zipf’s law

  • Heaps’ law gives the vocabulary size in collections.

  • We also study the relative frequencies of terms.

  • In natural language, there are a few very frequent terms and very many very rare terms.

  • Zipf’s law: The i-th most frequent term has frequency proportional to 1/i.

  • cf_i ∝ 1/i, i.e., cf_i = K/i, where K is a normalizing constant.

  • cf_i is the collection frequency: the number of occurrences of the term t_i in the collection.
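
To make the relationship concrete, here is a minimal Python sketch that ranks terms by collection frequency and compares each observed count with the prediction K/i. The toy token list, the function names `rank_frequencies` and `zipf_prediction`, and the choice of K as the frequency of the top-ranked term are all assumptions made for illustration; the slide does not specify how K is chosen.

```python
from collections import Counter


def rank_frequencies(tokens):
    """Return collection frequencies sorted from most to least frequent."""
    counts = Counter(tokens)
    return [cf for _, cf in counts.most_common()]


def zipf_prediction(k, num_ranks):
    """Return the predicted frequency K/i for ranks i = 1..num_ranks."""
    return [k / i for i in range(1, num_ranks + 1)]


# Toy collection (assumed for illustration, far too small for a real fit).
tokens = "the cat and the dog and the bird saw the cat".split()

observed = rank_frequencies(tokens)

# Assumption: pick K so that the prediction matches the top-ranked term exactly.
predicted = zipf_prediction(observed[0], len(observed))

for rank, (obs, pred) in enumerate(zip(observed, predicted), start=1):
    print(f"rank {rank}: observed cf = {obs}, predicted K/i = {pred:.2f}")
```

On a real collection, plotting log frequency against log rank would show roughly a straight line with slope -1 if the law holds, since cf = K/i implies log cf = log K - log i.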

