Document frequency, continued

  • Frequent terms are less informative than rare terms

  • Consider a query term that is frequent in the collection (e.g., high, increase, line)

  • A document containing such a term is more likely to be relevant than a document that doesn’t

  • But it’s not a sure indicator of relevance.

  • → For frequent terms like high, increase, and line, we want positive weights

  • But lower weights than for rare terms.

  • We will use document frequency (df) to capture this.
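
As a rough illustration of the idea above, the following sketch counts document frequency over a tiny made-up collection. The four documents, the whitespace tokenization, and the helper names are all hypothetical. The weight function log10(N/df) is only one common way to turn df into a weight that stays positive but shrinks for frequent terms; the slide itself does not fix a formula.

```python
import math
from collections import Counter

def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        # set() ensures a term is counted once per document,
        # no matter how often it occurs inside that document.
        df.update(set(doc.lower().split()))
    return df

def df_weight(term, df, n_docs):
    """Assumed weighting: log10(N / df). Lower for frequent terms."""
    return math.log10(n_docs / df[term])

# Hypothetical toy collection, for illustration only.
docs = [
    "sales increase along the high line",
    "high winds increase risk",
    "the line was high",
    "arachnocentric designs increase interest",
]

df = document_frequency(docs)
n_docs = len(docs)

for term in ["high", "increase", "line", "arachnocentric"]:
    print(term, df[term], round(df_weight(term, df, n_docs), 3))
```

In this toy collection the frequent terms high and increase each appear in three of the four documents and receive a small positive weight, while the rare term arachnocentric appears in one document and receives a larger one.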

