Frequent terms are less informative than rare terms
Consider a query term that is frequent in the collection (e.g., high, increase, line).
A document containing such a term is more likely to be relevant than a document that doesn’t.
But it’s not a sure indicator of relevance.
→ For frequent terms like high, increase, and line, we want positive weights,
but lower weights than for rare terms.
We will use document frequency (df) to capture this.
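As a rough illustration of what document frequency measures, the sketch below counts, for each term, how many documents contain it. The toy collection, the whitespace tokenization, and the function name `document_frequency` are assumptions made for this example, not part of the original slides.

```python
from collections import Counter


def document_frequency(docs):
    """Return a Counter mapping each term to the number of documents containing it."""
    df = Counter()
    for doc in docs:
        # Use a set so a term repeated inside one document is counted only once.
        df.update(set(doc.lower().split()))
    return df


# Hypothetical toy collection, invented for illustration only.
docs = [
    "high interest rates increase costs",
    "the line shows a high increase",
    "arachnocentric line of research",
    "high demand",
]

df = document_frequency(docs)
print(df["high"], df["increase"], df["line"], df["arachnocentric"])  # 3 2 2 1
```

In this toy collection, high appears in three of the four documents and the rare term in only one, so a weighting scheme based on df would give high the lower weight.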
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License