
Estimation – key challenge

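  • If non-relevant documents are approximated by the whole collection, then ri (the probability that term i occurs in non-relevant documents for the query) is n/N, where n is the number of documents containing the term and N is the collection size, so

    • log((1 - ri)/ri) = log((N - n)/n) ≈ log(N/n) = IDF, when n is much smaller than N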

  • pi (probability of occurrence in relevant documents) can be estimated in various ways:

    • from relevant documents, if some are known

      • Relevance weighting can be used in feedback loop

    • as a constant (Croft and Harper combination match); for example, with pi = 0.5 the pi term vanishes and only IDF weighting of terms remains

    • proportional to prob. of occurrence in collection

      • more accurately, to log of this (Greiff, SIGIR 1998) 
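
The estimates above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the slides: the function names, the example collection size, and the add-0.5 smoothing in the feedback estimate are all assumptions chosen for the example.

```python
import math


def idf_weight(N, n):
    """Approximate non-relevance weight log(N / n) for a term in n of N documents."""
    return math.log(N / n)


def term_weight(p, N, n):
    """Term weight log(p / (1 - p)) + log((1 - r) / r), with r estimated as n / N."""
    r = n / N
    return math.log(p / (1 - p)) + math.log((1 - r) / r)


def p_from_feedback(num_relevant, num_relevant_with_term):
    """Estimate p from known relevant documents.

    The add-0.5 smoothing is an assumption of this sketch; it keeps p
    strictly between 0 and 1 so the log-odds stay finite.
    """
    return (num_relevant_with_term + 0.5) / (num_relevant + 1.0)


# Hypothetical collection: one million documents, term occurs in one thousand.
N, n = 1_000_000, 1_000
exact = math.log((N - n) / n)
approx = idf_weight(N, n)
constant_p = term_weight(0.5, N, n)                    # p term vanishes
feedback_p = term_weight(p_from_feedback(10, 4), N, n)  # 4 of 10 relevant docs contain the term
```

With a constant p of 0.5 the first logarithm is zero, so `constant_p` equals the exact non-relevance weight, which in turn is close to the IDF approximation when n is small relative to N.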
