Using Language Models in IR

  • Treat each document as the basis for a model (e.g., unigram sufficient statistics, i.e., its term counts)

  • Rank document d based on P(d | q)

  • P(d | q) = P(q | d) × P(d) / P(q)   (Bayes' rule)

    • P(q) is the same for all documents, so it can be ignored when ranking

    • P(d) [the prior] is often treated as the same for all d

      • But we could use criteria like authority, length, genre

    • P(q | d) is the probability of q given d’s model

  • Very general formal approach
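
The ranking rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method from the slide: the function names (unigram_model, log_query_likelihood, rank), the whitespace tokenizer, the toy documents, and the add-one smoothing are all assumptions made here. Smoothing is included only so that a query term missing from a document does not drive P(q | d) to zero. Scores are log P(q | d) + log P(d); P(q) is dropped because it is the same for every document.

```python
import math
from collections import Counter


def unigram_model(text):
    """Unigram sufficient statistics for one document: term counts and length."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    return Counter(tokens), len(tokens)


def log_query_likelihood(query, model, vocab_size):
    """log P(q | d) under the document's unigram model.

    Add-one smoothing is an assumption of this sketch, not part of the slide.
    """
    counts, length = model
    score = 0.0
    for term in query.lower().split():
        score += math.log((counts[term] + 1) / (length + vocab_size))
    return score


def rank(query, docs, log_prior=None):
    """Order document names by log P(q | d) + log P(d), best first.

    With log_prior=None the prior is uniform, so it does not affect the order.
    """
    models = {name: unigram_model(text) for name, text in docs.items()}
    vocab = set()
    for counts, _ in models.values():
        vocab.update(counts)
    scores = {}
    for name, model in models.items():
        prior = log_prior[name] if log_prior else 0.0
        scores[name] = log_query_likelihood(query, model, len(vocab)) + prior
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy collection, for illustration only.
docs = {
    "d1": "language models for information retrieval",
    "d2": "cooking recipes for pasta and pizza",
}
ranking = rank("language models", docs)
print(ranking)
```

Passing a log_prior dictionary (for example, one derived from authority, length, or genre) shifts each document's score by log P(d), which is how a non-uniform prior would enter the ranking.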

