Evaluation of relevance feedback strategies

  • Use the original query q0 and compute a precision-recall graph

  • Use the modified query qm (after feedback) and compute a precision-recall graph

    • Assess on all documents in the collection

        • Spectacular improvements, but … it’s cheating!

        • Partly because documents already known to be relevant are ranked higher

        • Must evaluate with respect to documents not seen by the user

    • Use documents in residual collection (set of documents minus those assessed relevant)

        • Measures are then usually lower than for the original query

        • But a more realistic evaluation

        • Relative performance can be validly compared

  • Empirically, one round of relevance feedback is often very useful. A second round is sometimes marginally useful.
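
The difference between evaluating on the full collection and on the residual collection can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the document IDs, both rankings, the cutoff k = 4, and the helper names precision_recall_at_k and residual are all hypothetical, and the residual collection follows the definition above (all documents minus those the user assessed as relevant).

```python
def precision_recall_at_k(ranking, relevant, k):
    """Precision and recall over the top-k documents of a ranking."""
    top = ranking[:k]
    hits = sum(1 for doc in top if doc in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def residual(ranking, relevant, judged_relevant):
    """Remove the documents the user assessed as relevant during feedback."""
    res_ranking = [doc for doc in ranking if doc not in judged_relevant]
    res_relevant = relevant - judged_relevant
    return res_ranking, res_relevant


# Hypothetical toy data: d1..d4 are the truly relevant documents.
relevant = {"d1", "d2", "d3", "d4"}
ranking_q0 = ["d1", "d5", "d2", "d6", "d3", "d7", "d4", "d8"]  # original query
ranking_qm = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]  # after feedback
judged_relevant = {"d1", "d2"}  # marked relevant by the user in the feedback round

k = 4

# Full collection: qm is rewarded for re-ranking d1 and d2, which it was told about.
full_q0 = precision_recall_at_k(ranking_q0, relevant, k)
full_qm = precision_recall_at_k(ranking_qm, relevant, k)

# Residual collection: both rankings are scored without the assessed documents.
res_q0 = precision_recall_at_k(*residual(ranking_q0, relevant, judged_relevant), k)
res_qm = precision_recall_at_k(*residual(ranking_qm, relevant, judged_relevant), k)

print("full collection     q0:", full_q0, " qm:", full_qm)
print("residual collection q0:", res_q0, " qm:", res_qm)
```

In this toy data, qm's precision at 4 falls from 1.0 on the full collection to 0.5 on the residual collection, while q0 scores 0.25 on the residual collection. The gain from feedback is still visible, but it is no longer inflated by documents the user had already assessed.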

