  • Graphs are good, but people want summary measures!

    • Precision at fixed retrieval level

      • Precision-at-k: Precision of top k results

      • Perhaps appropriate for most of web search: all people want are good matches on the first one or two results pages

      • But: averages badly across queries (the number of relevant documents for a query strongly affects precision at k) and depends on an arbitrary choice of k

    • 11-point interpolated average precision

      • The standard measure in the early TREC competitions: take the precision at 11 recall levels (0.0, 0.1, ..., 1.0), using interpolation (the value for recall 0 is always interpolated!), and average them

      • Evaluates performance at all recall levels
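A minimal sketch of both summary measures above, for a single query. The function names, the list-of-document-ids input format, and the example data are assumptions made for illustration and do not come from TREC tooling. Interpolated precision at a recall level is taken here as the highest precision observed at any recall at or above that level, which is the usual convention.

```python
from typing import Hashable, Sequence, Set


def precision_at_k(ranking: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top k results that are relevant.

    Divides by k even if fewer than k results were returned (an assumed
    convention: missing results count as non-relevant).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / k


def eleven_point_interpolated_ap(ranking: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average of interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    n_rel = len(relevant)
    if n_rel == 0:
        raise ValueError("need at least one relevant document")

    # (relevant documents seen so far, precision) after each rank position
    points = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
        points.append((hits, hits / rank))

    total = 0.0
    for i in range(11):
        # recall >= i/10 is tested in integers (hits * 10 >= i * n_rel)
        # to avoid floating-point comparison problems
        reachable = [p for h, p in points if h * 10 >= i * n_rel]
        # a recall level the ranking never reaches contributes 0
        total += max(reachable, default=0.0)
    return total / 11


# Hypothetical ranking for one query: d1, d3 and d5 are the relevant documents.
ranking = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d5"}
p_at_3 = precision_at_k(ranking, relevant, 3)
ap_11 = eleven_point_interpolated_ap(ranking, relevant)
```

In this example the raw precision at recall 1/3 drops from 1.0 to 0.5 between ranks 1 and 2, but interpolation keeps the value 1.0 for every recall level up to 0.3, including recall 0, where no precision is actually measured. To report a result for a system, the per-query values would then be averaged over all queries.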

