General issues in spell correction

  • We enumerate multiple alternatives for “Did you mean?”

  • Need to figure out which to present to the user

    • The alternative that hits the most docs

    • Query log analysis

  • More generally, rank alternatives probabilistically

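    • argmax_corr P(corr | query)

    • By Bayes' rule, and because P(query) is the same for every candidate, this is equivalent to
      argmax_corr P(query | corr) * P(corr)

      where P(query | corr) is the noisy channel model and P(corr) is the language model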
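The probabilistic ranking above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method given in the slides: the channel model P(query | corr) is approximated as error_rate raised to the edit distance, the language model P(corr) is a unigram estimate from counts, and the names rank_corrections, candidate_counts, error_rate, and the toy counts are all hypothetical.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rank_corrections(query, candidate_counts, error_rate=0.1):
    """Rank candidates by log P(query | corr) + log P(corr), best first.

    Assumptions (not from the source): each edit costs a factor of
    error_rate in the channel model, and P(corr) is count / total.
    All counts must be positive, since we take their logarithm.
    """
    total = sum(candidate_counts.values())
    scored = []
    for corr, count in candidate_counts.items():
        log_channel = edit_distance(query, corr) * math.log(error_rate)
        log_prior = math.log(count / total)
        scored.append((log_channel + log_prior, corr))
    scored.sort(reverse=True)
    return [corr for _, corr in scored]


# Hypothetical counts, e.g. from a query log or the document collection.
counts = {"thew": 1, "the": 500, "thaw": 5, "threw": 20}
ranking = rank_corrections("thew", counts)
print(ranking[0])
```

With these toy counts the rare but exactly matching "thew" is outranked by the frequent "the", which shows how the language model term can outweigh a small channel penalty. In practice both models would be estimated from real data such as query logs.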

