Insufficient data

  • Zero probabilities spell disaster

    • We need to smooth probabilities

      • Discount nonzero probabilities

      • Give some probability mass to unseen things

  • There’s a wide space of approaches to smoothing probability distributions to deal with this problem, such as adding 1, ½ or ε to counts, Dirichlet priors, discounting, and interpolation

    • [See FSNLP ch. 6 or CS224N if you want more]

  • A simple idea that works well in practice is to use a mixture of the document multinomial distribution and the collection multinomial distribution
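
The mixture idea in the last bullet can be sketched in a few lines. This is only an illustration under stated assumptions: the function name mixture_prob, the mixing weight lam (set to 0.5 here for no particular reason), and the toy token lists are all hypothetical and not taken from the slide. The estimate is lam * P(term | document) + (1 - lam) * P(term | collection), with both parts being plain maximum-likelihood (relative frequency) estimates.

```python
from collections import Counter


def mixture_prob(term, doc_tokens, collection_tokens, lam=0.5):
    """Linear mixture of document and collection unigram estimates.

    lam is an assumed mixing weight in [0, 1]; the slide does not
    specify a value.
    """
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(collection_tokens)
    # Maximum-likelihood (relative frequency) estimates; guard against empty input.
    p_doc = doc_counts[term] / len(doc_tokens) if doc_tokens else 0.0
    p_coll = coll_counts[term] / len(collection_tokens) if collection_tokens else 0.0
    return lam * p_doc + (1 - lam) * p_coll


# Toy data (hypothetical): one document plus the rest of a tiny collection.
doc = "click go the shears boys click click click".split()
other = "metal shears click here".split()
collection = doc + other

# "metal" never occurs in doc, so the unsmoothed document estimate is 0,
# but the mixture still assigns it some probability mass.
p_unseen = mixture_prob("metal", doc, collection)
p_seen = mixture_prob("click", doc, collection)
```

With lam below 1, every term that occurs anywhere in the collection receives a nonzero probability under every document model, and terms seen in the document are discounted relative to their raw relative frequency, which matches the two smoothing goals listed above.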

