kNN vs. Naive Bayes

  • Bias/Variance tradeoff

    • Variance ≈ Capacity

  • kNN has high variance and low bias.

    • Stores every training example (effectively unbounded memory)

  • NB has low variance and high bias.

    • Decision surface has to be linear (hyperplane – see later)

  • Consider asking a botanist: Is an object a tree?

    • Too much capacity/variance, low bias

      • Botanist who memorizes

      • Always says “no” to a new object (e.g., one with a different number of leaves)

    • Not enough capacity/variance, high bias

      • Lazy botanist

      • Says “yes” if the object is green

    • You want the middle ground

  (Example due to C. Burges)
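
The capacity gap between the two classifiers can be seen on a toy problem. The sketch below is illustrative only and not from the slide: it assumes binary features, a 1-nearest-neighbor rule with Hamming distance, and a Bernoulli Naive Bayes with Laplace smoothing, all written from scratch. The XOR data set and the helper names (`knn_predict`, `nb_fit`, `nb_predict`) are hypothetical.

```python
import math

# Toy XOR data (hypothetical): the label is 1 iff the two bits differ.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]


def knn_predict(x, X_train, y_train):
    """1-NN: return the label of the closest stored point (Hamming distance)."""
    def dist(a, b):
        return sum(ai != bi for ai, bi in zip(a, b))
    nearest = min(range(len(X_train)), key=lambda i: dist(x, X_train[i]))
    return y_train[nearest]


def nb_fit(X_train, y_train):
    """Bernoulli NB: per-class prior and smoothed P(feature = 1 | class)."""
    model = {}
    n_features = len(X_train[0])
    for c in sorted(set(y_train)):
        rows = [x for x, label in zip(X_train, y_train) if label == c]
        prior = len(rows) / len(X_train)
        # Laplace smoothing: add 1 to the count, 2 to the total.
        p_one = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_features)]
        model[c] = (prior, p_one)
    return model


def nb_predict(x, model):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    def log_score(c):
        prior, p_one = model[c]
        return math.log(prior) + sum(
            math.log(p if xj == 1 else 1 - p) for xj, p in zip(x, p_one))
    return max(model, key=log_score)


def accuracy(preds, labels):
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)


model = nb_fit(X, y)
knn_acc = accuracy([knn_predict(x, X, y) for x in X], y)
nb_acc = accuracy([nb_predict(x, model) for x in X], y)
print(f"1-NN training accuracy: {knn_acc:.2f}")
print(f"NB   training accuracy: {nb_acc:.2f}")
```

1-NN reproduces every training label because it simply looks each point up (high capacity, low bias). Naive Bayes cannot fit XOR: taken one feature at a time, the two classes have identical statistics, so its training accuracy stays below 1.0 no matter how much data it sees (high bias). The flip side, not shown here, is that the memorizing 1-NN will also reproduce any label noise in the training set, which is where its high variance comes from.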

