Classification Is to Derive the Maximum Posteriori

  • Let D be a training set of tuples and their associated class labels, where each tuple is represented by an n-dimensional attribute vector X = (x1, x2, …, xn)
  • Suppose there are m classes C1, C2, …, Cm.
  • Classification is to derive the maximum posteriori, i.e., to find the class Ci with the maximal posterior probability P(Ci|X)
  • This can be derived from Bayes’ theorem:

        P(Ci|X) = P(X|Ci) P(Ci) / P(X)
  • Since P(X) is constant for all classes, only P(X|Ci) P(Ci) needs to be maximized
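
The decision rule above can be illustrated with a minimal Python sketch. The class labels, priors P(Ci), and likelihoods P(X|Ci) below are made-up numbers for illustration only, and the function names map_classify and posteriors are hypothetical, not taken from the slides. The sketch picks the class that maximizes P(X|Ci) P(Ci), then normalizes by P(X) to show that dividing by a constant does not change the winner.

```python
def map_classify(priors, likelihoods):
    """Return the class maximizing P(X|Ci) * P(Ci), plus all unnormalized scores.

    priors:      dict mapping class label -> P(Ci)
    likelihoods: dict mapping class label -> P(X|Ci) for one observed tuple X
    """
    scores = {c: likelihoods[c] * priors[c] for c in priors}
    best = max(scores, key=scores.get)
    return best, scores


def posteriors(scores):
    """Normalize the scores by P(X) = sum over i of P(X|Ci) * P(Ci)."""
    evidence = sum(scores.values())
    return {c: s / evidence for c, s in scores.items()}


# Hypothetical values for a two-class problem (assumed, not from the source).
priors = {"C1": 0.6, "C2": 0.4}
likelihoods = {"C1": 0.1, "C2": 0.3}

best, scores = map_classify(priors, likelihoods)
post = posteriors(scores)

print("predicted class:", best)
print("posteriors:", post)
```

With these assumed numbers, C2 is selected even though C1 has the larger prior, because the likelihood of X under C2 outweighs the difference in priors. The posteriors are computed only to confirm that the ranking matches the unnormalized scores.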

