Overfitting and Tree Pruning

  • Overfitting: An induced tree may overfit the training data
    • Too many branches, some may reflect anomalies due to noise or outliers
    • Poor accuracy for unseen samples
  • Two approaches to avoid overfitting
    • Prepruning: Halt tree construction early; do not split a node if this would result in the goodness measure falling below a threshold
      • Difficult to choose an appropriate threshold
    • Postpruning: Remove branches from a “fully grown” tree to obtain a sequence of progressively pruned trees
      • Use a set of data different from the training data to decide which is the “best pruned tree”
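The two approaches above can be illustrated with a minimal sketch in plain Python. It is not the slide author's implementation: the function names (`build`, `prune`, `predict`, `n_leaves`), the `min_gain` parameter, and the toy data are all hypothetical. Information gain is assumed as the goodness measure, and postpruning is shown as reduced-error pruning, a simple variant that collapses a subtree whenever doing so does not increase the error on a separate validation set.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build(X, y, min_gain=0.0):
    """Grow a tree; prepruning: do not split if the best gain is below min_gain."""
    node = {"label": majority(y)}          # every node remembers its majority class
    if len(set(y)) == 1:
        return node                        # pure node, nothing to split
    base, best = entropy(y), None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X})[:-1]:   # candidate thresholds
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            split = (len(left) * entropy([y[i] for i in left])
                     + len(right) * entropy([y[i] for i in right])) / len(y)
            gain = base - split
            if best is None or gain > best[0]:
                best = (gain, f, t, left, right)
    if best is None or best[0] < min_gain:
        return node                        # prepruning: halt construction here
    _, f, t, left, right = best
    node.update(
        feat=f, thr=t,
        left=build([X[i] for i in left], [y[i] for i in left], min_gain),
        right=build([X[i] for i in right], [y[i] for i in right], min_gain),
    )
    return node

def predict(node, row):
    while "feat" in node:
        node = node["left"] if row[node["feat"]] <= node["thr"] else node["right"]
    return node["label"]

def prune(node, X_val, y_val):
    """Reduced-error postpruning, bottom-up, using held-out validation data.
    Note: modifies the tree passed in."""
    if "feat" not in node:
        return node
    f, t = node["feat"], node["thr"]
    li = [i for i, r in enumerate(X_val) if r[f] <= t]
    ri = [i for i, r in enumerate(X_val) if r[f] > t]
    node["left"] = prune(node["left"], [X_val[i] for i in li], [y_val[i] for i in li])
    node["right"] = prune(node["right"], [X_val[i] for i in ri], [y_val[i] for i in ri])
    subtree_errors = sum(predict(node, r) != c for r, c in zip(X_val, y_val))
    leaf_errors = sum(node["label"] != c for c in y_val)
    if leaf_errors <= subtree_errors:      # the branch does not help on unseen data
        return {"label": node["label"]}
    return node

def n_leaves(node):
    return 1 if "feat" not in node else n_leaves(node["left"]) + n_leaves(node["right"])

# Toy data (made up): the class is 1 when x > 5, except for one noisy sample at x = 3.
X_train = [[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]]
y_train = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
X_val = [[1.5], [2.5], [3.5], [4.5], [6.5], [8.5]]
y_val = [0, 0, 0, 0, 1, 1]

full = build(X_train, y_train)                    # fully grown, fits the noise
prepruned = build(X_train, y_train, min_gain=0.3) # threshold chosen by hand
pruned = prune(build(X_train, y_train), X_val, y_val)

print(n_leaves(full), n_leaves(prepruned), n_leaves(pruned))  # 4 2 2
print(predict(full, [2.5]), predict(pruned, [2.5]))           # 1 0
```

On this toy data the fully grown tree carves out a branch for the noisy sample and misclassifies the unseen point x = 2.5, while both pruned trees do not. The `min_gain` value of 0.3 happens to work here; a somewhat larger value would stop the useful root split as well, which is the threshold-selection difficulty the slide mentions.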
