Avoiding Overfitting the Data

  • Definition: Given a hypothesis space H, a hypothesis h ∈ H is said to overfit the training data if there exists some alternative hypothesis h’ ∈ H, such that h has a smaller error than h’ over the training examples, but h’ has a smaller error than h over the entire distribution of instances. [1]
  • ID3 grows each branch of the tree just deeply enough to perfectly classify the training examples.
    • This can lead to difficulties when there is noise in the data or when the number of training examples is too small to produce a representative sample of the true target function.
    • In either case, ID3 can produce trees that overfit the training examples.
  
 
  • Impact of overfitting in a typical application of decision tree learning [1]
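
The definition above can be checked on a toy example. The following is a minimal sketch, not ID3 itself: the target function, the instance distribution, the training sample with one flipped label, and the two hand-written hypotheses `h` and `h_prime` are all made-up assumptions chosen for illustration. `h` stands in for a fully grown tree that adds a branch to fit the noisy example, while `h_prime` is a simpler single-threshold hypothesis.

```python
# Toy illustration of the overfitting definition (all data here is made up).

def target(x):
    """Assumed true target function: positive iff x >= 10."""
    return int(x >= 10)

# Assumed instance distribution: the integers 0..19, each equally likely.
INSTANCES = list(range(20))

# Small training sample; the label of x = 7 is flipped to simulate noise.
TRAIN = [(2, 0), (5, 0), (7, 1), (12, 1), (15, 1), (18, 1)]

def h(x):
    """Stand-in for a fully grown tree: extra branch fits the noisy example."""
    return int(x >= 10 or 6 <= x <= 8)

def h_prime(x):
    """Simpler alternative hypothesis: a single threshold test."""
    return int(x >= 10)

def training_error(hyp):
    """Fraction of training examples that hyp misclassifies."""
    return sum(hyp(x) != y for x, y in TRAIN) / len(TRAIN)

def true_error(hyp):
    """Fraction of all instances on which hyp disagrees with the target."""
    return sum(hyp(x) != target(x) for x in INSTANCES) / len(INSTANCES)

def overfits(hyp, alt):
    """True if hyp overfits relative to alt, following the definition."""
    return (training_error(hyp) < training_error(alt)
            and true_error(alt) < true_error(hyp))

print("training error:", training_error(h), training_error(h_prime))
print("true error:    ", true_error(h), true_error(h_prime))
print("h overfits relative to h_prime:", overfits(h, h_prime))
```

In this sketch `h` classifies every training example correctly, yet it misclassifies the instances 6, 7 and 8 under the assumed distribution, whereas `h_prime` makes one training error (the noisy example) and none over the distribution. Both conditions of the definition therefore hold for `h`.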
