Summary

  • Classification is a form of data analysis that extracts models describing important data classes.
  • Effective and scalable methods have been developed for decision tree induction, Naive Bayesian classification, rule-based classification, and many other classification methods.
  • Evaluation metrics include accuracy, sensitivity, specificity, precision, recall, the F measure, and the Fβ measure.
  • Stratified k-fold cross-validation is recommended for accuracy estimation. Bagging and boosting can be used to increase overall accuracy by learning and combining a series of individual models.
  • Significance tests and ROC curves are useful for model selection.
  • There have been numerous comparisons of the different classification methods; which method performs best remains a research topic.
  • No single method has been found to be superior to all others for all data sets.
  • Issues such as accuracy, training time, robustness, scalability, and interpretability must be considered and can involve trade-offs, further complicating the quest for an overall superior method.
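
The evaluation metrics listed above can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch, not a reference implementation: the function name classification_metrics, the helper _safe_div, the choice to return 0.0 when a denominator is zero, and the example counts are all assumptions made for illustration.

```python
def _safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fn, fp, tn, beta=1.0):
    """Compute common metrics from binary confusion-matrix counts.

    tp, fn, fp, tn: true positives, false negatives, false positives,
    true negatives. beta weights recall relative to precision in F-beta;
    beta=1 gives the ordinary F measure (harmonic mean of the two).
    """
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)  # identical to sensitivity
    b2 = beta ** 2
    return {
        "accuracy": _safe_div(tp + tn, tp + fn + fp + tn),
        "sensitivity": recall,                      # true positive rate
        "specificity": _safe_div(tn, tn + fp),      # true negative rate
        "precision": precision,
        "recall": recall,
        "f_beta": _safe_div((1 + b2) * precision * recall,
                            b2 * precision + recall),
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    for name, value in classification_metrics(tp=40, fn=10, fp=20, tn=30).items():
        print(f"{name}: {value:.4f}")
```

Note that sensitivity and recall are the same quantity under two names, and that a beta above 1 weights recall more heavily than precision, while a beta below 1 does the opposite.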

