Classification—A Two-Step Process

  • Model construction: describing a set of predetermined classes
    • Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute
    • The set of tuples used for model construction is the training set
    • The model is represented as classification rules, decision trees, or mathematical formulae
  • Model usage: for classifying future or unknown objects
    • Estimate accuracy of the model
      • The known label of each test sample is compared with the label predicted by the model
      • Accuracy rate is the percentage of test set samples that are correctly classified by the model
      • Test set must be independent of the training set (otherwise the accuracy estimate is overly optimistic and overfitting goes undetected)
    • If the accuracy is acceptable, use the model to classify new data
  • Note: If the test set is used to select among models, it is called a validation (test) set
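
The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method from the slide: the single-attribute threshold rule, the toy data, the function names (train_stump, classify, accuracy) and the acceptance threshold ACCEPTABLE are all assumptions made for the example.

```python
def train_stump(samples):
    """Step 1, model construction: learn a rule 'x >= threshold means class True'
    by picking the threshold that classifies the most training samples correctly."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({x for x, _ in samples}):
        correct = sum((x >= threshold) == label for x, label in samples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def classify(threshold, x):
    """Apply the learned rule to a single attribute value."""
    return x >= threshold


def accuracy(threshold, samples):
    """Fraction of samples whose known label matches the predicted label."""
    correct = sum(classify(threshold, x) == label for x, label in samples)
    return correct / len(samples)


# Toy data (hypothetical): (attribute value, class label).
training_set = [(1.0, False), (2.0, False), (3.0, False),
                (6.0, True), (7.0, True), (8.0, True)]
# The test set is kept separate from the training set.
test_set = [(0.5, False), (2.5, False), (5.0, True), (9.0, True)]

# Step 1: model construction on the training set.
model = train_stump(training_set)

# Step 2: model usage, starting with an accuracy estimate on the test set.
train_accuracy = accuracy(model, training_set)
test_accuracy = accuracy(model, test_set)

ACCEPTABLE = 0.7  # assumed acceptance level, chosen only for this example
if test_accuracy >= ACCEPTABLE:
    new_prediction = classify(model, 6.5)  # classify a new, unlabeled object
```

On this toy data the rule fits the training set perfectly but misclassifies one of the four test samples, which is why accuracy is estimated on samples that were not used to build the model.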
