Evaluating Categorization

  • Evaluation must be done on test data that are independent of the training data (usually a disjoint set of instances).

    • Cross-validation is sometimes used: results are averaged over multiple training/test splits of the overall data.

  • It’s easy to get good performance on a test set that was available to the learner during training (e.g., just memorize the test set).

  • Measures: precision, recall, F1, classification accuracy

  • Classification accuracy: c/n where n is the total number of test instances and c is the number of test instances correctly classified by the system.

    • Adequate if each document belongs to exactly one class

    • Otherwise, report the F measure for each class
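
The measures above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `accuracy` and `per_class_prf`, the label names, and the toy data are all hypothetical. It assumes single-label data is given as two parallel lists of labels, and multi-label data as two parallel lists of label sets.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Classification accuracy c/n for single-label data."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equally long")
    # c = number of test instances whose predicted label matches the gold label
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def per_class_prf(gold_sets, predicted_sets):
    """Precision, recall and F1 per class for multi-label data.

    Each element of gold_sets / predicted_sets is the set of classes
    assigned to one test document.
    """
    if len(gold_sets) != len(predicted_sets):
        raise ValueError("gold_sets and predicted_sets must be equally long")
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(gold_sets, predicted_sets):
        for label in gold & pred:
            tp[label] += 1  # predicted and correct
        for label in pred - gold:
            fp[label] += 1  # predicted but wrong
        for label in gold - pred:
            fn[label] += 1  # missed by the system
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Toy single-label example: 3 of 4 test documents are classified correctly.
single_acc = accuracy(
    ["sports", "politics", "sports", "tech"],
    ["sports", "sports", "sports", "tech"],
)

# Toy multi-label example: accuracy is less informative here, so report F1 per class.
multi_scores = per_class_prf(
    [{"sports"}, {"politics", "tech"}, {"sports", "tech"}],
    [{"sports"}, {"politics"}, {"tech"}],
)
```

In the multi-label example, "sports" and "tech" are each predicted correctly once and missed once, so both have precision 1.0 and recall 0.5.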

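Cross-validation, as mentioned above, can be sketched as follows. This is a minimal illustration under stated assumptions: `k_fold_splits`, `train_majority`, and `cross_validate` are hypothetical names, and the majority-class "learner" is only a stand-in so the example runs on its own. The point is that each fold's test instances are never seen during that fold's training.

```python
import random
from collections import Counter


def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k disjoint folds."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training data is everything outside the held-out fold.
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


def train_majority(train_labels):
    """Stand-in learner (assumption): always predicts the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda instance: majority


def cross_validate(instances, labels, k=5, seed=0):
    """Average classification accuracy over k train/test splits."""
    accuracies = []
    for train_idx, test_idx in k_fold_splits(len(instances), k, seed):
        classify = train_majority([labels[i] for i in train_idx])
        correct = sum(classify(instances[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Toy data: with a single class, the majority baseline is always right.
mean_acc = cross_validate(list(range(10)), ["a"] * 10, k=5)
```

A real experiment would replace `train_majority` with the actual learner; the splitting and averaging logic stays the same.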