Active Learning

  • Class labels are expensive to obtain
  • Active learner: queries a human (oracle) for labels
  • Pool-based approach: Uses a pool of unlabeled data
    • L: a small labeled subset of D; U: the pool of unlabeled data in D
    • Use a query function to carefully select one or more tuples from U and request labels from an oracle (a human annotator)
    • The newly labeled samples are added to L, and a model is learned from L
    • Goal: Achieve high accuracy using as few labeled tuples as possible
  • Evaluated using learning curves: Accuracy as a function of the number of instances queried (the number of tuples queried should be small)
  • Research issue: How to choose the data tuples to be queried?
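    • Uncertainty sampling: choose the tuples the current model is least certain how to label
    • Version space reduction: choose tuples that shrink the version space, the subset of hypotheses consistent with the training data
    • Expected error reduction: choose the tuples that give the greatest reduction in the total number of incorrect predictions, e.g., by reducing the expected entropy over U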
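
The pool-based loop above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a reference implementation: the data are the integers 0..199, the oracle is simulated by a hidden threshold (74), and the model is a 1-D threshold classifier. All function names (oracle, learn, predict, accuracy, active_learn) are hypothetical. In this setting the version space is the interval between the largest labeled negative and the smallest labeled positive, so uncertainty sampling (querying the tuple closest to the current threshold) also shrinks the version space.

```python
def oracle(x, hidden_threshold=74):
    """Simulated human annotator (assumption): returns the true label of x."""
    return 1 if x >= hidden_threshold else 0


def learn(L):
    """Learn a threshold model from the labeled set L, a dict {x: label}.

    Every threshold in (lo, hi] is consistent with L (the version space);
    the model simply picks the midpoint of that interval.
    """
    lo = max(x for x, y in L.items() if y == 0)  # largest labeled negative
    hi = min(x for x, y in L.items() if y == 1)  # smallest labeled positive
    return (lo + hi) / 2


def predict(threshold, x):
    return 1 if x >= threshold else 0


def accuracy(threshold, D):
    # Simulation shortcut: scored against the oracle on all of D.
    # In practice this would be a separate labeled test set.
    return sum(predict(threshold, x) == oracle(x) for x in D) / len(D)


def active_learn(D, seed, budget):
    """Pool-based active learning with uncertainty sampling."""
    L = {x: oracle(x) for x in seed}      # small initial labeled set
    U = [x for x in D if x not in L]      # pool of unlabeled tuples
    model = learn(L)
    curve = [(0, accuracy(model, D))]     # learning curve: (# queried, accuracy)
    for n in range(1, budget + 1):
        # Query function: the tuple closest to the decision threshold
        # is the one the current model is least certain about.
        x = min(U, key=lambda u: abs(u - model))
        U.remove(x)
        L[x] = oracle(x)                  # request the label from the oracle
        model = learn(L)                  # relearn on the enlarged L
        curve.append((n, accuracy(model, D)))
    return model, L, U, curve


D = list(range(200))
# The seed must contain at least one tuple of each class.
model, L, U, curve = active_learn(D, seed=[0, 199], budget=10)
for n, acc in curve:
    print(n, acc)
```

In this toy run the learner starts at 0.87 accuracy with two seed labels and stays at 1.0 from the eighth query onward, so 10 of the 200 tuples are enough. The curve is not monotone along the way, which is typical of learning curves on small labeled sets.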
