Classification of Class-Imbalanced Data Sets

  • Class-imbalance problem: rare positive examples but numerous negative ones, e.g., medical diagnosis, fraud detection, oil-spill detection, fault detection, etc.
  • Traditional methods assume a balanced distribution of classes and equal error costs, so they are not suitable for class-imbalanced data
  • Typical methods for imbalanced data in 2-class classification:
    • Oversampling: re-sample data from the positive class
    • Under-sampling: randomly eliminate tuples from the negative class
    • Threshold-moving: move the decision threshold, t, so that rare-class tuples are easier to classify, and hence there is less chance of costly false-negative errors
    • Ensemble techniques: combine multiple classifiers, such as those built with the methods above
  • The class-imbalance problem is still difficult on multiclass tasks
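
The resampling and threshold-moving methods listed above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not a reference implementation: the function names, the convention that label 1 marks the positive (rare) class, and the choice to balance the two classes exactly are all assumptions made for the example.

```python
import random


def random_oversample(X, y, positive=1, seed=0):
    """Duplicate randomly chosen positive tuples until both classes are equal in size.

    Assumes at least one positive tuple exists (hypothetical helper).
    """
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == positive]
    neg = [i for i, label in enumerate(y) if label != positive]
    if not pos:
        raise ValueError("no positive tuples to re-sample")
    # Draw with replacement from the positive class to fill the gap.
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    idx = pos + neg + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def random_undersample(X, y, positive=1, seed=0):
    """Randomly drop negative tuples until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == positive]
    neg = [i for i, label in enumerate(y) if label != positive]
    # Keep a random subset of negatives, without replacement.
    kept = rng.sample(neg, min(len(pos), len(neg)))
    idx = pos + kept
    return [X[i] for i in idx], [y[i] for i in idx]


def predict_with_threshold(scores, t=0.5):
    """Label a tuple positive when its score (e.g. an estimated probability) is at least t.

    Lowering t makes positive predictions more likely, trading false negatives
    for false positives.
    """
    return [1 if s >= t else 0 for s in scores]


# Toy data: 2 positive tuples, 8 negative tuples.
X = [[i] for i in range(10)]
y = [1, 1] + [0] * 8

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
labels_default = predict_with_threshold([0.2, 0.4, 0.6], t=0.5)
labels_lowered = predict_with_threshold([0.2, 0.4, 0.6], t=0.3)
```

Oversampling keeps every original tuple but repeats positive ones, while under-sampling discards negative tuples and so loses information. Threshold-moving leaves the training data untouched and only changes how the classifier's scores are turned into labels.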
