Comparing Attribute Selection Measures

  • The three measures, in general, return good results, but each has its own bias:
    • Information gain:
      • biased towards multivalued attributes
    • Gain ratio:
      • tends to prefer unbalanced splits in which one partition is much smaller than the others
    • Gini index:
      • biased towards multivalued attributes
      • has difficulty when the number of classes is large
      • tends to favor tests that result in equal-sized partitions and purity in both partitions
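
The biases listed above can be seen on a small example. The following is a minimal sketch, not a reference implementation: the helper names (`entropy`, `gini`, `info_gain`, `gain_ratio`, `gini_reduction`) and the toy data (`labels`, `row_id`, `flag`) are made up for illustration. It also assumes a multiway split for the Gini index, whereas CART-style trees normally consider only binary splits.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of the value distribution in items."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())

def gini(items):
    """Gini impurity of the value distribution in items."""
    n = len(items)
    return 1.0 - sum((c / n) ** 2 for c in Counter(items).values())

def partition(values, labels):
    """Group the class labels by attribute value (one group per value)."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return list(groups.values())

def info_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    n = len(labels)
    parts = partition(values, labels)
    return entropy(labels) - sum(len(p) / n * entropy(p) for p in parts)

def gain_ratio(values, labels):
    """Information gain normalized by the split information."""
    split_info = entropy(values)  # entropy of the attribute values themselves
    return info_gain(values, labels) / split_info if split_info > 0 else 0.0

def gini_reduction(values, labels):
    """Gini impurity of the labels minus the weighted impurity after a
    multiway split (an assumption; CART uses binary splits)."""
    n = len(labels)
    parts = partition(values, labels)
    return gini(labels) - sum(len(p) / n * gini(p) for p in parts)

# Hypothetical toy data: 16 tuples, two balanced classes.
labels = ["yes"] * 8 + ["no"] * 8
# A useless multivalued attribute: a unique identifier per tuple.
row_id = list(range(16))
# A useful binary attribute: each value is 7/8 pure.
flag = ["a"] * 7 + ["b"] + ["a"] + ["b"] * 7

for name, attr in [("row_id", row_id), ("flag", flag)]:
    print(name,
          round(info_gain(attr, labels), 3),
          round(gain_ratio(attr, labels), 3),
          round(gini_reduction(attr, labels), 3))
```

On this toy data, information gain and the Gini reduction both rank the identifier-like `row_id` above `flag`, because every one-tuple partition is trivially pure. Gain ratio divides by the split information (4 bits for 16 distinct values, 1 bit for the balanced binary attribute) and therefore ranks `flag` higher.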

