Basic Principles of Attribute-Oriented Induction

  • Data focusing: select the task-relevant data, including dimensions; the result is the initial relation
  • Attribute-removal: remove attribute A if there is a large set of distinct values for A but (1) there is no generalization operator on A, or (2) A’s higher-level concepts are expressed in terms of other attributes
  • Attribute-generalization: If there is a large set of distinct values for A, and there exists a set of generalization operators on A, then select an operator and generalize A
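  • Attribute-threshold control: limit the number of distinct values allowed per attribute in the generalized relation; the threshold is user-specified or a default, typically 2-8
  • Generalized relation threshold control: limit the number of tuples in the generalized relation, which controls the final relation/rule size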
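
The removal, generalization, and attribute-threshold steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. The function name `aoi`, the representation of a concept hierarchy as a child-to-parent dictionary, and the sample student data are all assumptions made for the example. It also assumes a non-empty initial relation, treats an exhausted hierarchy the same as a missing one (the attribute is removed), and omits generalized relation threshold control.

```python
from collections import Counter

def aoi(rows, hierarchies, attr_threshold):
    """Generalize an initial relation (list of dicts) attribute by attribute.

    hierarchies maps an attribute name to a {value: parent_value} dict
    (a hypothetical encoding of a concept hierarchy).
    """
    rows = [dict(r) for r in rows]  # work on a copy of the initial relation
    for attr in list(rows[0]):
        parent = hierarchies.get(attr)
        # Attribute-threshold control: keep going while too many distinct values remain.
        while len({r[attr] for r in rows}) > attr_threshold:
            current = [r[attr] for r in rows]
            climbed = [parent.get(v, v) for v in current] if parent else None
            if climbed is None or climbed == current:
                # Attribute removal: no generalization operator is available
                # (or the hierarchy cannot be climbed any further).
                for r in rows:
                    del r[attr]
                break
            # Attribute generalization: climb one level in the hierarchy.
            for r, v in zip(rows, climbed):
                r[attr] = v
    # Merge identical generalized tuples and accumulate their counts.
    merged = Counter(tuple(sorted(r.items())) for r in rows)
    return [dict(key, count=n) for key, n in merged.most_common()]

# Made-up initial relation and hierarchies, for illustration only.
students = [
    {"name": "Ann", "city": "Vancouver", "major": "CS"},
    {"name": "Bob", "city": "Victoria", "major": "Math"},
    {"name": "Cho", "city": "Seattle", "major": "CS"},
    {"name": "Dee", "city": "Toronto", "major": "Physics"},
]
hierarchies = {
    "city": {"Vancouver": "Canada", "Victoria": "Canada",
             "Toronto": "Canada", "Seattle": "USA"},
    "major": {"CS": "Science", "Math": "Science", "Physics": "Science"},
}

for generalized_tuple in aoi(students, hierarchies, attr_threshold=2):
    print(generalized_tuple)
```

Here `name` has many distinct values and no hierarchy, so it is removed, while `city` and `major` are generalized until each has at most two distinct values; identical tuples are then merged with a count.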

