  • Three types of attributes
    • Nominal: values from an unordered set, e.g., color, profession
    • Ordinal: values from an ordered set, e.g., military or academic rank
    • Numeric: quantitative values, e.g., integer or real numbers
  • Discretization: Divide the range of a continuous attribute into intervals
    • Interval labels can then be used to replace actual data values
    • Reduce data size by discretization
    • Supervised (uses class labels) vs. unsupervised (does not use them)
    • Split (top-down) vs. merge (bottom-up)
    • Discretization can be performed recursively on an attribute
    • Prepare for further analysis, e.g., classification
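
As a concrete illustration of the discretization bullets above, here is a minimal sketch of unsupervised, equal-width binning in plain Python. The slide does not prescribe a specific method; equal-width binning is only one simple choice, and the function names `equal_width_bins` and `interval_label` as well as the sample `ages` data are hypothetical, introduced here for illustration.

```python
def equal_width_bins(values, k):
    """Split [min, max] of a continuous attribute into k equal-width intervals.

    Returns (edges, bin_indices), where bin_indices[i] is the 0-based
    interval that values[i] falls into.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    edges = [lo + i * width for i in range(k + 1)]
    bin_indices = []
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in one bin
        else:
            # The maximum value would compute index k, so clamp it to k - 1.
            idx = min(int((v - lo) / width), k - 1)
        bin_indices.append(idx)
    return edges, bin_indices


def interval_label(edges, idx):
    """Build a readable label for interval idx; the last interval is closed."""
    closing = "]" if idx == len(edges) - 2 else ")"
    return f"[{edges[idx]:g}, {edges[idx + 1]:g}{closing}"


# Hypothetical sample data: a numeric "age" attribute.
ages = [4, 8, 15, 21, 24, 25, 28, 34, 40]
edges, bins = equal_width_bins(ages, 3)

# Replace each actual value with its interval label.
labels = [interval_label(edges, b) for b in bins]
print(labels)
```

Nine distinct values are reduced to three interval labels, which is the sense in which discretization reduces data size. A supervised method would instead choose the cut points using class labels, and a recursive (top-down) method would apply the split again inside each resulting interval.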

