Classification in Large Databases

  • Classification: a classical problem extensively studied by statisticians and machine learning researchers
  • Scalability: Classifying data sets with millions of examples and hundreds of attributes with reasonable speed
  • Why is decision tree induction popular?
    • relatively fast learning speed (compared with other classification methods)
    • convertible to simple, easy-to-understand classification rules
    • can use SQL queries for accessing databases
    • classification accuracy comparable with other methods
  • RainForest (VLDB’98: Gehrke, Ramakrishnan & Ganti)
    • Builds an AVC-list (attribute, value, class label): for each attribute, the count of each class label per attribute value
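
The AVC-list idea can be illustrated with a minimal sketch. This is not the RainForest implementation. It only shows how class-label counts per (attribute, value) pair might be aggregated in a single scan of the data. The function name build_avc_group, the row format (a list of dicts), and the toy data are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def build_avc_group(rows, attributes, class_attr):
    """Aggregate class-label counts per attribute value in one pass.

    Returns {attribute: {value: Counter({class_label: count})}}.
    (Hypothetical helper; the name and data layout are illustrative.)
    """
    avc = {a: defaultdict(Counter) for a in attributes}
    for row in rows:
        label = row[class_attr]
        for a in attributes:
            # One counter per (attribute, value), keyed by class label.
            avc[a][row[a]][label] += 1
    return avc

# Toy training set (made up for this example).
rows = [
    {"age": "youth",  "student": "no",  "buys": "no"},
    {"age": "youth",  "student": "yes", "buys": "yes"},
    {"age": "senior", "student": "no",  "buys": "yes"},
    {"age": "senior", "student": "yes", "buys": "yes"},
]

avc = build_avc_group(rows, ["age", "student"], "buys")
for attribute, values in avc.items():
    for value, counts in values.items():
        print(attribute, value, dict(counts))
```

The size of such a structure depends on the number of distinct attribute values and class labels rather than on the number of rows, which is why count tables of this kind are attractive for large data sets: a split criterion such as information gain can be evaluated from the counts alone.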
