Feature Selection: Why?

  • Text collections have a large number of features

    • 10,000 – 1,000,000 unique words … and more

  • Can make it feasible to use a particular classifier

    • Some classifiers can’t deal with 1,000,000 features

  • Reduces training time

    • Training time for some methods is quadratic or worse in the number of features

  • Makes runtime models smaller and faster

  • Can improve generalization (performance on unseen data)

    • Eliminates noise features

    • Avoids overfitting
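
To make the size reduction concrete, here is a minimal sketch of one simple feature-selection method, document-frequency thresholding: keep only the k terms that appear in the most documents. The slide does not name a specific method, so this choice is an assumption made for illustration. The toy corpus and the helper names `select_features_by_df` and `vectorize` are hypothetical.

```python
from collections import Counter

def select_features_by_df(docs, k):
    """Return the k terms that occur in the most documents (hypothetical helper)."""
    df = Counter()
    for doc in docs:
        # Count each term at most once per document (document frequency).
        df.update(set(doc.lower().split()))
    # Rank by descending document frequency; break ties alphabetically
    # so the result is deterministic.
    ranked = sorted(df.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]

def vectorize(doc, vocabulary):
    """Map a document to term counts over the given vocabulary."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocabulary]

# Toy corpus, invented for illustration only.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "monday meeting moved to tuesday",
]

full_vocab = sorted({word for doc in docs for word in doc.lower().split()})
selected = select_features_by_df(docs, k=4)

print(len(full_vocab), "features before selection")
print(len(selected), "features after selection:", selected)
print(vectorize(docs[0], selected))
```

Each document vector shrinks from one entry per unique word to k entries, which is where the savings in training time and model size come from. Document frequency ignores class labels; label-aware criteria are often used instead, and this sketch does not claim to be the better choice.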

