Bigger collections

  • Consider N = 1 million documents, each with about 1000 words.

  • Avg 6 bytes/word, including spaces and punctuation

    • About 6 GB of data in the documents (10^6 documents × 10^3 words × 6 bytes).

  • Say there are M = 500K distinct terms among these.
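The arithmetic behind these figures can be checked with a short sketch. The function and variable names here are illustrative assumptions, not part of the slide, and "GB" is taken to mean 10^9 bytes. The average number of occurrences per distinct term is derived from the slide's numbers rather than stated on it.

```python
# Back-of-the-envelope collection statistics, using the slide's figures.
# All names are hypothetical; GB is assumed to mean 10**9 bytes.

N_DOCS = 1_000_000        # N: number of documents
WORDS_PER_DOC = 1_000     # approximate words per document
BYTES_PER_WORD = 6        # average, including spaces/punctuation
M_TERMS = 500_000         # M: number of distinct terms


def collection_stats(n_docs, words_per_doc, bytes_per_word, n_terms):
    """Return rough size estimates for a document collection."""
    total_words = n_docs * words_per_doc
    total_bytes = total_words * bytes_per_word
    return {
        "total_words": total_words,
        "total_bytes": total_bytes,
        "total_gb": total_bytes / 10**9,
        # Derived figure: how often each distinct term occurs on average.
        "avg_occurrences_per_term": total_words / n_terms,
    }


stats = collection_stats(N_DOCS, WORDS_PER_DOC, BYTES_PER_WORD, M_TERMS)
for name, value in stats.items():
    print(f"{name}: {value:,}")
```

Changing any of the four constants shows how quickly the raw text size grows with N or with document length.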

