Consider N = 1 million documents, each with about 1000 words.
Average of 6 bytes/word, including spaces and punctuation.
That gives about 6 GB of data in the documents (1 million docs × 1000 words × 6 bytes).
Say there are M = 500K distinct terms among these.
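The arithmetic behind these figures can be checked with a short sketch. The function and constant names below are illustrative, not from the slide. The last step, counting one cell for every (term, document) pair, is an assumption about why M is introduced here; the slide itself only states N, M, and the 6 GB total.

```python
# Back-of-the-envelope sizes for the collection described on the slide.
# All names are hypothetical; the numbers are the slide's own.
N_DOCS = 1_000_000       # N: number of documents
WORDS_PER_DOC = 1_000    # approximate words per document
BYTES_PER_WORD = 6       # average, including spaces/punctuation
M_TERMS = 500_000        # M: number of distinct terms


def corpus_bytes(n_docs: int, words_per_doc: int, bytes_per_word: int) -> int:
    """Total raw text size of the collection, in bytes."""
    return n_docs * words_per_doc * bytes_per_word


def term_document_cells(n_terms: int, n_docs: int) -> int:
    """Number of (term, document) pairs, assuming one cell per pair."""
    return n_terms * n_docs


if __name__ == "__main__":
    total = corpus_bytes(N_DOCS, WORDS_PER_DOC, BYTES_PER_WORD)
    print(f"{total / 10**9:.0f} GB of text")            # 6 GB of text
    cells = term_document_cells(M_TERMS, N_DOCS)
    print(f"{cells:,} term-document pairs")             # 500,000,000,000 term-document pairs
```

Here GB is taken as 10^9 bytes, which matches the slide's round figure.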
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License