Unstructured data in 1680

  • Which plays of Shakespeare contain the words Brutus AND Caesar but NOT Calpurnia?
  • One could grep all of Shakespeare’s plays for Brutus and Caesar, then strip out lines containing Calpurnia.
  • Why is that not the answer?
    • Slow (for large corpora)
    • NOT Calpurnia is non-trivial
    • Other operations (e.g., find the word Romans near countrymen) not feasible
    • No ranked retrieval (returning the best documents first)
      • Later lectures
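
To make the grep-style approach concrete, here is a minimal sketch of a linear scan in Python. It is an illustration under stated assumptions, not the method the lecture goes on to develop: the `plays` dictionary is a hypothetical toy corpus (real play titles, invented placeholder text), and `contains` and `linear_scan` are names made up for this example. Note that the NOT condition is tested against the whole play, which is one reason that stripping individual lines containing Calpurnia does not answer the query.

```python
import re

# Hypothetical toy corpus: the titles are real plays, but the text is
# invented placeholder content, not quotations from Shakespeare.
plays = {
    "Julius Caesar": "Brutus and Caesar and Calpurnia appear here",
    "Antony and Cleopatra": "Brutus is mentioned and so is Caesar",
    "Hamlet": "Brutus killed Caesar, says Polonius",
    "Macbeth": "no Romans in this one",
}


def contains(text, word):
    """Return True if `word` occurs in `text` as a whole word."""
    return re.search(rf"\b{re.escape(word)}\b", text) is not None


def linear_scan(plays, must_have, must_not_have):
    """Grep-style search: read every play in full for every query.

    The cost grows with the total size of the corpus, which is the
    'slow for large corpora' problem noted above.
    """
    hits = []
    for title, text in plays.items():
        has_all = all(contains(text, w) for w in must_have)
        # NOT is a property of the whole play, not of a single line.
        has_none = not any(contains(text, w) for w in must_not_have)
        if has_all and has_none:
            hits.append(title)
    return hits


result = linear_scan(plays, ["Brutus", "Caesar"], ["Calpurnia"])
print(result)
```

Every query rescans the full text of every play, and the scan has no notion of proximity (Romans near countrymen) or of which matching play is the best one to return.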
