Complications: Format/language

  • Documents being indexed can be written in many different languages

    • A single index may have to contain terms of several languages.

  • Sometimes a document or its components can contain multiple languages/formats

    • Example: a French email with a German PDF attachment.

  • What is a unit document?

    • A file?

    • An email? (Perhaps one of many in an mbox.)

    • An email with 5 attachments?

    • A group of files? (E.g. a PPT or LaTeX document exported as several HTML pages.)
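
To make the "unit document" question concrete, here is a minimal sketch using Python's standard `email` package. It builds a small French email with a PDF attachment that has a German filename, then lists the indexable units at two granularities: the whole message as one unit, or each body/attachment part as its own unit. The helper names (`build_sample_email`, `unit_documents`), the addresses, and the sample content are hypothetical, chosen only for illustration; a real indexer would also need format and language detection per unit.

```python
from email.message import EmailMessage


def build_sample_email() -> EmailMessage:
    """Build a hypothetical French email carrying a German-named PDF."""
    msg = EmailMessage()
    msg["Subject"] = "Rapport annuel"
    msg["From"] = "alice@example.org"
    msg["To"] = "bob@example.org"
    msg.set_content("Bonjour, veuillez trouver le rapport ci-joint.")
    # Placeholder bytes, not a valid PDF; only the MIME type matters here.
    msg.add_attachment(
        b"%PDF-1.4 placeholder",
        maintype="application",
        subtype="pdf",
        filename="bericht.pdf",
    )
    return msg


def unit_documents(msg: EmailMessage, granularity: str) -> list[tuple[str, str]]:
    """Return (content_type, name) pairs, one per unit document.

    granularity="message": the whole email is a single unit.
    granularity="part":    the body and each attachment are separate units.
    """
    if granularity == "message":
        return [(msg.get_content_type(), str(msg["Subject"]))]
    units = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the container, keep only leaf parts
        units.append((part.get_content_type(), part.get_filename() or "body"))
    return units


if __name__ == "__main__":
    sample = build_sample_email()
    print(unit_documents(sample, "message"))
    print(unit_documents(sample, "part"))
```

Indexing per part lets each unit be parsed and tokenized according to its own format and language, at the cost of losing the link between a body and its attachments unless that link is stored separately.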
