Issues with main and auxiliary indexes

  • Problem of frequent merges: the same postings are read and rewritten again and again

  • Poor search performance while a merge is in progress

  • Actually:

    • Merging of the auxiliary index into the main index is efficient if we keep a separate file for each postings list.

    • The merge is then just a simple append to each file.

    • But then we would need a very large number of files (one per term), which is inefficient for the OS.

  • Assumption for the rest of the lecture: The index is one big file.

  • In reality: Use a scheme somewhere in between (e.g., split very large postings lists across several files, collect all postings lists of length 1 in one file, etc.)
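
To make the "merge is a simple append" point concrete, here is a minimal sketch of the one-file-per-postings-list layout. It is an illustration only, not the scheme used in the lecture: the function names, the `.postings` file suffix, and the one-docID-per-line text format are all hypothetical. It also assumes that new documents receive larger docIDs than any already indexed (so appending keeps each list sorted) and that terms are safe to use as file names.

```python
import os

def merge_aux_into_main(aux_index, index_dir):
    """Merge an in-memory auxiliary index into a file-per-term main index.

    aux_index maps term -> sorted list of docIDs (hypothetical layout).
    Assumes every docID here is larger than those already on disk,
    so appending preserves sorted order.
    """
    for term, postings in aux_index.items():
        path = os.path.join(index_dir, term + ".postings")
        # Opening in append mode is the entire "merge" for this term.
        with open(path, "a", encoding="utf-8") as f:
            for doc_id in postings:
                f.write(f"{doc_id}\n")
    # The auxiliary index is empty again after the merge.
    aux_index.clear()

def read_postings(term, index_dir):
    """Return the postings list stored for term, or [] if there is none."""
    path = os.path.join(index_dir, term + ".postings")
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [int(line) for line in f]
```

Each merge only touches the files of terms that occur in the auxiliary index, and nothing already on disk is rewritten. The cost is visible in the directory listing: one file per distinct term, which is the overhead the slide points to.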

