SPIMI: Single-pass in-memory indexing
Key idea 1: Generate separate dictionaries for each block – no need to maintain term-termID mapping across blocks.
Key idea 2: Don’t sort. Accumulate postings in postings lists as they occur.
With these two ideas we can generate a complete inverted index for each block.
These separate indexes can then be merged into one big index.
Merging of blocks is analogous to BSBI.
Compression makes SPIMI even more efficient.
Compression of terms
Compression of postings
See next lecture