Bigram (k-gram) indexes

  • Enumerate all k-grams (sequences of k characters) occurring in any term

  • e.g., from the text “April is the cruelest month” we get the 2-grams (bigrams):

    $a, ap, pr, ri, il, l$, $i, is, s$, $t, th, he, e$, $c, cr, ru,

    ue, el, le, es, st, t$, $m, mo, on, nt, th, h$

    • $ is a special word boundary symbol

  • Maintain a second inverted index from bigrams to the dictionary terms that contain each bigram.
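
The bullets above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `kgrams` and `build_kgram_index` are hypothetical, and the sketch assumes terms are already lowercased and that `$` never occurs inside a term.

```python
from collections import defaultdict


def kgrams(term, k=2, boundary="$"):
    """Return the k-grams of a term, padded with the boundary symbol on both ends."""
    padded = f"{boundary}{term}{boundary}"
    return [padded[i:i + k] for i in range(len(padded) - k + 1)]


def build_kgram_index(terms, k=2):
    """Map each k-gram to the set of dictionary terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        for gram in kgrams(term, k):
            index[gram].add(term)
    return index


# Assumption: the dictionary is just the lowercased words of the example text.
terms = "april is the cruelest month".split()
index = build_kgram_index(terms)

print(kgrams("month"))       # bigrams of a single term
print(sorted(index["th"]))   # terms whose padded form contains "th"
```

Looking up a bigram such as `th` returns every term containing it (here both "the" and "month"), which is the second inverted index described above.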

