Scaling FP-growth by Database Projection

  • What if the FP-tree cannot fit in memory?
    • Use database (DB) projection
  • First, partition the database into a set of projected DBs
  • Then construct and mine an FP-tree for each projected DB
  • Parallel projection vs. partition projection techniques
    • Parallel projection
      • Project the DB in parallel for each frequent item
      • Space-costly: a transaction is copied into the projected DB of every frequent item it contains
      • All the partitions can be processed in parallel
    • Partition projection
      • Partition the DB based on the ordered frequent items, so each transaction lands in only one projected DB at a time
      • After a partition is processed, pass the unprocessed parts on to the subsequent partitions
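
The two projection schemes above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`order_frequent_items`, `parallel_projection`, `partition_projection`, `pass_on`), the in-memory list representation, and the tie-breaking rule (alphabetical among equally frequent items) are all assumptions made for the example. Mining each projected DB with an FP-tree is left out.

```python
from collections import Counter, defaultdict


def order_frequent_items(db, min_support):
    """Drop infrequent items and sort each transaction by descending support."""
    counts = Counter(item for t in db for item in set(t))
    frequent = {i for i, c in counts.items() if c >= min_support}
    ordered = [
        # Ties are broken alphabetically (an assumption of this sketch).
        sorted((i for i in set(t) if i in frequent), key=lambda i: (-counts[i], i))
        for t in db
    ]
    return [t for t in ordered if t]


def parallel_projection(ordered_db):
    """Copy each transaction's prefix into the projected DB of EVERY item in it."""
    projected = defaultdict(list)
    for t in ordered_db:
        for pos in range(1, len(t)):
            projected[t[pos]].append(t[:pos])
    return dict(projected)


def partition_projection(ordered_db):
    """Place each transaction's prefix only in the projected DB of its LAST item."""
    projected = defaultdict(list)
    for t in ordered_db:
        if len(t) > 1:
            projected[t[-1]].append(t[:-1])
    return projected


def pass_on(projected, item):
    """After item's projected DB has been mined, hand each of its prefixes
    to the projected DB of that prefix's own last item."""
    for prefix in projected.pop(item, []):
        if len(prefix) > 1:
            projected[prefix[-1]].append(prefix[:-1])


db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "d"]]
ordered = order_frequent_items(db, min_support=2)

par = parallel_projection(ordered)   # all projected DBs exist at once
part = partition_projection(ordered)  # only one copy of each transaction
pass_on(part, "c")                    # done with "c": forward what remains
```

In this toy run, parallel projection materializes the "b" and "c" projected DBs up front and could hand them to separate workers, while partition projection initially stores each transaction once and only completes the "b" projected DB after "c" has been processed and passed on. That is the space-versus-parallelism trade-off the bullets describe.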

