Defining Negatively Correlated Patterns (I)

  • Definition 1 (support-based)
    • If itemsets X and Y are both frequent but rarely occur together, i.e.,
      sup(X U Y) < sup(X) * sup(Y)
    • Then X and Y are negatively correlated
  • Problem: A store sold two needle packages, A and B, 100 times each, but only one transaction contained both A and B.
    • When there are in total 200 transactions, we have
      s(A U B) = 0.005, s(A) * s(B) = 0.25, s(A U B) < s(A) * s(B)
    • When there are 10^5 transactions, we have
      s(A U B) = 1/10^5, s(A) * s(B) = 1/10^3 * 1/10^3, s(A U B) > s(A) * s(B)
    • Where is the problem? Null transactions (those containing neither A nor B): the support-based definition is not null-invariant!
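The needle-package example can be checked numerically. The sketch below is only an illustration: the function name `support_based_check` and its parameters are hypothetical, not from the slides, and the extra case of 10^4 transactions is added here to show where the comparison tips over. It uses exact fractions so the inequality is not blurred by floating-point rounding.

```python
from fractions import Fraction


def support_based_check(n_a, n_b, n_ab, n_total):
    """Apply Definition 1 (support-based) to raw transaction counts.

    n_a, n_b : number of transactions containing A, containing B
    n_ab     : number of transactions containing both A and B
    n_total  : total number of transactions (including null transactions)
    Returns (s(A U B), s(A) * s(B), is_negative).
    """
    s_ab = Fraction(n_ab, n_total)
    s_a_times_s_b = Fraction(n_a, n_total) * Fraction(n_b, n_total)
    # Definition 1: negatively correlated if s(A U B) < s(A) * s(B)
    return s_ab, s_a_times_s_b, s_ab < s_a_times_s_b


# Same purchases of A and B every time; only the total transaction count changes.
for n_total in (200, 10**4, 10**5):
    s_ab, product, is_negative = support_based_check(100, 100, 1, n_total)
    print(f"N={n_total}: s(A U B)={float(s_ab):g}, "
          f"s(A)*s(B)={float(product):g}, negative={is_negative}")
```

With these counts, s(A U B) = 1/N and s(A) * s(B) = 10^4/N^2, so the verdict depends only on N: the pair is labeled negatively correlated while N < 10^4 and stops being so once N reaches 10^4, even though the transactions involving A and B are identical.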

