Constraint-Based Pattern Mining


Why do we use constraint-based pattern mining? Because constraints let us apply different pruning methods to restrict the pattern mining process.

The main reasons are:

  • Finding all the patterns in a dataset autonomously? Unrealistic!
    • There are too many patterns, and most are not necessarily of interest to the user!
  • Pattern mining should be an interactive process
    • The user directs what is to be mined using a data mining query language (or a graphical user interface)
  • Constraint-based mining
    • User flexibility: the user provides constraints on what is to be mined
    • Optimization: the system exploits such constraints for efficient mining
      • Constraint pushing: similar to pushing selections first in DB query processing

Constraints in General Data Mining

A data mining query can take the form of a meta-rule or use the following language primitives:

* Knowledge type constraint
  * Ex.: classification, association, clustering, outlier finding, etc.
* Data constraint: using SQL-like queries
  * Ex.: find products sold together in NY stores this year
* Dimension/level constraint
  * Ex.: in relevance to region, price, brand, customer category
* Rule (or pattern) constraint
  * Ex.: small sales (price < $10) trigger big sales (sum > $200)
* Interestingness constraint
  * Ex.: strong rules: min_sup ≥ 0.02, min_conf ≥ 0.6, min_correlation ≥ 0.7
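To make the interestingness constraint concrete, here is a minimal sketch that checks whether a rule is "strong" by its support and confidence. The transactions, the item names, and the helper names `rule_stats` and `is_strong` are invented for illustration; the correlation threshold is left out to keep the example short.

```python
# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
    {"milk", "bread"},
]

def rule_stats(antecedent, consequent, transactions):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    a = set(antecedent)
    both = a | set(consequent)
    n_a = sum(1 for t in transactions if a <= t)
    n_both = sum(1 for t in transactions if both <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_a if n_a else 0.0
    return support, confidence

def is_strong(antecedent, consequent, transactions, min_sup=0.02, min_conf=0.6):
    """A rule is strong if it meets both the support and confidence thresholds."""
    support, confidence = rule_stats(antecedent, consequent, transactions)
    return support >= min_sup and confidence >= min_conf

strong_1 = is_strong(("milk",), ("bread",), TRANSACTIONS)  # support 0.6, confidence 0.75
strong_2 = is_strong(("bread",), ("eggs",), TRANSACTIONS)  # support 0.2, confidence 0.25
```

With these thresholds, the first rule passes both tests while the second fails on confidence.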

Different Kinds of Constraints: Different Pruning Methods

  • Constraints can be categorized as
    • Pattern space pruning constraints vs. data space pruning constraints
  • Pattern space pruning constraints
    • Anti-monotonic: if an itemset violates constraint c, so does every superset, so further mining of it can be terminated
    • Monotonic: if an itemset satisfies c, so does every superset, so there is no need to check c again
    • Succinct: the constraint c can be enforced by directly manipulating the data
    • Convertible: c can be converted to monotonic or anti-monotonic if items are properly ordered during processing
  • Data space pruning constraints
    • Data succinct: the data space can be pruned at the start of the pattern mining process
    • Data anti-monotonic: if a transaction t does not satisfy c, then t can be pruned to reduce data processing effort
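As a rough illustration of pattern space pruning, the sketch below enumerates itemsets level by level under an anti-monotonic constraint. It assumes the constraint is "total price of the itemset ≤ budget" with non-negative prices; the price table and the names `satisfies` and `enumerate_with_pruning` are hypothetical. Support counting is omitted so that only the constraint pruning is visible.

```python
# Hypothetical item prices (non-negative), invented for illustration.
PRICES = {"a": 3, "b": 5, "c": 8, "d": 12}

def satisfies(itemset, budget):
    """Anti-monotonic constraint: the total price must not exceed the budget."""
    return sum(PRICES[i] for i in itemset) <= budget

def enumerate_with_pruning(items, budget):
    """Level-wise enumeration that only extends itemsets satisfying the constraint."""
    items = sorted(items)
    level = [(i,) for i in items if satisfies((i,), budget)]
    result = []
    while level:
        result.extend(level)
        next_level = []
        for s in level:
            # Extend only with items after the last one to avoid duplicates.
            for i in items:
                if i > s[-1]:
                    candidate = s + (i,)
                    # A violating candidate is dropped here, so none of its
                    # supersets is ever generated.
                    if satisfies(candidate, budget):
                        next_level.append(candidate)
        level = next_level
    return result

found = enumerate_with_pruning(PRICES, budget=11)
```

Because adding an item can only increase the total price, an itemset that exceeds the budget is never extended, and no valid itemset is lost by skipping its supersets.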

Pattern Anti-monotonicity


Note: because the support of an itemset S can only decrease (or stay the same) as more items are added, the answer to Ex. 4 is yes.
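The claim that support never grows as an itemset grows can be checked on a small example. This is only a sketch on invented data; the transaction list and the `support` helper are assumptions made for illustration.

```python
# Toy transaction database (hypothetical data).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]

def support(itemset, transactions):
    """Count the transactions that contain every item of the itemset."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= t)

# A chain of itemsets, each a superset of the previous one.
chain = [("a",), ("a", "b"), ("a", "b", "c"), ("a", "b", "c", "d")]
supports = [support(s, TRANSACTIONS) for s in chain]
# supports == [4, 3, 2, 1]: each added item can only shrink the matching set.
```

So once an itemset falls below a minimum support threshold, every superset is below it too, which is what makes a minimum support constraint anti-monotonic.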

Pattern Monotonicity

Data Anti-monotonicity
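A minimal sketch of data space pruning, assuming the constraint c is "total price of the pattern ≥ min_total" with non-negative prices. If all items of a transaction together cannot reach min_total, no pattern contained in that transaction can satisfy c, so the transaction can be dropped. The prices, transactions, and the name `prune_transactions` are hypothetical, and the sketch ignores the refinement of re-checking transactions against the remaining items as mining proceeds.

```python
# Hypothetical item prices and transactions, invented for illustration.
PRICES = {"a": 10, "b": 20, "c": 5, "d": 40}
TRANSACTIONS = [
    {"a", "c"},
    {"a", "b", "d"},
    {"c"},
    {"b", "d"},
]

def transaction_total(transaction):
    """Total price of all items in one transaction."""
    return sum(PRICES[i] for i in transaction)

def prune_transactions(transactions, min_total):
    """Keep only transactions that could still support a pattern satisfying c."""
    return [t for t in transactions if transaction_total(t) >= min_total]

kept = prune_transactions(TRANSACTIONS, min_total=50)
# Totals are 15, 70, 5, 60, so only the second and fourth transactions remain.
```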

Succinct Constraints

Convertible Constraints



Note that we always generate patterns following the right item order.
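To see why the item order matters, consider a sketch that assumes the constraint "average price of the itemset ≥ min_avg". This constraint is neither monotonic nor anti-monotonic in general, but if patterns are grown by appending items in descending price order, it behaves anti-monotonically: once a prefix falls below min_avg, appending cheaper items cannot raise the average. The prices and the names `avg_price` and `mine_avg_at_least` are hypothetical.

```python
# Hypothetical item prices, invented for illustration.
PRICES = {"a": 40, "b": 30, "c": 20, "d": 10}

def avg_price(itemset):
    """Average price of the items in a non-empty itemset."""
    return sum(PRICES[i] for i in itemset) / len(itemset)

def mine_avg_at_least(min_avg):
    """Grow patterns in descending price order, pruning violating prefixes."""
    order = sorted(PRICES, key=PRICES.get, reverse=True)
    result = []

    def grow(prefix, start):
        for k in range(start, len(order)):
            candidate = prefix + (order[k],)
            if avg_price(candidate) >= min_avg:
                result.append(candidate)
                grow(candidate, k + 1)
            # Otherwise prune: every later item is no more expensive than
            # order[k], so no extension of this candidate can recover.

    grow((), 0)
    return result

patterns = mine_avg_at_least(25)
```

With the items in ascending price order instead, a violating prefix could still be rescued by a later, more expensive item, so the same pruning would lose valid patterns.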

