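Speaker bio: Yuehua Wu (吴月华) is a Professor of Statistics at York University, Canada. Professor Wu received a Ph.D. in statistics from the University of Pittsburgh, USA, in 1989 under the supervision of the world-renowned statistician C.R. Rao. Professor Wu's research areas include financial statistics, spatial statistics, high-dimensional data analysis, change-point detection, and applications in interdisciplinary fields such as environmental science. Professor Wu is an Elected Member of the International Statistical Institute, leads several major research projects funded by the Canadian government, and has published more than one hundred academic papers, including five in the top-tier international journal Proceedings of the National Academy of Sciences of the United States of America (PNAS).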
Talk title: Association rule mining and market basket analysis
Abstract: Current algorithms for association rule mining from transaction data are mostly deterministic and enumerative. Unless the search space is constrained, they can be computationally intractable even for a dataset containing just a few hundred transaction items. In this talk, we first briefly review the Apriori algorithm. We then introduce a Gibbs-sampling-induced stochastic search procedure that randomly samples association rules from the itemset space, and we perform rule mining on the reduced transaction dataset generated by the sample. A general rule importance measure is also proposed to direct the stochastic search. Because the randomly generated association rules constitute an ergodic Markov chain, the overall most important rules in the itemset space can be uncovered from the reduced dataset with probability 1 in the limit. We end the talk by presenting some data examples.
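For readers unfamiliar with the Apriori algorithm reviewed in the talk, the following is a minimal textbook-style sketch in Python. It is not the speaker's implementation and does not cover the Gibbs-sampling search. The function names (`apriori`, `rules`), the thresholds, and the toy baskets are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in tx) / n

    # Level 1: frequent single items.
    current = {}
    for item in {i for t in tx for i in t}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    out = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                a = frozenset(antecedent)
                # Subsets of a frequent itemset are frequent, so the lookup is safe.
                conf = sup / frequent[a]
                if conf >= min_confidence:
                    out.append((a, itemset - a, sup, conf))
    return out


# Toy market-basket data (hypothetical, for illustration only).
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(baskets, min_support=0.6)
found = rules(frequent, min_confidence=0.8)
for antecedent, consequent, sup, conf in found:
    print(sorted(antecedent), "->", sorted(consequent), sup, conf)
```

The prune step is what constrains the search space: a candidate is counted only if all of its subsets were already frequent. Without such a constraint, the number of itemsets to enumerate grows exponentially with the number of items, which is the intractability the abstract refers to.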