Tomasz Imielinski - Data Mining - Association Rules: Twenty Years and Beyond : Center for Science of Information

Monday, October 22, 2012 2:30 PM - 3:30 PM EDT
LWSN 1142
Purdue University

Association rules and frequent itemsets were introduced by Agrawal, Imielinski, and Swami in their 1993 ACM SIGMOD paper, which ten years later won the SIGMOD Test of Time Award. In that paper and in subsequent work, the purpose of data mining was defined in database terms: generate a massive number of rules from the underlying data to discover the "unexpected" rather than to confirm a given hypothesis. Since the 1993 SIGMOD paper, thousands of papers have been published in leading database and machine learning conferences and journals on fast generation of association rules and frequent itemsets, on rule filtering and ranking by statistical significance, and on different types of rules. Today, association rules are used in a wide range of applications, from retail and e-commerce to finance and computational biology. Major data analysis software packages from vendors such as SAS, Oracle, IBM, and Microsoft (SQL Server) now support association rules and provide implementations of variants of the Apriori algorithm for fast frequent itemset generation.

I will provide a general overview of nearly twenty years of work (with a bit of personal perspective) and discuss new challenges and opportunities for further work on association rules in the age of "Big Data".