QUT ePrints

Concise representations for association rules in multi-level datasets

Xu, Yue, Shaw, Gavin, & Li, Yuefeng (2009) Concise representations for association rules in multi-level datasets. Journal of Systems Science and Systems Engineering, 18(1), pp. 53-70.

Abstract

Association rule mining has made many advances in the area of knowledge discovery. However, the quality of the discovered association rules is a major concern and has drawn increasing attention recently. One problem is the huge size of the extracted rule set: for a given dataset, a huge number of rules can often be extracted, but many of them are redundant with respect to other rules and thus useless in practice. Mining non-redundant rules is a promising approach to solving this problem. In this paper, we first propose a definition of redundancy; we then propose a concise representation, called the Reliable basis, for representing non-redundant association rules, covering both exact rules and approximate rules. An important contribution of this paper is that we propose to use the certainty factor as the criterion for measuring the strength of the discovered association rules. With this criterion, we can determine the boundary between redundancy and non-redundancy, so that as many redundant rules as possible are eliminated without reducing the inference capacity of, or the belief in, the remaining non-redundant rules. We prove that redundancy elimination based on the proposed Reliable basis does not reduce the belief in the extracted rules. We also prove that all association rules can be deduced from the Reliable basis. The Reliable basis is therefore a lossless representation of association rules. Experimental results show that the proposed Reliable basis can significantly reduce the number of extracted rules.
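
The abstract names the certainty factor as the measure of rule strength but does not give its formula. The sketch below assumes the commonly used definition for a rule A → B, which compares the rule's confidence with the support of the consequent; the paper's exact formulation may differ. The function names `support` and `certainty_factor` and the toy transactions are illustrative only and do not come from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def certainty_factor(transactions, antecedent, consequent):
    """Certainty factor of the rule antecedent -> consequent.

    Assumed (standard) definition, with conf = supp(A u B) / supp(A):
      conf > supp(B):  (conf - supp(B)) / (1 - supp(B))
      conf < supp(B):  (conf - supp(B)) / supp(B)
      otherwise:       0
    The result lies in [-1, 1]; an exact rule (conf = 1) scores 1
    whenever supp(B) < 1.
    """
    supp_a = support(transactions, antecedent)
    if supp_a == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    supp_b = support(transactions, consequent)
    supp_ab = support(transactions, set(antecedent) | set(consequent))
    conf = supp_ab / supp_a
    if conf > supp_b:
        # conf <= 1, so supp_b < 1 here and the denominator is non-zero.
        return (conf - supp_b) / (1 - supp_b)
    if conf < supp_b:
        # supp_b > conf >= 0, so the denominator is non-zero.
        return (conf - supp_b) / supp_b
    return 0.0


# Hypothetical toy dataset, for illustration only.
toy = [{"a", "b"}, {"a", "b"}, {"a"}, {"b"}, {"c"}]
cf_ab = certainty_factor(toy, {"a"}, {"b"})
cf_ac = certainty_factor(toy, {"a"}, {"c"})
```

Under this definition, a positive value means the antecedent raises belief in the consequent, a negative value means it lowers it, and zero means the two are independent.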

Impact and interest:

0 citations in Scopus
1 citation in Web of Science®

Full-text downloads:

174 since deposited on 18 Jan 2010
62 in the past twelve months

ID Code: 29770
Item Type: Journal Article
Keywords: Association rule mining, redundant association rules, closed itemsets, multi-level datasets
DOI: 10.1007/s11518-009-5098-x
ISSN: 1004-3756
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > LIBRARY AND INFORMATION STUDIES (080700)
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Past > Schools > School of Information Technology
Copyright Owner: Copyright 2009 Springer
Copyright Statement: The original publication is available at SpringerLink http://www.springerlink.com
Deposited On: 19 Jan 2010 07:47
Last Modified: 01 Mar 2012 00:09
