Using extended random set to find specific patterns

Albathan, Mubarak, Li, Yuefeng, & Xu, Yue (2014) Using extended random set to find specific patterns. In Skowron, Andrzej, Dey, Lipika, Krasuski, Adam, & Li, Yuefeng (Eds.) Proceedings of the 2014 IEEE/WIC/ACM International Joint Conference on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) - Volume 2, IEEE, Warsaw, Poland, pp. 30-37.


With the overwhelming increase in the amount of data on the web and in databases, many text mining techniques have been proposed for mining useful patterns in text documents. Extracting closed sequential patterns using the Pattern Taxonomy Model (PTM) is one of the pruning methods used to remove noisy, inconsistent, and redundant patterns. However, PTM treats each extracted pattern as a whole without considering the terms it contains, which can affect the quality of the extracted patterns. This paper proposes an innovative and effective method that extends the random set to accurately weigh patterns based on their distribution in the documents and the distribution of their terms in patterns. The proposed approach then finds the specific closed sequential patterns (SCSP) based on the newly calculated weight. Experimental results on the Reuters Corpus Volume 1 (RCV1) data collection and TREC topics show that the proposed method significantly outperforms other state-of-the-art methods on several popular measures.
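To make the terms in the abstract concrete, the following is a minimal sketch of closed sequential pattern filtering followed by a simple term-level reweighting. It is not the paper's extended random set formula, which the abstract does not give. The function names, the toy paragraphs, and the rule of spreading each pattern's support evenly over its terms are all assumptions made for illustration.

```python
from collections import Counter

def is_subsequence(short, long):
    """True if `short` occurs in `long` in order (gaps allowed)."""
    it = iter(long)
    return all(term in it for term in short)

def support(pattern, paragraphs):
    """Number of paragraphs that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, p) for p in paragraphs)

def closed_patterns(patterns, paragraphs):
    """Keep patterns that have no longer super-pattern with equal support."""
    sup = {p: support(p, paragraphs) for p in patterns}
    return {
        p: s
        for p, s in sup.items()
        if not any(
            len(q) > len(p) and is_subsequence(p, q) and sup[q] == s
            for q in patterns
        )
    }

def term_weights(closed):
    """Hypothetical weighting: share each pattern's support among its terms."""
    weights = Counter()
    for pattern, s in closed.items():
        for term in pattern:
            weights[term] += s / len(pattern)
    return weights

def pattern_weight(pattern, weights):
    """Score a pattern by summing the weights of the terms it contains."""
    return sum(weights[t] for t in pattern)

# Toy data (invented for illustration): each paragraph is a term sequence.
paragraphs = [("oil", "price", "rise"), ("oil", "price"), ("oil", "rise")]
candidates = [
    ("oil",), ("price",), ("rise",),
    ("oil", "price"), ("oil", "rise"), ("oil", "price", "rise"),
]

closed = closed_patterns(candidates, paragraphs)
weights = term_weights(closed)
ranked = sorted(closed, key=lambda p: pattern_weight(p, weights), reverse=True)
```

Here `("price",)` and `("rise",)` are dropped because a longer pattern has the same support, which is the redundancy that closed-pattern pruning removes. The reweighting step then scores each remaining pattern by its terms rather than treating it as an indivisible unit.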

Impact and interest:

2 citations in Scopus

ID Code: 80081
Item Type: Conference Paper
Refereed: Yes
Keywords: Extended random set, Information retrieval, Select top-k patterns, Specific closed sequential patterns, Text mining
DOI: 10.1109/WI-IAT.2014.77
ISBN: 9781479941438
Divisions: Current > Schools > School of Electrical Engineering & Computer Science
Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2014 by IEEE
Deposited On: 15 Jan 2015 02:29
Last Modified: 16 Jan 2015 00:03
