Using patterns co-occurrence matrix for cleaning closed sequential patterns for text mining

Albathan, Mubarak, Li, Yuefeng, & Algarni, Abdulmohsen (2012) Using patterns co-occurrence matrix for cleaning closed sequential patterns for text mining. In Zhong, Ning & Gong, Zhiguo (Eds.) 2012 IEEE/WIC/ACM International Conference on Web Intelligence, IEEE, Macau, China, pp. 201-205.

Abstract

With the overwhelming increase in the amount of text on the web, it is almost impossible for people to keep abreast of up-to-date information. Text mining is a process by which interesting information is derived from text through the discovery of patterns and trends. Text mining algorithms are used to guarantee the quality of extracted knowledge. However, patterns extracted using text or data mining algorithms are often noisy and inconsistent. Several challenges therefore arise: how to understand these patterns, whether the model used is suitable, and whether all the extracted patterns are relevant. A further question is how to assign a correct weight to the extracted knowledge. To address these issues, this paper presents a text post-processing method that uses a pattern co-occurrence matrix to find the relations between extracted patterns in order to reduce noisy patterns. The main objective is not only to reduce the number of closed sequential patterns but also to improve the performance of pattern mining. Experimental results on the Reuters Corpus Volume 1 data collection and TREC filtering topics show that the proposed method is promising.
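
The abstract does not spell out how the co-occurrence matrix is built or how it is used to discard patterns, so the following is only a rough sketch of the general idea, not the authors' method. It assumes each document is represented by the set of patterns found in it, counts for every pair of patterns the number of documents containing both, and then drops patterns whose total co-occurrence falls below a threshold. The function names, the `min_links` threshold, the pruning rule, and the toy data are all hypothetical.

```python
from itertools import combinations


def cooccurrence_matrix(doc_patterns):
    """Count, for each pair of patterns, the documents that contain both."""
    patterns = sorted({p for doc in doc_patterns for p in doc})
    index = {p: i for i, p in enumerate(patterns)}
    n = len(patterns)
    matrix = [[0] * n for _ in range(n)]
    for doc in doc_patterns:
        # Each unordered pair of distinct patterns in a document counts once.
        for a, b in combinations(sorted(set(doc)), 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1
            matrix[j][i] += 1
    return patterns, matrix


def prune_isolated(patterns, matrix, min_links=1):
    """Assumed pruning rule: keep patterns whose row sum reaches min_links."""
    return [p for p, row in zip(patterns, matrix) if sum(row) >= min_links]


# Hypothetical toy data: patterns (term tuples) extracted from three documents.
docs = [
    {("oil", "price"), ("crude", "oil")},
    {("oil", "price"), ("crude", "oil"), ("opec",)},
    {("weather",)},
]

patterns, matrix = cooccurrence_matrix(docs)
kept = prune_isolated(patterns, matrix, min_links=2)
print(kept)
```

In this toy run the pattern `("weather",)` never appears alongside any other pattern, so the assumed rule treats it as noise and removes it. The paper's actual criterion for identifying noisy patterns may differ.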

ID Code: 58289
Item Type: Conference Paper
Refereed: Yes
ISBN: 9780769548807
Subjects: Australian and New Zealand Standard Research Classification > ENGINEERING (090000)
Divisions: Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2012 IEEE
Copyright Statement: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Deposited On: 14 Mar 2013 23:20
Last Modified: 02 Jul 2017 08:31