Clustering XML Documents Using Closed Frequent Subtrees: A Structural Similarity Approach
Kutty, Sangeetha, Tran, Tien, Nayak, Richi, & Li, Yuefeng (2008) Clustering XML Documents Using Closed Frequent Subtrees: A Structural Similarity Approach. In Fuhr, Norbert, Kamps, Jaap, Lalmas, Mounia, Malik, Saadia, & Trotman, Andrew (Eds.) 6th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2007, December 17-19, 2007, Dagstuhl Castle, Germany.
This paper presents an experimental study conducted over the INEX 2007 Document Mining Challenge corpus using a frequent subtree-based incremental clustering approach. Closed frequent subtrees are first generated from the structural information of the XML documents. A matrix is then built that represents the distribution of these closed frequent subtrees across the documents, and this matrix is used to cluster the XML documents progressively. Despite the large number of documents in the INEX 2007 Wikipedia dataset, the proposed approach clustered the documents successfully.
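The pipeline described in the abstract (subtree-by-document matrix, then progressive clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the subtree mining step is omitted, and the function names (`build_matrix`, `cosine`, `incremental_cluster`), the binary presence encoding, the cosine similarity measure, and the threshold value are all assumptions made for illustration only.

```python
from typing import Dict, List, Set


def build_matrix(doc_subtrees: Dict[str, Set[str]],
                 subtrees: List[str]) -> Dict[str, List[int]]:
    """One row per document: 1 if the document contains the subtree, else 0.

    `doc_subtrees` is assumed to come from a separate (hypothetical)
    closed frequent subtree mining step that is not shown here.
    """
    return {doc: [1 if s in present else 0 for s in subtrees]
            for doc, present in doc_subtrees.items()}


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def incremental_cluster(matrix: Dict[str, List[int]],
                        threshold: float = 0.5) -> List[List[str]]:
    """Assign each document, in order, to the most similar existing cluster.

    A document starts a new cluster when no cluster centroid reaches
    the (assumed) similarity threshold.
    """
    clusters = []  # each cluster keeps its member ids and the sum of their rows
    for doc, row in matrix.items():
        best, best_sim = None, threshold
        for cluster in clusters:
            size = len(cluster["docs"])
            centroid = [v / size for v in cluster["sum"]]
            sim = cosine(row, centroid)
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"docs": [doc], "sum": list(row)})
        else:
            best["docs"].append(doc)
            best["sum"] = [s + v for s, v in zip(best["sum"], row)]
    return [cluster["docs"] for cluster in clusters]


# Toy input: subtree identifiers t1..t5 are placeholders, not real mined subtrees.
toy_docs = {
    "d1": {"t1", "t2"},
    "d2": {"t1", "t2", "t3"},
    "d3": {"t4"},
    "d4": {"t4", "t5"},
}
toy_matrix = build_matrix(toy_docs, ["t1", "t2", "t3", "t4", "t5"])
toy_clusters = incremental_cluster(toy_matrix, threshold=0.5)
```

On this toy input, documents that share subtrees end up grouped together (`d1` with `d2`, `d3` with `d4`). Because each document is compared only against the current cluster centroids, the whole collection never needs to be compared pairwise, which is one plausible reason an incremental scheme suits a large corpus.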
|Item Type:||Conference Paper|
|Additional Information:||For more information, please refer to the journal's or conference's website (see hypertext link) or contact the author.|
|Divisions:||Past > QUT Faculties & Divisions > Faculty of Science and Technology|
|Copyright Owner:||Copyright 2008 Springer|
|Deposited On:||26 Feb 2009 05:45|
|Last Modified:||09 Jun 2010 13:27|