Enhancing an incremental clustering algorithm for web page collections

& (2009) Enhancing an incremental clustering algorithm for web page collections. In Lim, E, Pasi, G, Berendt, B, Bertino, E, & Baeza-Yates, R (Eds.) Proceedings of the 2009 IEEE/WIC/ACM International Conference on Web Intelligence. IEEE Computer Society Conference Publishing Services, United States, pp. 1-4.

Description

With the size and state of the Internet today, a good-quality approach to organizing this mass of information is of great importance. Clustering web pages into groups of similar documents is one such approach, but it relies heavily on good feature extraction and document representation as well as a good clustering algorithm. Because the Internet is constantly changing, the dataset is dynamic, so an incremental approach is preferred. In this work we propose an enhanced incremental clustering approach that can help to better organize the information available on the Internet in an incremental fashion. Experiments show that the enhanced algorithm outperforms the original histogram-based algorithm by up to 7.5%.

Impact and interest:

4 citations in Scopus
3 citations in Web of Science®

Full-text downloads:

429 since deposited on 19 Jan 2010
27 in the past twelve months

ID Code: 29739
Item Type: Chapter in Book, Report or Conference volume (Conference contribution)
ORCID iD: Xu, Yue (orcid.org/0000-0002-1137-0272)
Measurements or Duration: 4 pages
Keywords: Incremental Clustering, Web
DOI: 10.1109/WI-IAT.2009.236
ISBN: 978-0-7695-3801-3
Pure ID: 31895735
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Past > QUT Faculties & Divisions > Science & Engineering Faculty
Current > Research Centres > Australian Research Centre for Aerospace Automation
Copyright Owner: Consult author(s) regarding copyright matters
Copyright Statement: This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to qut.copyright@qut.edu.au
Deposited On: 19 Jan 2010 23:12
Last Modified: 03 Mar 2024 09:55