Document clustering algorithms, representations and evaluation for information retrieval

De Vries, Christopher M. (2014) Document clustering algorithms, representations and evaluation for information retrieval. PhD by Publication, Queensland University of Technology.

Abstract

This thesis presents new methods for classification and thematic grouping of billions of web pages, at scales not previously achievable. This process is also known as document clustering, where similar documents are automatically associated with clusters that represent distinct topics. These automatically discovered topics are in turn used to improve search engine performance by searching only the topics deemed relevant to a particular user query.

Full-text downloads:

536 since deposited on 18 Sep 2014
191 in the past twelve months

ID Code: 75862
Item Type: QUT Thesis (PhD by Publication)
Supervisor: Geva, Shlomo & Trotman, Andrew
Keywords: document clustering, representations, evaluation, information retrieval, algorithms, clustering, hashing, signatures, efficiency, machine learning
Divisions: Current > Schools > School of Electrical Engineering & Computer Science; Current > QUT Faculties and Divisions > Science & Engineering Faculty
Institution: Queensland University of Technology
Deposited On: 18 Sep 2014 23:39
Last Modified: 03 Sep 2015 06:00
