The ranking based constrained document clustering method and its application to social event detection

Sutanto, Taufik & Nayak, Richi (2014) The ranking based constrained document clustering method and its application to social event detection. Lecture Notes in Computer Science, 8422, pp. 47-60.

View at publisher

Abstract

With the growing size and variety of social media files on the web, it’s becoming critical to efficiently organize them into clusters for further processing. This paper presents a novel scalable constrained document clustering method that harnesses the power of search engines capable of dealing with large text data. Instead of calculating distance between the documents and all of the clusters’ centroids, a neighborhood of best cluster candidates is chosen using a document ranking scheme. To make the method faster and less memory dependable, the in-memory and in-database processing are combined in a semi-incremental manner. This method has been extensively tested in the social event detection application. Empirical analysis shows that the proposed method is efficient both in computation and memory usage while producing notable accuracy.

Impact and interest:

5 citations in Scopus
Search Google Scholar™
2 citations in Web of Science®

Citation counts are sourced monthly from Scopus and Web of Science® citation databases.

These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.

Citations counts from the Google Scholar™ indexing service can be viewed at the linked Google Scholar™ search.

ID Code: 71606
Item Type: Journal Article
Refereed: Yes
Additional Information: Database Systems for Advanced Applications: 19th International Conference, DASFAA 2014, Bali, Indonesia, April 21-24, 2014. Proceedings, Part II. Print ISBN 978-3-319-05812-2
Keywords: constrained clustering, ranking, social event detection
DOI: 10.1007/978-3-319-05813-9_4
ISBN: 9783319058139
ISSN: 0302-9743
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Pattern Recognition and Data Mining (080109)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > INFORMATION SYSTEMS (080600) > Database Management (080604)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > LIBRARY AND INFORMATION STUDIES (080700) > Information Retrieval and Web Search (080704)
Divisions: Current > Schools > School of Electrical Engineering & Computer Science
Past > Institutes > Institute for Creative Industries and Innovation
Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2014 Springer International Publishing Switzerland
Deposited On: 14 May 2014 23:28
Last Modified: 27 Nov 2015 01:05

Export: EndNote | Dublin Core | BibTeX

Repository Staff Only: item control page