Semantic labelling for document feature patterns using ontological subjects

Tao, Xiaohui, Li, Yuefeng, Liu, Bin, & Shen, Yan (2012) Semantic labelling for document feature patterns using ontological subjects. In Zhong, Ning & Gong, Zhiguo (Eds.) 2012 IEEE/WIC/ACM International Conference on Web Intelligence, IEEE Computer Society Conference Publishing Services (CPS), Macau, China, pp. 530-534.

Abstract

Finding and labelling semantic feature patterns of documents in a large, spatial corpus is a challenging problem. Text documents have characteristics that make semantic labelling difficult, and the rapidly increasing volume of online documents creates a bottleneck in finding meaningful textual patterns. To address these issues, we propose an unsupervised document labelling approach based on semantic content and feature patterns. A world ontology with extensive topic coverage is exploited to supply controlled, structured subjects for labelling. An algorithm is also introduced to reduce dimensionality based on the study of ontological structure. The proposed approach was evaluated by comparison with typical machine learning methods including SVMs, Rocchio, and kNN, with promising results.

ID Code: 58294
Item Type: Conference Paper
Refereed: Yes
Keywords: Text classification, Ontology Learning, Feature selection
DOI: 10.1109/WI-IAT.2012.47
ISBN: 9780769548807
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Pattern Recognition and Data Mining (080109)
Divisions: Current > Schools > School of Electrical Engineering & Computer Science
Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2012 IEEE
Deposited On: 15 Mar 2013 00:59
Last Modified: 22 Oct 2013 22:33
