NLPX : a natural language query interface for facilitating user-oriented XML-IR

Woodley, Alan Paul (2008) NLPX : a natural language query interface for facilitating user-oriented XML-IR. PhD thesis, Queensland University of Technology.


Most information retrieval (IR) systems respond to users' representation of their information needs (queries) with a ranked list of relevant results, usually text documents. XML documents differ from traditional text documents by explicitly separating structure and content. XML-IR systems aim to exploit this separation by searching and retrieving relevant components of documents (called elements) rather than entire documents, thereby better fulfilling users' information needs. Despite the potential benefit of XML-IR systems, most research in this area has not been centered on the needs of users. In particular, current XML-IR query formation interfaces, namely keywords-only and formal language, are not able to optimally address the needs of users. Keywords-only interfaces are too unsophisticated to fully capture users' complex information needs, which contain both content and structural requirements. In contrast, while formal languages are able to capture users' content and structural requirements, they are too difficult to use, even for experts, and are too closely tied to the physical structure of the collection. This thesis addresses these problems by presenting NLPX, a natural language interface for XML-IR systems. NLPX allows users to enter XML-IR queries in natural language and translates them into a formal language (NEXI) to be processed by existing XML retrieval systems. When evaluated by system testing, NLPX outperformed alternative translation approaches. When tested in a user-based experiment, NLPX performed comparably to a query-by-template interface, the baseline user-oriented interface for formulating structured queries. It is hoped that the outcomes of this thesis will help to refocus the field of XML-IR around the user. This will lead to the development of more useful XML-IR systems, which will hopefully result in the more widespread use of XML-IR systems.

Full-text downloads:

740 since deposited on 03 Dec 2008
33 in the past twelve months

ID Code: 16642
Item Type: QUT Thesis (PhD)
Supervisor: Geva, Shlomo & Nayak, Richi
Keywords: information retrieval systems, information needs, XML-IR systems, NEXI, NLPX
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Past > Schools > School of Software Engineering & Data Communications
Department: Faculty of Information Technology
Institution: Queensland University of Technology
Copyright Owner: Copyright Alan Paul Woodley
Deposited On: 03 Dec 2008 04:07
Last Modified: 11 Aug 2014 22:33
