Predicting query reformulation during web searching

Jansen, Bernard J., Booth, Danielle L., & Spink, Amanda H. (2009) Predicting query reformulation during web searching. In Olsen, Dan R. Jr., Arthur, Richard B., Hinckley, Ken, Morris, Meredith R., Hudson, Scott, & Greenberg, Saul (Eds.) Proceedings of the 27th Annual CHI Conference on Human Factors in Computing Systems, Association for Computing Machinery Press, Boston, pp. 3907-3912.

This paper reports results from a study in which we automatically classified the query reformulation patterns for 964,780 Web searching sessions (composed of 1,523,072 queries) in order to predict what the next query reformulation would be. We employed an n-gram modeling approach to describe the probability of searchers transitioning from one query reformulation state to another and to predict their next state. We developed first, second, third, and fourth order models and evaluated each model for accuracy of prediction. Findings show that Reformulation and Assistance account for approximately 45 percent of all query reformulations. Searchers seem to seek system searching assistance early in the session or after a content change. The results of our evaluations show that the first and second order models provided the best predictability, between 28 and 40 percent overall, and higher than 70 percent for some patterns. The implication is that the n-gram approach can be used to improve searching systems and searching assistance in real time.
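
The n-gram idea described in the abstract can be illustrated with a minimal sketch: count how often each history of reformulation states is followed by each next state, then predict the most frequent follower. This is not the authors' implementation. The function names `train_ngram` and `predict_next`, the toy sessions, and the state label "New" are assumptions made for illustration; only "Reformulation" and "Assistance" are state names taken from the abstract.

```python
from collections import Counter, defaultdict


def train_ngram(sessions, order=1):
    """Count transitions from each `order`-length state history to the next state."""
    counts = defaultdict(Counter)
    for states in sessions:
        for i in range(order, len(states)):
            history = tuple(states[i - order:i])
            counts[history][states[i]] += 1
    return counts


def predict_next(counts, history):
    """Return (most frequent next state, its relative frequency), or None if unseen."""
    seen = counts.get(tuple(history))
    if not seen:
        return None
    state, n = seen.most_common(1)[0]
    return state, n / sum(seen.values())


# Hypothetical toy sessions; each list is one session's sequence of states.
sessions = [
    ["New", "Reformulation", "Assistance", "Reformulation"],
    ["New", "Assistance", "Reformulation"],
    ["New", "Reformulation", "Assistance"],
]

# order=1 conditions on the previous state only (a first order model).
model = train_ngram(sessions, order=1)
print(predict_next(model, ["New"]))
```

Passing `order=2` conditions the prediction on the two preceding states instead, which corresponds to the second order model mentioned in the abstract; higher orders need more data because each longer history is observed less often.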

Impact and interest:

2 citations in Scopus

ID Code: 28915
Item Type: Conference Paper
Refereed: Yes
Keywords: Stochastic Process, N-grams, Query Reformulation, Web Queries, Web Sessions
DOI: 10.1145/1520340.1520592
ISBN: 9781605582474
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Copyright Owner: Copyright 2009 Association for Computing Machinery Press
Deposited On: 27 Nov 2009 00:45
Last Modified: 29 Feb 2012 14:09
