Time series analysis of a Web search engine transaction log

Zhang, Ying, Jansen, Bernard J., & Spink, Amanda H. (2009) Time series analysis of a Web search engine transaction log. Information Processing and Management, 45(2), pp. 230-245.

Abstract

In this paper, we use time series analysis to evaluate predictive scenarios using search engine transactional logs. Our goal is to develop models for the analysis of searchers’ behaviors over time and investigate if time series analysis is a valid method for predicting relationships between searcher actions. Time series analysis is a method often used to understand the underlying characteristics of temporal data in order to make forecasts. In this study, we used a Web search engine transactional log and time series analysis to investigate users’ actions. We conducted our analysis in two phases. In the initial phase, we employed a basic analysis and found that 10% of searchers clicked on sponsored links. However, from 22:00 to 24:00, searchers almost exclusively clicked on the organic links, with almost no clicks on sponsored links. In the second and more extensive phase, we used a one-step prediction time series analysis method along with a transfer function method. The period rarely affects navigational and transactional queries, while rates for transactional queries vary during different periods. Our results show that the average length of a searcher session is approximately 2.9 interactions and that this average is consistent across time periods. Most importantly, our findings show that searchers who submit the shortest queries (i.e., in number of terms) click on the highest-ranked results. We discuss implications, including predictive value, and future research.
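To make the idea of one-step prediction concrete, the following is a minimal sketch, not the authors' method. It assumes a first-order autoregressive model, x[t] = c + phi * x[t-1], fitted by ordinary least squares. The paper's keywords indicate ARIMA (Box–Jenkins) and transfer function models, which are more general. The function names and the hourly click counts are hypothetical and serve only as illustration.

```python
# Illustrative sketch only: an AR(1) model stands in for the ARIMA and
# transfer function models named in the record. All data are made up.

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next)
              for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def one_step_forecasts(series, c, phi):
    """Predict each x[t] from the observed x[t-1] (one step ahead)."""
    return [c + phi * prev for prev in series[:-1]]


def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical hourly click counts (not taken from the paper).
hourly_clicks = [120, 132, 128, 141, 150, 147, 158, 163, 160, 171]

c, phi = fit_ar1(hourly_clicks)
predictions = one_step_forecasts(hourly_clicks, c, phi)
mae = mean_absolute_error(hourly_clicks[1:], predictions)
next_hour = c + phi * hourly_clicks[-1]  # forecast for the unseen next period
print(f"c={c:.2f}, phi={phi:.2f}, MAE={mae:.2f}, next={next_hour:.1f}")
```

Each forecast uses only the value observed one period earlier, which is what distinguishes one-step prediction from multi-step forecasting, where predictions are fed back in as inputs.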

Impact and interest:

30 citations in Scopus
16 citations in Web of Science®

ID Code: 28891
Item Type: Journal Article
Refereed: Yes
Keywords: ARIMA, Box–Jenkins model, Search engine, Time series analysis, Transactional log
DOI: 10.1016/j.ipm.2008.07.003
ISSN: 0306-4573
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Deposited On: 25 Nov 2009 23:38
Last Modified: 29 Feb 2012 13:52
