The Learning Component of Dynamic Allocation Indices

Gittins, J. & Wang, Y-G. (1992) The Learning Component of Dynamic Allocation Indices. The Annals of Statistics, 20(3), pp. 1625-1636.


For a multiarmed bandit problem with exponential discounting, the optimal allocation rule is given by a dynamic allocation index defined for each arm on its state space. The index for an arm is equal to the expected immediate reward from the arm, with an upward adjustment reflecting any uncertainty about the prospects of obtaining rewards from the arm and the possibility of resolving that uncertainty by selecting the arm. Thus the learning component of the index is defined to be the difference between the index and the expected immediate reward. For two arms with the same expected immediate reward, the learning component should be larger for the arm whose reward rate is more uncertain. This is shown to be true for arms based on independent samples from a fixed distribution with an unknown parameter in the cases of Bernoulli and normal distributions, and similar results are obtained in other cases.
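
As a rough numerical illustration of the decomposition described in the abstract, the sketch below approximates the index of a Bernoulli arm with a Beta(a, b) posterior and subtracts the expected immediate reward a/(a+b) to obtain the learning component. It is not taken from the paper: it uses the standard calibration idea (bisection on a constant retirement reward) with a truncated look-ahead. The function name `gittins_index_bernoulli`, the discount factor 0.9, the look-ahead depth of 100, and the tolerance are all assumptions chosen for illustration.

```python
def gittins_index_bernoulli(a, b, beta=0.9, depth=100, tol=1e-6):
    """Approximate the index of a Bernoulli arm with a Beta(a, b) posterior.

    Illustrative sketch only: calibrates against a standard arm paying a
    constant reward lam per step, truncating the look-ahead after `depth`
    further observations (depth must be at least 1).
    """

    def continuation_value(lam):
        # Value of pulling the arm once now, then acting optimally, where
        # retiring to the standard arm is worth lam / (1 - beta).
        retire = lam / (1.0 - beta)
        # Terminal layer (assumption): after `depth` pulls, commit forever to
        # whichever of the arm's posterior mean and lam is larger.
        values = [
            max(lam, (a + s) / (a + b + depth)) / (1.0 - beta)
            for s in range(depth + 1)
        ]
        # Backward induction over n = number of pulls made, s = successes.
        for n in range(depth - 1, 0, -1):
            layer = []
            for s in range(n + 1):
                p = (a + s) / (a + b + n)
                pull = p + beta * (p * values[s + 1] + (1.0 - p) * values[s])
                layer.append(max(retire, pull))
            values = layer
        p = a / (a + b)
        return p + beta * (p * values[1] + (1.0 - p) * values[0])

    # The index is the lam at which pulling and retiring are equally good.
    lo, hi = a / (a + b), 1.0
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        if continuation_value(lam) > lam / (1.0 - beta):
            lo = lam  # pulling still beats retiring, so the index exceeds lam
        else:
            hi = lam
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Two arms with the same expected immediate reward (0.5) but different
    # amounts of prior information.
    for a, b in [(1, 1), (10, 10)]:
        index = gittins_index_bernoulli(a, b)
        mean = a / (a + b)
        print(f"Beta({a},{b}): index={index:.4f}, learning component={index - mean:.4f}")
```

Under these assumptions, the Beta(1, 1) arm, whose success probability is more uncertain, should show a larger learning component than the Beta(10, 10) arm, in line with the ordering stated in the abstract. The truncation makes the result an approximation whose error shrinks as `depth` grows.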

ID Code: 90633
Item Type: Journal Article
Refereed: Yes
Additional Information: ISI Document Delivery No.: KA502
Times Cited: 14
Cited Reference Count: 10
Authors: Gittins, J.; Wang, Y.-G.
Publisher: Institute of Mathematical Statistics
Keywords: dynamic allocation index, Gittins index, multiarmed bandit, target processes
DOI: 10.1214/aos/1176348788
ISSN: 0090-5364
Divisions: Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright Institute of Mathematical Statistics
Deposited On: 24 Nov 2015 06:25
Last Modified: 24 Nov 2015 06:25
