Learning deployable navigation policies at kilometer scale from a single traversal

, , Mirowski, Piotr, Hadsell, Raia, & (2018) Learning deployable navigation policies at kilometer scale from a single traversal. In Dragan, A, Peters, J, Billard, A, & Morimoto, J (Eds.) Proceedings of Machine Learning Research (PMLR), Volume 87: Conference on Robot Learning 2018. Proceedings of Machine Learning Research, http://proceedings.mlr.press/, pp. 346-361.

Published Version (PDF 8MB)
bruce18a.pdf.

Description

Model-free reinforcement learning has recently been shown to be effective at learning navigation policies from complex image input. However, these algorithms tend to require large amounts of interaction with the environment, which can be prohibitively costly to obtain on robots in the real world. We present an approach for efficiently learning goal-directed navigation policies on a mobile robot, from only a single coverage traversal of recorded data. The navigation agent learns an effective policy over a diverse action space in a large heterogeneous environment consisting of more than 2 km of travel, through buildings and outdoor regions that collectively exhibit large variations in visual appearance, self-similarity, and connectivity. We compare pretrained visual encoders that enable precomputation of visual embeddings to achieve a throughput of tens of thousands of transitions per second at training time on a commodity desktop computer, allowing agents to learn from millions of trajectories of experience in a matter of hours. We propose multiple forms of computationally efficient stochastic augmentation to enable the learned policy to generalise beyond these precomputed embeddings, and demonstrate successful deployment of the learned policy on the real robot without fine-tuning, despite environmental appearance differences at test time. The dataset and code required to reproduce these results and apply the technique to other datasets and robots are made publicly available at rl-navigation.github.io/deployable.

Impact and interest:

13 citations in Scopus

Full-text downloads:

47 since deposited on 10 Jan 2019
15 in the past twelve months


ID Code: 124208
Item Type: Chapter in Book, Report or Conference volume (Conference contribution)
ORCID iD:
Suenderhauf, Niko: orcid.org/0000-0001-5286-3789
Milford, Michael: orcid.org/0000-0002-5162-1793
Measurements or Duration: 16 pages
Event Title: Conference on Robot Learning
Event Dates: 2018-10-29 - 2018-10-31
Event Location: UNSPECIFIED
ISBN: 1938-7228
Pure ID: 33312840
Divisions: Past > Institutes > Institute for Future Environments
Past > QUT Faculties & Divisions > Science & Engineering Faculty
Current > Research Centres > ARC Centre of Excellence for Robotic Vision
Copyright Owner: Consult author(s) regarding copyright matters
Copyright Statement: This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to qut.copyright@qut.edu.au
Deposited On: 10 Jan 2019 11:46
Last Modified: 16 May 2026 00:34