Tampere University of Technology

TUTCRIS Research Portal

Optimal sensing via multi-armed bandit relaxations in mixed observability domains

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Scientific, peer-review

Details

Original language: English
Title of host publication: 2015 IEEE International Conference on Robotics and Automation (ICRA), 26-30 May 2015, Seattle, WA
Pages: 4807-4812
Number of pages: 6
Volume: 2015-June
DOIs
Publication status: Published - 29 Jun 2015
Publication type: A4 Article in a conference publication
Event: IEEE International Conference on Robotics and Automation


Abstract

Sequential decision making under uncertainty is studied in a mixed observability domain. The goal is to maximize the amount of information obtained on a partially observable stochastic process under constraints imposed by a fully observable internal state. An upper bound for the optimal value function is derived by relaxing constraints. We identify conditions under which the relaxed problem is a multi-armed bandit whose optimal policy is easily computable. The upper bound is applied to prune the search space in the original problem, and the effect on solution quality is assessed via simulation experiments. Empirical results show effective pruning of the search space in a target monitoring domain.
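
The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the paper's model or algorithm: the sensors, rewards, energy costs, and function names below are all hypothetical. The toy assumes that each sensor's information gain is deterministic and non-increasing in the number of times it is used, and that the fully observable internal state is a remaining energy budget. Under those assumptions, dropping the energy constraint leaves a relaxed problem that a greedy rule solves exactly, and its value gives an upper bound for a depth-first branch-and-bound search.

```python
# Hypothetical toy data (not from the paper): REWARDS[i][k] is the information
# gained by the (k+1)-th use of sensor i, non-increasing in k; COST[i] is the
# energy consumed by each use of sensor i.
REWARDS = [[5.0, 3.0, 1.0], [4.0, 2.0, 1.0], [2.0, 2.0, 2.0]]
COST = [3, 2, 1]


def relaxed_bound(pulls, steps_left):
    """Upper bound on future reward, obtained by ignoring the energy constraint.

    Because each sensor's rewards are non-increasing, taking the largest
    remaining rewards is optimal for the relaxed problem.
    """
    remaining = sorted(
        (r for i, seq in enumerate(REWARDS) for r in seq[pulls[i]:]),
        reverse=True,
    )
    return sum(remaining[:steps_left])


def search(energy, horizon, use_bound=True):
    """Depth-first search over sensing sequences.

    Returns (best total reward, number of nodes expanded).
    """
    best = 0.0
    nodes = 0

    def visit(pulls, energy_left, steps_left, value):
        nonlocal best, nodes
        nodes += 1
        best = max(best, value)  # stopping early is always allowed
        if steps_left == 0:
            return
        if use_bound and value + relaxed_bound(pulls, steps_left) <= best:
            return  # no completion of this branch can beat the incumbent
        for i, seq in enumerate(REWARDS):
            if pulls[i] < len(seq) and COST[i] <= energy_left:
                nxt = pulls[:i] + (pulls[i] + 1,) + pulls[i + 1:]
                visit(nxt, energy_left - COST[i], steps_left - 1,
                      value + seq[pulls[i]])

    visit((0,) * len(REWARDS), energy, horizon, 0.0)
    return best, nodes


if __name__ == "__main__":
    for flag in (True, False):
        value, expanded = search(energy=6, horizon=3, use_bound=flag)
        print(f"use_bound={flag}: value={value}, nodes={expanded}")
```

Running the script compares the search with and without the bound. Both runs return the same optimal value, since the bound never underestimates what a branch can still achieve, and the pruned run expands no more nodes than the unpruned one.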
