Tampere University of Technology

TUTCRIS Research Portal

Optimal sensing via multi-armed bandit relaxations in mixed observability domains

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


Original language: English
Title of host publication: 2015 IEEE International Conference on Robotics and Automation (ICRA), 26-30 May 2015, Seattle, WA
Number of pages: 6
Publication status: Published - 29 Jun 2015
Publication type: A4 Article in a conference publication
Event: IEEE International Conference on Robotics and Automation
Duration: 1 Jan 1900 → 1 Jan 2000


Conference: IEEE International Conference on Robotics and Automation


Sequential decision making under uncertainty is studied in a mixed observability domain. The goal is to maximize the amount of information obtained on a partially observable stochastic process under constraints imposed by a fully observable internal state. An upper bound for the optimal value function is derived by relaxing constraints. We identify conditions under which the relaxed problem is a multi-armed bandit whose optimal policy is easily computable. The upper bound is applied to prune the search space in the original problem, and the effect on solution quality is assessed via simulation experiments. Empirical results show effective pruning of the search space in a target monitoring domain.
