Many real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we show that addressing both problems simultaneously is simpler and more efficient in many cases. More specifically, we make the initiation set of options conditional on the previously-executed option, and show that options with such Option-Observation Initiation Sets (OOIs) are at least as expressive as Finite State Controllers (FSCs), a state-of-the-art approach for learning in POMDPs. OOIs are easy to design based on an intuitive description of the task, lead to explainable policies and keep the top-level and option policies memoryless. Our experiments show that OOIs allow agents to learn optimal policies in challenging POMDPs, while being much more sample-efficient than a recurrent neural network over options.
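The core mechanism described above, initiation sets conditioned on the previously-executed option, can be illustrated with a minimal sketch. This is not the authors' implementation; the class names, the `initiation_options` field, and the example option names are illustrative assumptions only.

```python
# Minimal sketch (assumed structure, not the paper's code) of options
# whose initiation sets depend on the previously-executed option (OOIs).

class Option:
    def __init__(self, name, initiation_options):
        self.name = name
        # OOI: set of previously-executed options after which this
        # option may be initiated (None = start of the episode).
        self.initiation_options = initiation_options

    def can_initiate(self, previous_option):
        return previous_option in self.initiation_options

def select_option(options, previous_option):
    # Memoryless top-level policy restricted to options whose OOI
    # admits the previously-executed option; here it simply picks
    # the first admissible option.
    admissible = [o for o in options if o.can_initiate(previous_option)]
    return admissible[0] if admissible else None

# Hypothetical example: "go_right" may only start after "go_left",
# encoding one bit of memory without any recurrent state.
go_left = Option("go_left", initiation_options={None, "go_right"})
go_right = Option("go_right", initiation_options={"go_left"})

first = select_option([go_left, go_right], None)        # go_left
second = select_option([go_left, go_right], first.name)  # go_right
```

Because the only "memory" lives in which option ran last, both the top-level policy and the option policies stay memoryless, which is how OOIs can emulate the state transitions of a Finite State Controller.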
Original language: English
Title of host publication: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Number of pages: 8
ISBN (Electronic): 9781577358008
ISBN (Print): 978-1-57735-800-8
Publication status: Published - 4 Feb 2018
Event: Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018) - Hilton Riverside Hotel, New Orleans, United States
Duration: 2 Feb 2018 - 7 Feb 2018
Conference number: 32

Publication series

Name: AAAI Conference on Artificial Intelligence
Publisher: Association for the Advancement of Artificial Intelligence
ISSN (Print): 2159-5399
ISSN (Electronic): 2374-3468


Conference: Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018)
Abbreviated title: AAAI
Country: United States
City: New Orleans

ID: 36288049