Standard

Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets. / Steckelmacher, Denis; Roijers, Diederik; Harutyunyan, Anna; Vrancx, Peter; Plisnier, Helene; Nowe, Ann.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence. AAAI Press, 2018. p. 4099-4106 (AAAI Conference on Artificial Intelligence).

Research output: Chapter in Book/Report/Conference proceeding > Conference paper > Research

Harvard

Steckelmacher, D, Roijers, D, Harutyunyan, A, Vrancx, P, Plisnier, H & Nowe, A 2018, Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, AAAI Press, pp. 4099-4106, Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018), New Orleans, United States, 2/02/18.

APA

Steckelmacher, D., Roijers, D., Harutyunyan, A., Vrancx, P., Plisnier, H., & Nowe, A. (2018). Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (pp. 4099-4106). (AAAI Conference on Artificial Intelligence). AAAI Press.

Vancouver

Steckelmacher D, Roijers D, Harutyunyan A, Vrancx P, Plisnier H, Nowe A. Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence. AAAI Press. 2018. p. 4099-4106. (AAAI Conference on Artificial Intelligence).

Author

Steckelmacher, Denis ; Roijers, Diederik ; Harutyunyan, Anna ; Vrancx, Peter ; Plisnier, Helene ; Nowe, Ann. / Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence. AAAI Press, 2018. pp. 4099-4106 (AAAI Conference on Artificial Intelligence).

BibTeX

@inproceedings{aacf5fa5e2dd4fce9b34d78517ae9d11,
title = "Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets",
abstract = "Many real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we show that addressing both problems simultaneously is simpler and more efficient in many cases. More specifically, we make the initiation set of options conditional on the previously-executed option, and show that options with such Option-Observation Initiation Sets (OOIs) are at least as expressive as Finite State Controllers (FSCs), a state-of-the-art approach for learning in POMDPs. OOIs are easy to design based on an intuitive description of the task, lead to explainable policies and keep the top-level and option policies memoryless. Our experiments show that OOIs allow agents to learn optimal policies in challenging POMDPs, while being much more sample-efficient than a recurrent neural network over options.",
author = "Denis Steckelmacher and Diederik Roijers and Anna Harutyunyan and Peter Vrancx and Helene Plisnier and Ann Nowe",
year = "2018",
month = "2",
day = "4",
language = "English",
isbn = "978-1-57735-800-8",
series = "AAAI Conference on Artificial Intelligence",
publisher = "AAAI Press",
pages = "4099--4106",
booktitle = "Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence",

}

RIS

TY - GEN

T1 - Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets

AU - Steckelmacher, Denis

AU - Roijers, Diederik

AU - Harutyunyan, Anna

AU - Vrancx, Peter

AU - Plisnier, Helene

AU - Nowe, Ann

PY - 2018/2/4

Y1 - 2018/2/4

AB - Many real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we show that addressing both problems simultaneously is simpler and more efficient in many cases. More specifically, we make the initiation set of options conditional on the previously-executed option, and show that options with such Option-Observation Initiation Sets (OOIs) are at least as expressive as Finite State Controllers (FSCs), a state-of-the-art approach for learning in POMDPs. OOIs are easy to design based on an intuitive description of the task, lead to explainable policies and keep the top-level and option policies memoryless. Our experiments show that OOIs allow agents to learn optimal policies in challenging POMDPs, while being much more sample-efficient than a recurrent neural network over options.

UR - http://www.scopus.com/inward/record.url?scp=85060466811&partnerID=8YFLogxK

M3 - Conference paper

SN - 978-1-57735-800-8

T3 - AAAI Conference on Artificial Intelligence

SP - 4099

EP - 4106

BT - Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence

PB - AAAI Press

ER -

ID: 36288049