Standard

Transfer Reinforcement Learning across Environment Dynamics with Multiple Advisors. / Plisnier, Helene; Steckelmacher, Denis; Roijers, Diederik; Nowe, Ann.

Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019). Vol. 2491, CEUR Workshop Proceedings, 2019. Paper 11. (CEUR Workshop Proceedings; Vol. 2491, No. 11).

Research output: Chapter in Book/Report/Conference proceeding › Conference paper

Harvard

Plisnier, H, Steckelmacher, D, Roijers, D & Nowe, A 2019, 'Transfer Reinforcement Learning across Environment Dynamics with Multiple Advisors', in Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019), CEUR Workshop Proceedings, vol. 2491, paper 11, 31st Benelux Conference on Artificial Intelligence and the 28th Belgian Dutch Conference on Machine Learning, BNAIC/BENELEARN 2019, Brussels, Belgium, 6/11/19.

APA

Plisnier, H., Steckelmacher, D., Roijers, D., & Nowe, A. (2019). Transfer Reinforcement Learning across Environment Dynamics with Multiple Advisors. In Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019) (Vol. 2491, Paper 11) (CEUR Workshop Proceedings; Vol. 2491, No. 11). CEUR Workshop Proceedings.

Vancouver

Plisnier H, Steckelmacher D, Roijers D, Nowe A. Transfer Reinforcement Learning across Environment Dynamics with Multiple Advisors. In Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019). Vol. 2491. CEUR Workshop Proceedings; 2019. Paper 11. (CEUR Workshop Proceedings; No. 11).

Author

Plisnier, Helene ; Steckelmacher, Denis ; Roijers, Diederik ; Nowe, Ann. / Transfer Reinforcement Learning across Environment Dynamics with Multiple Advisors. Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019). Vol. 2491, CEUR Workshop Proceedings, 2019. (CEUR Workshop Proceedings; No. 11).

BibTeX

@inproceedings{49b4f0e24ff6401a8cf39a6504f822e5,
title = "Transfer Reinforcement Learning across Environment Dynamics with Multiple Advisors",
abstract = "Sample-efficiency is crucial in reinforcement learning tasks, especially when a large number of similar yet distinct tasks have to be learned. For example, consider a smart wheelchair learning to exit many differently-furnished offices on a building floor. Sequentially learning each of these tasks from scratch would be highly inefficient. A step towards a satisfying solution is the use of transfer learning: exploiting the knowledge acquired in previous (or source) tasks to tackle new (or target) tasks. Existing work mainly focuses on exploiting only one source policy as an advisor for the fresh agent, even when there are several expert source policies available. However, using only one advisor requires artificial mechanisms to limit its influence in areas where the source task and the target task differ, in order for the advisee not to be misled. In this paper, we present a novel approach to transfer learning in which all available source policies are exploited to help learn several related new tasks. Moreover, our approach is compatible with tasks that differ by their transition functions, which is rarely considered in the transfer reinforcement learning literature. Our in-depth empirical evaluation demonstrates that our approach significantly improves sample-efficiency.",
author = "Helene Plisnier and Denis Steckelmacher and Diederik Roijers and Ann Nowe",
year = "2019",
month = "11",
day = "6",
language = "English",
volume = "2491",
series = "CEUR Workshop Proceedings",
publisher = "CEUR Workshop Proceedings",
number = "11",
booktitle = "Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019)",

}
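
The abstract above describes the multi-advisor idea only at a high level. For intuition, here is a minimal, hypothetical Python sketch of that general pattern: a fresh tabular Q-learner whose behaviour policy mixes its own action distribution with the averaged action distributions of several source-task advisor policies, so that no artificial mechanism is needed to gate a single advisor. Everything here (the MultiAdvisorAgent class, the advice_weight parameter, the toy dynamics and reward) is an assumption for illustration; it is not the algorithm from the paper.

import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 16, 4

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

class MultiAdvisorAgent:
    # Hypothetical sketch: a tabular Q-learner whose behaviour policy mixes
    # in every available advisor policy at once, rather than gating one.
    def __init__(self, advisors, alpha=0.1, gamma=0.95, advice_weight=0.5):
        self.q = np.zeros((N_STATES, N_ACTIONS))
        self.advisors = advisors            # list of (N_STATES, N_ACTIONS) stochastic policies
        self.alpha, self.gamma = alpha, gamma
        self.advice_weight = advice_weight  # share of the action distribution given to advisors

    def act(self, state):
        own = softmax(self.q[state])
        # Average all advisors' action distributions in this state; no
        # hand-crafted mechanism limits any single advisor's influence.
        advice = np.mean([adv[state] for adv in self.advisors], axis=0)
        mix = (1.0 - self.advice_weight) * own + self.advice_weight * advice
        return rng.choice(N_ACTIONS, p=mix)

    def update(self, state, action, reward, next_state):
        # Standard off-policy Q-learning update on the target task.
        td_error = reward + self.gamma * self.q[next_state].max() - self.q[state, action]
        self.q[state, action] += self.alpha * td_error

# Toy usage: two (here random) source policies advise a fresh learner on a
# dummy 16-state chain with made-up dynamics and reward.
advisors = [rng.dirichlet(np.ones(N_ACTIONS), size=N_STATES) for _ in range(2)]
agent = MultiAdvisorAgent(advisors)
state = 0
for _ in range(100):
    action = agent.act(state)
    next_state = (state + 1) % N_STATES if action == 0 else int(rng.integers(N_STATES))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    agent.update(state, action, reward, next_state)
    state = next_state

Because the mixture keeps the learner's own policy in the loop, its influence can simply grow as its Q-values improve; how the actual paper balances advisors against the advisee is not specified in this record.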

RIS

TY - GEN

T1 - Transfer Reinforcement Learning across Environment Dynamics with Multiple Advisors

AU - Plisnier, Helene

AU - Steckelmacher, Denis

AU - Roijers, Diederik

AU - Nowe, Ann

PY - 2019/11/6

Y1 - 2019/11/6

N2 - Sample-efficiency is crucial in reinforcement learning tasks, especially when a large number of similar yet distinct tasks have to be learned. For example, consider a smart wheelchair learning to exit many differently-furnished offices on a building floor. Sequentially learning each of these tasks from scratch would be highly inefficient. A step towards a satisfying solution is the use of transfer learning: exploiting the knowledge acquired in previous (or source) tasks to tackle new (or target) tasks. Existing work mainly focuses on exploiting only one source policy as an advisor for the fresh agent, even when there are several expert source policies available. However, using only one advisor requires artificial mechanisms to limit its influence in areas where the source task and the target task differ, in order for the advisee not to be misled. In this paper, we present a novel approach to transfer learning in which all available source policies are exploited to help learn several related new tasks. Moreover, our approach is compatible with tasks that differ by their transition functions, which is rarely considered in the transfer reinforcement learning literature. Our in-depth empirical evaluation demonstrates that our approach significantly improves sample-efficiency.

AB - Sample-efficiency is crucial in reinforcement learning tasks, especially when a large number of similar yet distinct tasks have to be learned. For example, consider a smart wheelchair learning to exit many differently-furnished offices on a building floor. Sequentially learning each of these tasks from scratch would be highly inefficient. A step towards a satisfying solution is the use of transfer learning: exploiting the knowledge acquired in previous (or source) tasks to tackle new (or target) tasks. Existing work mainly focuses on exploiting only one source policy as an advisor for the fresh agent, even when there are several expert source policies available. However, using only one advisor requires artificial mechanisms to limit its influence in areas where the source task and the target task differ, in order for the advisee not to be misled. In this paper, we present a novel approach to transfer learning in which all available source policies are exploited to help learn several related new tasks. Moreover, our approach is compatible with tasks that differ by their transition functions, which is rarely considered in the transfer reinforcement learning literature. Our in-depth empirical evaluation demonstrates that our approach significantly improves sample-efficiency.

M3 - Conference paper

VL - 2491

T3 - CEUR Workshop Proceedings

BT - Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019)

PB - CEUR Workshop Proceedings

ER -

ID: 49510154