Standard

Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems. / Bargiacchi, Eugenio; Verstraeten, Timothy; Roijers, Diederik; Nowe, Ann; van Hasselt, Hado.

35th International Conference on Machine Learning, ICML 2018. ed. / Jennifer Dy; Andreas Krause. Vol. 2 2018. p. 810-818.

Research output: Chapter in Book/Report/Conference proceeding › Conference paper

Harvard

Bargiacchi, E, Verstraeten, T, Roijers, D, Nowe, A & van Hasselt, H 2018, Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems. in J Dy & A Krause (eds), 35th International Conference on Machine Learning, ICML 2018. vol. 2, pp. 810-818, International Conference on Machine Learning 2018, Stockholm, Sweden, 10/07/18.

APA

Bargiacchi, E., Verstraeten, T., Roijers, D., Nowe, A., & van Hasselt, H. (2018). Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems. In J. Dy, & A. Krause (Eds.), 35th International Conference on Machine Learning, ICML 2018 (Vol. 2, pp. 810-818).

Vancouver

Bargiacchi E, Verstraeten T, Roijers D, Nowe A, van Hasselt H. Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems. In Dy J, Krause A, editors, 35th International Conference on Machine Learning, ICML 2018. Vol. 2. 2018. p. 810-818.

Author

Bargiacchi, Eugenio ; Verstraeten, Timothy ; Roijers, Diederik ; Nowe, Ann ; van Hasselt, Hado. / Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems. 35th International Conference on Machine Learning, ICML 2018. editor / Jennifer Dy ; Andreas Krause. Vol. 2 2018. pp. 810-818

BibTeX

@inproceedings{f4e3e29163004c2aa5b4101d65ef04b3,
title = "Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems",
abstract = "Learning to coordinate between multiple agents is an important problem in many reinforcement learning problems. Key to learning to coordinate is exploiting loose couplings, i.e., conditional independences between agents. In this paper we study learning in repeated fully cooperative games, multi-agent multi-armed bandits (MAMABs), in which the expected rewards can be expressed as a coordination graph. We propose multi-agent upper confidence exploration (MAUCE), a new algorithm for MAMABs that exploits loose couplings, which enables us to prove a regret bound that is logarithmic in the number of arm pulls and only linear in the number of agents. We empirically compare MAUCE to sparse cooperative Q-learning, and a state-of-the-art combinatorial bandit approach, and show that it performs much better on a variety of settings, including learning control policies for wind farms.",
author = "Eugenio Bargiacchi and Timothy Verstraeten and Diederik Roijers and Ann Nowe and {van Hasselt}, Hado",
year = "2018",
language = "English",
volume = "2",
pages = "810--818",
editor = "Jennifer Dy and Andreas Krause",
booktitle = "35th International Conference on Machine Learning, ICML 2018",

}

RIS

TY - GEN

T1 - Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems

AU - Bargiacchi, Eugenio

AU - Verstraeten, Timothy

AU - Roijers, Diederik

AU - Nowe, Ann

AU - van Hasselt, Hado

PY - 2018

Y1 - 2018

N2 - Learning to coordinate between multiple agents is an important problem in many reinforcement learning settings. Key to learning to coordinate is exploiting loose couplings, i.e., conditional independences between agents. In this paper we study learning in repeated fully cooperative games, multi-agent multi-armed bandits (MAMABs), in which the expected rewards can be expressed as a coordination graph. We propose multi-agent upper confidence exploration (MAUCE), a new algorithm for MAMABs that exploits loose couplings, which enables us to prove a regret bound that is logarithmic in the number of arm pulls and only linear in the number of agents. We empirically compare MAUCE to sparse cooperative Q-learning and a state-of-the-art combinatorial bandit approach, and show that it performs much better in a variety of settings, including learning control policies for wind farms.

AB - Learning to coordinate between multiple agents is an important problem in many reinforcement learning settings. Key to learning to coordinate is exploiting loose couplings, i.e., conditional independences between agents. In this paper we study learning in repeated fully cooperative games, multi-agent multi-armed bandits (MAMABs), in which the expected rewards can be expressed as a coordination graph. We propose multi-agent upper confidence exploration (MAUCE), a new algorithm for MAMABs that exploits loose couplings, which enables us to prove a regret bound that is logarithmic in the number of arm pulls and only linear in the number of agents. We empirically compare MAUCE to sparse cooperative Q-learning and a state-of-the-art combinatorial bandit approach, and show that it performs much better in a variety of settings, including learning control policies for wind farms.

UR - http://www.scopus.com/inward/record.url?scp=85057228542&partnerID=8YFLogxK

M3 - Conference paper

VL - 2

SP - 810

EP - 818

BT - 35th International Conference on Machine Learning, ICML 2018

A2 - Dy, Jennifer

A2 - Krause, Andreas

ER -

ID: 38960157
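
Illustration

The abstract describes MAUCE only at a high level, so the minimal Python sketch below illustrates the general idea of upper-confidence exploration over a coordination graph in a multi-agent multi-armed bandit, not the paper's exact algorithm. The factor scopes, Bernoulli local rewards, the particular exploration bonus, and the brute-force joint maximisation (standing in for a graph-based optimiser such as variable elimination) are all illustrative assumptions.

import itertools
import math
import random

random.seed(0)

n_agents = 3
n_actions = 2  # each agent chooses an arm in {0, 1}

# Coordination graph as local reward factors over small agent subsets.
# Agents 0 and 2 never interact directly: a "loose coupling".
factors = [(0, 1), (1, 2)]

# Hidden ground-truth mean of each local reward (hypothetical values).
local_actions = list(itertools.product(range(n_actions), repeat=2))
true_means = [{la: random.random() for la in local_actions} for _ in factors]

# Per-factor pull counts and running mean estimates.
counts = [dict.fromkeys(local_actions, 0) for _ in factors]
means = [dict.fromkeys(local_actions, 0.0) for _ in factors]

def ucb_score(joint, t):
    """Sum of local mean estimates plus a UCB-style exploration bonus."""
    total = 0.0
    for f, scope in enumerate(factors):
        la = tuple(joint[a] for a in scope)
        n = counts[f][la]
        if n == 0:
            return math.inf  # force every local joint action to be tried once
        total += means[f][la] + math.sqrt(2.0 * math.log(t + 1) / n)
    return total

joint_space = list(itertools.product(range(n_actions), repeat=n_agents))
for t in range(2000):
    # Brute force over the 2^3 joint actions; a real implementation
    # would exploit the graph structure (e.g. via variable elimination)
    # to stay tractable as the number of agents grows.
    joint = max(joint_space, key=lambda ja: ucb_score(ja, t))
    # Observe one Bernoulli sample per local factor and update statistics.
    for f, scope in enumerate(factors):
        la = tuple(joint[a] for a in scope)
        reward = 1.0 if random.random() < true_means[f][la] else 0.0
        counts[f][la] += 1
        means[f][la] += (reward - means[f][la]) / counts[f][la]

best = max(joint_space,
           key=lambda ja: sum(m[tuple(ja[a] for a in s)]
                              for m, s in zip(true_means, factors)))
print("chosen on final round:", joint, "true optimum:", best)

On this toy problem the UCB-style rule concentrates pulls on the best joint action; the logarithmic regret bound stated in the abstract is a property of MAUCE itself and is not claimed for this simplified bonus.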