Learning, planning, and representing knowledge at multiple levels of temporal abstraction give an agent the ability to predict the consequences of different courses of action, which is essential for improving performance in sequential decision making. However, discovering effective temporal abstractions that the agent can use as skills, and then using those abstractions for efficient policy learning, can be challenging. Despite significant advances in single-agent settings, temporal abstraction in multi-agent systems remains underexplored. Our work addresses this research gap by introducing novel algorithms for discovering and employing temporal abstractions in both cooperative and competitive multi-agent environments. We first develop an unsupervised discovery algorithm based on spectral analysis, which aims to find temporal abstractions that enhance the agents' joint exploration of complex, unknown environments in goal-achieving tasks. Subsequently, we propose a variational method that is applicable to a broader range of collaborative multi-agent tasks. This method unifies dynamic grouping with automatic discovery of multi-agent temporal abstractions, and can be integrated into commonly used multi-agent reinforcement learning algorithms. Further, for competitive multi-agent zero-sum games, we develop an algorithm based on Counterfactual Regret Minimization that enables agents to form and use strategic abstractions, akin to routine moves in chess, during strategy learning; the algorithm is supported by both theoretical and empirical analyses. Collectively, these contributions advance the understanding of multi-agent temporal abstractions and provide practical algorithms for intricate multi-agent challenges, including control, planning, and decision-making in complex scenarios.
Speaker: Dr. Jiayu CHEN
Date: 14 October 2024 (Monday)
Time: 9:15am – 10:15am
Biography
Dr. Jiayu CHEN is a postdoctoral fellow at the School of Computer Science, Carnegie Mellon University. He holds a Ph.D. in Industrial Engineering and Operations Research from Purdue University and a Bachelor of Engineering from Peking University. His research focuses on deep learning for sequential decision-making, with the aim of developing robust algorithms that can be deployed in real-world applications such as robotic control and operations management. He has published first-author papers in top venues such as NeurIPS, ICML, ICRA, ICAPS, AAMAS, TMLR, and IEEE TNNLS, and has received awards including the Oracle Research Award and the Purdue Research Grant.