Paper # | Authors | Title
2 | Lin Shi and Bei Peng | Curriculum Learning for Relative Overgeneralization |
3 | Felipe Leno Da Silva, Jiachen Yang, Mikel Landajuela, Andre Goncalves, Alexander Ladd, Daniel Faissol and Brenden Petersen | Toward Multi-Fidelity Reinforcement Learning for Symbolic Optimization |
7 | Armaan Garg and Shashi Shekhar Jha | Autonomous Flood Area Coverage using Decentralized Multi-UAV System with Directed Explorations |
8 | Montaser Mohammedalamen, Dustin Morrill, Alexander Sieusahai, Yash Satsangi and Michael Bowling | Learning to Be Cautious |
9 | Seán Caulfield Curley, Karl Mason and Patrick Mannion | A Classification Based Approach to Identifying and Mitigating Adversarial Behaviours in Deep Reinforcement Learning Agents |
10 | Malek Mechergui and Sarath Sreedharan | Goal Alignment: Re-analyzing Value Alignment Problems Using Human-Aware AI |
12 | Qian Shao, Pradeep Varakantham and Shih-Fen Cheng | Cost Constrained Imitation Learning |
13 | Sebastian Schmid and Andreas Harth | Distributed Fault Detection For Multi-Agent Systems Based On Vertebrate Foraging |
14 | Bruno Rodrigues, Matthias Knorr, Ludwig Krippahl and Ricardo Gonçalves | Towards Explaining Actions of Learning Agents |
15 | Manel Rodriguez-Soto, Roxana Radulescu, Juan Antonio Rodriguez Aguilar, Maite Lopez-Sanchez and Ann Nowe | Multi-objective reinforcement learning for guaranteeing alignment with multiple values |
16 | Maxime Toquebiau, Nicolas Bredeche, Faïz Ben Amar and Jae Yun Jun Kim | Joint Intrinsic Motivation for Coordinated Exploration in Multi-Agent Reinforcement Learning |
17 | Minae Kwon, John Agapiou, Edgar Duéñez-Guzmán, Romuald Elie, Georgios Piliouras, Kalesha Bullard and Ian Gemp | Aligning Local Multiagent Incentives with Global Objectives |
18 | Ridhima Bector, Hang Xu, Abhay Aradhya, Chai Quek and Zinovi Rabinovich | Should Importance of an Attack's Future be Determined by its Current Success? |
19 | Ridhima Bector, Hang Xu, Abhay Aradhya, Chai Quek and Zinovi Rabinovich | Poisoning the Well: Can We Simultaneously Attack a Group of Learning Agents? |
20 | Junlin Lu, Patrick Mannion and Karl Mason | Inferring Preferences from Demonstrations in Multi-objective Reinforcement Learning: A Dynamic Weight-based Approach |
21 | Simon Vanneste, Astrid Vanneste, Tom De Schepper, Siegfried Mercelis, Peter Hellinckx and Kevin Mets | Distributed Critics using Counterfactual Value Decomposition in Multi-Agent Reinforcement Learning |
22 | Philipp Altmann, Thomy Phan, Fabian Ritz, Claudia Linnhoff-Popien and Thomas Gabor | DIRECT: Learning from Sparse and Shifting Rewards using Discriminative Reward Co-Training |
23 | Callum Rhys Tilbury, Filippos Christianos and Stefano V. Albrecht | Revisiting the Gumbel-Softmax in MADDPG |
24 | Louis Bagot, Lynn D'Eer, Steven Latre, Tom De Schepper and Kevin Mets | GPI-Tree Search: Algorithms for Decision-time Planning with the General Policy Improvement Theorem |
25 | Seongmin Kim, Woojun Kim, Jeewon Jeon, Youngchul Sung and Seungyul Han | Off-Policy Multi-Agent Policy Optimization with Multi-Step Counterfactual Advantage Estimation |
26 | Xue Yang, Enda Howley and Michael Schukat | ADT: Agent-based Dynamic Thresholding for Anomaly Detection |
27 | Nicole Orzan, Erman Acar, Davide Grossi and Roxana Rădulescu | Emergent Cooperation and Deception in Public Good Games |
28 | Henrik Müller, Lukas Berg and Daniel Kudenko | Using Incomplete and Incorrect Plans to Shape Reinforcement Learning in Long-Sequence Sparse-Reward Tasks |
29 | Matthew E. Taylor | Reinforcement Learning Requires Human-in-the-Loop Framing and Approaches |
30 | Adam Callaghan, Karl Mason and Patrick Mannion | Evolutionary Strategy guided Reinforcement Learning via MultiBuffer Communication |
31 | Abilmansur Zhumabekov, Daniel May, Tianyu Zhang, Aakash Krishna, Omid Ardakanian and Matthew Taylor | Ensembling Diverse Policies Improves Generalizability of Reinforcement Learning Algorithms in Continuous Control Tasks |
32 | Isuri Perera, Frits de Nijs and Julian Garcia | Learning to cooperate against ensembles of diverse opponents |
33 | Rory Lipkis and Adrian Agogino | Discovery and Analysis of Rare High-Impact Failure Modes Using Adversarial RL-Informed Sampling |
34 | Guanbao Yu, Umer Siddique and Paul Weng | Fair Deep Reinforcement Learning with Generalized Gini Welfare Functions |
35 | Archana Vadakattu, Michelle Blom and Adrian Pearce | Strategy Extraction in Single-agent Games |
37 | Lukas Schäfer, Oliver Slumbers, Stephen McAleer, Yali Du, Stefano Albrecht and David Mguni | Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning |
38 | Robert Loftin, Mustafa Mert Çelikok, Herke Van Hoof, Samuel Kaski and Frans Oliehoek | Uncoupled Learning of Differential Stackelberg Equilibria with Commitments |
39 | Alexandra Cimpean, Pieter Libin, Youri Coppens, Catholijn Jonker and Ann Nowé | Towards Fairness In Reinforcement Learning |
40 | Danila Valko and Daniel Kudenko | Increasing Energy Efficiency of Bitcoin Infrastructure with Reinforcement Learning and One-shot Path Planning for the Lightning Network |
41 | Nicola Mc Donnell, Enda Howley and Jim Duggan | QD(λ) Learning: Towards Multi-agent Reinforcement Learning for Learning Communication Protocols |
42 | Mathieu Reymond, Florent Delgrange, Guillermo A. Pérez and Ann Nowé | WAE-PCN: Wasserstein-autoencoded Pareto Conditioned Networks |
44 | Changxi Zhu, Mehdi Dastani and Shihan Wang | Continuous Communication with Factorized Policy Gradients in Multi-agent Deep Reinforcement Learning |
45 | Jannis Weil, Johannes Czech, Tobias Meuser and Kristian Kersting | Know your Enemy: Investigating Monte-Carlo Tree Search with Opponent Models in Pommerman |
46 | Sotirios Nikoloutsopoulos, Iordanis Koutsopoulos and Michalis Titsias | Personalized Federated Learning with Exact Distributed Stochastic Gradient Descent Updates |
47 | Alain Andres, Lukas Schäfer, Esther Villar-Rodriguez, Stefano Albrecht and Javier Del Ser | Using Offline Data to Speed-up Reinforcement Learning in Procedurally Generated Environments |
48 | Yash Satsangi and Paniz Behboudian | Bandit-Based Policy Invariant Explicit Shaping |
50 | Johan Källström and Fredrik Heintz | Model-Based Multi-Objective Reinforcement Learning with Dynamic Utility Functions |
51 | Dimitris Michailidis, Willem Röpke, Sennay Ghebreab, Diederik M. Roijers and Fernando P. Santos | Fairness in Transport Network Design - A Multi-Objective Reinforcement Learning Approach |
52 | Raphael Avalos, Florent Delgrange, Ann Nowe, Guillermo A. Pérez and Diederik M. Roijers | The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models |
53 | Jacobus Smit and Fernando P. Santos | Learning Fair Cooperation in Systems of Indirect Reciprocity |
54 | Kartik Bharadwaj, Chandrashekar Lakshminarayanan and Balaraman Ravindran | Continuous Tactical Optimism and Pessimism |
55 | Prashank Kadam, Ruiyang Xu and Karl Lieberherr | Accelerating Neural MCTS Algorithms using Neural Sub-Net Structures |
56 | Md. Saiful Islam, Srijita Das, Sai Krishna Gottipati, William Duguay, Cloderic Mars, Jalal Arabneydi, Antoine Fagette, Matthew Guzdial and Matthew E. Taylor | WIP: Human-AI interactions in real-world complex environments using a comprehensive reinforcement learning framework |
57 | Ward Gauderis, Fabian Denoodt, Bram Silue, Pierre Vanvolsem and Andries Rosseau | Efficient Bayesian Ultra-Q Learning for Multi-Agent Games |
59 | Anna Penzkofer, Simon Schaefer, Florian Strohm, Mihai Bâce, Stefan Leutenegger and Andreas Bulling | Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning |
60 | Alex Goodall and Francesco Belardinelli | Approximate Shielding of Atari Agents for Safe Exploration |
62 | Yuan Xue, Megha Khosla and Daniel Kudenko | Regulating Action Value Estimation in Deep Reinforcement Learning |
63 | David Radke and Kyle Tilbury | Learning to Learn Group Alignment: A Self-Tuning Credo Framework with Multiagent Teams |
64 | Yuxuan Li, Qinglin Liu, Nan Lin and Matthew Taylor | Work in Progress: Integrating Human Preference and Human Feedback for Environmentally Adaptable Robotic Learning |
65 | Richard Willis and Michael Luck | Resolving social dilemmas through reward transfer commitments |
67 | Udari Madhushani, Kevin McKee, John Agapiou, Joel Leibo, Richard Everett, Thomas Anthony, Edward Hughes, Karl Tuyls and Edgar Duenez-Guzman | Heterogeneous Social Value Orientation Improves Meaningful Diversity in Various Incentive Structures |
69 | Luis Thomasini, Lucas Alegre, Gabriel De O. Ramos and Ana L. C. Bazzan | RouteChoiceGym: a Route Choice Library for Multiagent Reinforcement Learning |