Jakub Grudzien Kuba
Trust region policy optimisation in multi-agent reinforcement learning
JG Kuba, R Chen, M Wen, Y Wen, F Sun, J Wang, Y Yang
International Conference on Learning Representations 2022, 2021
Multi-agent reinforcement learning is a sequence modeling problem
M Wen, J Kuba, R Lin, W Zhang, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems 35, 16509-16521, 2022
Safe multi-agent reinforcement learning for multi-robot control
S Gu, JG Kuba, Y Chen, Y Du, L Yang, A Knoll, Y Yang
Artificial Intelligence 319, 103905, 2023
IDQL: Implicit Q-learning as an actor-critic method with diffusion policies
P Hansen-Estruch, I Kostrikov, M Janner, JG Kuba, S Levine
arXiv preprint arXiv:2304.10573, 2023
Settling the variance of multi-agent policy gradients
JG Kuba, M Wen, L Meng, H Zhang, D Mguni, J Wang, Y Yang
Advances in Neural Information Processing Systems 34, 13458-13470, 2021
Discovered policy optimisation
C Lu, J Kuba, A Letcher, L Metz, C Schroeder de Witt, J Foerster
Advances in Neural Information Processing Systems 35, 16455-16468, 2022
Heterogeneous-agent mirror learning: A continuum of solutions to cooperative MARL
JG Kuba, X Feng, S Ding, H Dong, J Wang, Y Yang
arXiv preprint arXiv:2208.01682, 2022
Mirror learning: A unifying framework of policy optimisation
J Grudzien, CAS De Witt, J Foerster
International Conference on Machine Learning, 7825-7844, 2022
Understanding value decomposition algorithms in deep cooperative multi-agent reinforcement learning
Z Dou, JG Kuba, Y Yang
arXiv preprint arXiv:2202.04868, 2022
Functional Graphical Models: Structure Enables Offline Data-Driven Optimization
K Grudzien, M Uehara, S Levine, P Abbeel
International Conference on Artificial Intelligence and Statistics, 2449-2457, 2024
Advantage-Conditioned Diffusion: Offline RL via Generalization
JG Kuba, P Abbeel, S Levine