Matteo Papini
Title
Cited by
Year
Stochastic variance-reduced policy gradient
M Papini, D Binaghi, G Canonaco, M Pirotta, M Restelli
Proceedings of the 35th International Conference on Machine Learning 80 …, 2018
Cited by: 216; Year: 2018
Feature selection via mutual information: New theoretical insights
M Beraha, AM Metelli, M Papini, A Tirinzoni, M Restelli
2019 International Joint Conference on Neural Networks (IJCNN), 1-9, 2019
Cited by: 123; Year: 2019
Policy optimization via importance sampling
AM Metelli, M Papini, F Faccio, M Restelli
Advances in Neural Information Processing Systems 31, 2018
Cited by: 114; Year: 2018
Risk-averse trust region optimization for reward-volatility reduction
L Bisi, L Sabbioni, E Vittori, M Papini, M Restelli
arXiv preprint arXiv:1912.03193, 2019
Cited by: 74; Year: 2019
Importance sampling techniques for policy optimization
AM Metelli, M Papini, N Montali, M Restelli
Journal of Machine Learning Research 21 (141), 1-75, 2020
Cited by: 59; Year: 2020
Gradient-aware model-based policy search
P D'Oro, AM Metelli, A Tirinzoni, M Papini, M Restelli
Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3801-3808, 2020
Cited by: 48; Year: 2020
Adaptive batch size for safe policy gradients
M Papini, M Pirotta, M Restelli
Advances in Neural Information Processing Systems 30, 2017
Cited by: 48; Year: 2017
Smoothing policies and safe policy gradients
M Papini, M Pirotta, M Restelli
Machine Learning 111 (11), 4081-4137, 2022
Cited by: 45; Year: 2022
Optimistic policy optimization via multiple importance sampling
M Papini, AM Metelli, L Lupo, M Restelli
International Conference on Machine Learning, 4989-4999, 2019
Cited by: 41; Year: 2019
Leveraging good representations in linear contextual bandits
M Papini, A Tirinzoni, M Restelli, A Lazaric, M Pirotta
International Conference on Machine Learning, 8371-8380, 2021
Cited by: 34; Year: 2021
Reinforcement learning in linear MDPs: Constant regret and representation selection
M Papini, A Tirinzoni, A Pacchiano, M Restelli, A Lazaric, M Pirotta
Advances in Neural Information Processing Systems 34, 16371-16383, 2021
Cited by: 23; Year: 2021
Lifting the information ratio: An information-theoretic analysis of Thompson sampling for contextual bandits
G Neu, I Olkhovskaia, M Papini, L Schwartz
Advances in Neural Information Processing Systems 35, 9486-9498, 2022
Cited by: 21; Year: 2022
Balancing learning speed and stability in policy gradient via adaptive exploration
M Papini, A Battistello, M Restelli
International Conference on Artificial Intelligence and Statistics, 1188-1199, 2020
Cited by: 19; Year: 2020
Policy optimization as online learning with mediator feedback
AM Metelli, M Papini, P D'Oro, M Restelli
Proceedings of the AAAI Conference on Artificial Intelligence 35 (10), 8958-8966, 2021
Cited by: 15; Year: 2021
Importance-weighted offline learning done right
G Gabbianelli, G Neu, M Papini
International Conference on Algorithmic Learning Theory, 614-634, 2024
Cited by: 10; Year: 2024
Offline primal-dual reinforcement learning for linear MDPs
G Gabbianelli, G Neu, M Papini, NM Okolo
International Conference on Artificial Intelligence and Statistics, 3169-3177, 2024
Cited by: 9; Year: 2024
Online adversarial MDPs with off-policy feedback and known transitions
F Bacchiocchi, FE Stradi, M Papini, AM Metelli, N Gatti
Sixteenth European Workshop on Reinforcement Learning, 2023
Cited by: 9; Year: 2023
No-regret reinforcement learning in smooth MDPs
D Maran, AM Metelli, M Papini, M Restelli
arXiv preprint arXiv:2402.03792, 2024
Cited by: 7; Year: 2024
Scalable representation learning in linear contextual bandits with constant regret guarantees
A Tirinzoni, M Papini, A Touati, A Lazaric, M Pirotta
Advances in Neural Information Processing Systems 35, 2307-2319, 2022
Cited by: 7; Year: 2022
Projection by convolution: Optimal sample complexity for reinforcement learning in continuous-space MDPs
D Maran, AM Metelli, M Papini, M Restelli
The Thirty Seventh Annual Conference on Learning Theory, 3743-3774, 2024
Cited by: 5; Year: 2024