Mohammad Ghavamzadeh
Title
Cited by
Year
Natural Actor–critic Algorithms
S Bhatnagar, RS Sutton, M Ghavamzadeh, M Lee
Automatica 45 (11), 2471-2482, 2009
Cited by 664*; year 2009
Bayesian Reinforcement Learning: A Survey
M Ghavamzadeh, S Mannor, J Pineau, A Tamar
Foundations and Trends in Machine Learning 8 (5-6), 359-483, 2015
Cited by 280; year 2015
Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
V Gabillon, M Ghavamzadeh, A Lazaric
Neural Information Processing Systems, 3221-3229, 2012
Cited by 228; year 2012
A Lyapunov-based Approach to Safe Reinforcement Learning
Y Chow, O Nachum, E Duenez-Guzman, M Ghavamzadeh
Neural Information Processing Systems, 8103-8112, 2018
Cited by 211; year 2018
High-confidence Off-policy Evaluation
P Thomas, G Theocharous, M Ghavamzadeh
AAAI, 3000-3006, 2015
Cited by 190; year 2015
Risk-constrained Reinforcement Learning with Percentile Risk Criteria
Y Chow, M Ghavamzadeh, L Janson, M Pavone
Journal of Machine Learning Research (JMLR) 18, 6070-6120, 2017
Cited by 179; year 2017
Regularized Policy Iteration
AM Farahmand, M Ghavamzadeh, C Szepesvári, S Mannor
Neural Information Processing Systems, 441-448, 2008
Cited by 162; year 2008
Hierarchical Multi-agent Reinforcement Learning
R Makar, S Mahadevan, M Ghavamzadeh
International Conference on Autonomous Agents, 246-253, 2001
Cited by 159; year 2001
Hierarchical Multi-agent Reinforcement Learning
M Ghavamzadeh, S Mahadevan, R Makar
Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS) 13 (2), 197-229, 2006
Cited by 153; year 2006
High Confidence Policy Improvement
P Thomas, G Theocharous, M Ghavamzadeh
ICML, 2380-2388, 2015
Cited by 151; year 2015
Supervised actor-critic reinforcement learning
MT Rosenstein, AG Barto, J Si, A Barto, W Powell, D Wunsch
Learning and Approximate Dynamic Programming: Scaling Up to the Real World …, 2004
Cited by 150; year 2004
Speedy Q-learning
M Ghavamzadeh, H Kappen, M Azar, R Munos
Neural Information Processing Systems 24, 2411-2419, 2011
Cited by 142*; year 2011
More Robust Doubly Robust Off-policy Evaluation
M Farajtabar, Y Chow, M Ghavamzadeh
ICML, 1447-1456, 2018
Cited by 141; year 2018
Finite-Sample Analysis of Proximal Gradient TD Algorithms
B Liu, J Liu, M Ghavamzadeh, S Mahadevan, M Petrik
UAI, 504-513, 2015
Cited by 126*; year 2015
Personalized Ad Recommendation Systems for Life-time Value Optimization with Guarantees
G Theocharous, PS Thomas, M Ghavamzadeh
IJCAI, 1806-1812, 2015
Cited by 117; year 2015
Algorithms for CVaR Optimization in MDPs
Y Chow, M Ghavamzadeh
Advances in Neural Information Processing Systems, 3509-3517, 2014
Cited by 117; year 2014
Bayesian Multi-task Reinforcement Learning
A Lazaric, M Ghavamzadeh
ICML, 599-606, 2010
Cited by 110; year 2010
Safe policy improvement by minimizing robust baseline regret
M Ghavamzadeh, M Petrik, Y Chow
Advances in Neural Information Processing Systems 29, 2298-2306, 2016
Cited by 106; year 2016
Finite-sample Analysis of Least-squares Policy Iteration
A Lazaric, M Ghavamzadeh, R Munos
Journal of Machine Learning Research (JMLR) 13, 3041-3074, 2012
Cited by 97; year 2012
Multi-bandit Best Arm Identification
V Gabillon, M Ghavamzadeh, A Lazaric, S Bubeck
Advances in Neural Information Processing Systems, 2222-2230, 2011
Cited by 97; year 2011