Daniel Russo

Cited by

	All	Since 2020
Citations	5879	4915
h-index	18	17
i10-index	22	22

Citations per year
Year	Citations
2013	15
2014	30
2015	35
2016	77
2017	142
2018	262
2019	373
2020	611
2021	841
2022	988
2023	1076
2024	1178
2025	217

Public access (based on funding mandates): 2 articles available, 0 articles not available

Columbia University

Verified email at gsb.columbia.edu


Title	Authors	Venue	Cited by	Year
A tutorial on Thompson sampling	D Russo, B Van Roy, A Kazerouni, I Osband, Z Wen	Foundations and Trends in Machine Learning 11 (1), 1-96, 2018	1285	2018
Learning to optimize via posterior sampling	D Russo, B Van Roy	Mathematics of Operations Research 39 (4), 1221-1243, 2014	816	2014
An information-theoretic analysis of Thompson sampling	D Russo, B Van Roy	Journal of Machine Learning Research 17 (68), 1-30, 2016	479	2016
A finite time analysis of temporal difference learning with linear function approximation	J Bhandari, D Russo, R Singal	Operations Research 69 (3), 950-973, 2021	449	2021
How much does your data exploration overfit? Controlling bias via information usage	D Russo, J Zou	IEEE Transactions on Information Theory, 2019	399*	2019
Learning to optimize via information-directed sampling	D Russo, B Van Roy	Operations Research 66 (1), 230-252, 2018	379*	2018
Deep exploration via randomized value functions	I Osband, B Van Roy, DJ Russo, Z Wen	Journal of Machine Learning Research 20 (124), 1-62, 2019	368	2019
Simple Bayesian Algorithms for Best-Arm Identification	D Russo	Operations Research 68 (6), 1625-1647, 2020	359*	2020
Eluder Dimension and the Sample Complexity of Optimistic Exploration	D Russo, B Van Roy	Advances in Neural Information Processing Systems 26, 2256-2264, 2013	300	2013
Global optimality guarantees for policy gradient methods	J Bhandari, D Russo	Operations Research 72 (5), 1906-1927, 2024	294	2024
Improving the expected improvement algorithm	C Qin, D Klabjan, D Russo	Advances in Neural Information Processing Systems, 5382-5392, 2017	182	2017
(More) efficient reinforcement learning via posterior sampling	I Osband, D Russo, B Van Roy	Advances in Neural Information Processing Systems 26, 2013	121	2013
On the linear convergence of policy gradient methods for finite MDPs	J Bhandari, D Russo	International Conference on Artificial Intelligence and Statistics, 2386-2394, 2021	105*	2021
Worst-case regret bounds for exploration via randomized value functions	D Russo	Advances in Neural Information Processing Systems 32, 2019	101	2019
Satisficing in time-sensitive bandit learning	D Russo, B Van Roy	Mathematics of Operations Research 47 (4), 2815-2839, 2022	70*	2022
Adaptive Experimentation in the Presence of Exogenous Nonstationary Variation	C Qin, D Russo	arXiv preprint arXiv:2202.09036, 2022	47*	2022
Impatient Bandits: Optimizing for the Long-Term Without Delay	KW Zhang, T Baldwin-McDonald, K Ciosek, L Maystre, D Russo	arXiv preprint arXiv:2501.07761, 2025	19*	2025
A note on the equivalence of upper confidence bounds and Gittins indices for patient agents	D Russo	Operations Research 69 (1), 273-278, 2021	18	2021
Policy gradient optimization of Thompson sampling policies	S Min, CC Moallemi, DJ Russo	arXiv preprint arXiv:2006.16507, 2020	14	2020
Approximation benefits of policy gradient methods with aggregated states	D Russo	Management Science 69 (11), 6898-6911, 2023	12	2023
