Ziang Song
Verified email at stanford.edu
Title
Cited by
Year
When can we learn general-sum Markov games with a large number of players sample-efficiently?
Z Song, S Mei, Y Bai
arXiv preprint arXiv:2110.04184, 2021
Cited by: 86; Year: 2021
Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent
Y Bai, C Jin, S Mei, Z Song, T Yu
Advances in Neural Information Processing Systems 35, 22313-22325, 2022
Cited by: 13; Year: 2022
Reward collapse in aligning large language models
Z Song, T Cai, JD Lee, WJ Su
arXiv preprint arXiv:2305.17608, 2023
Cited by: 12; Year: 2023
Sample-efficient learning of correlated equilibria in extensive-form games
Z Song, S Mei, Y Bai
Advances in Neural Information Processing Systems 35, 4099-4110, 2022
Cited by: 11; Year: 2022
Reward Collapse in Aligning Large Language Models: A Prompt-Aware Approach to Preference Rankings
Z Song, T Cai, JD Lee, WJ Su
ICML 2023 Workshop The Many Facets of Preference-Based Learning, 2023
Cited by: 1; Year: 2023