Kwangjun Ahn
Microsoft Research
Verified email at microsoft.com - Homepage
Title · Cited by · Year
Transformers learn to implement preconditioned gradient descent for in-context learning
K Ahn, X Cheng, H Daneshmand, S Sra
Advances in Neural Information Processing Systems (NeurIPS) 36, 2024
156 · 2024
Optimal dimension dependence of the Metropolis-adjusted Langevin algorithm
S Chewi, C Lu, K Ahn, X Cheng, T Le Gouic, P Rigollet
Conference on Learning Theory (COLT), 1260-1300, 2021
85 · 2021
From Nesterov's Estimate Sequence to Riemannian Acceleration
K Ahn, S Sra
Proceedings of Thirty Third Conference on Learning Theory (COLT), PMLR 125 …, 2020
79 · 2020
Understanding the unstable convergence of gradient descent
K Ahn, J Zhang, S Sra
International Conference on Machine Learning, 247-257, 2022
77 · 2022
SGD with shuffling: optimal rates without component convexity and large epoch requirements
K Ahn, C Yun, S Sra
Advances in Neural Information Processing Systems 33, 17526-17535, 2020
77 · 2020
Hypergraph spectral clustering in the weighted stochastic block model
K Ahn, K Lee, C Suh
IEEE Journal of Selected Topics in Signal Processing 12 (5), 959-974, 2018
74 · 2018
Efficient constrained sampling via the mirror-Langevin algorithm
K Ahn, S Chewi
Advances in Neural Information Processing Systems 34, 28405-28418, 2021
70 · 2021
Community recovery in hypergraphs
K Ahn, K Lee, C Suh
IEEE Transactions on Information Theory 65 (10), 6561-6579, 2019
43 · 2019
Learning threshold neurons via edge of stability
K Ahn, S Bubeck, S Chewi, YT Lee, F Suarez, Y Zhang
Advances in Neural Information Processing Systems (NeurIPS) 36, 2024
42 · 2024
Linear attention is (maybe) all you need (to understand transformer optimization)
K Ahn, X Cheng, M Song, C Yun, A Jadbabaie, S Sra
International Conference on Learning Representations (ICLR), 2023
36 · 2023
Binary rating estimation with graph side information
K Ahn, K Lee, H Cha, C Suh
Advances in Neural Information Processing Systems 31, 2018
36 · 2018
Graph Matrices: Norm Bounds and Applications
K Ahn, D Medarametla, A Potechin
arXiv preprint arXiv:1604.03423, 2020
35* · 2020
Mirror descent maximizes generalized margin and can be implemented efficiently
H Sun, K Ahn, C Thrampoulidis, N Azizan
Advances in Neural Information Processing Systems 35, 31089-31101, 2022
22 · 2022
Reproducibility in optimization: Theoretical framework and limits
K Ahn, P Jain, Z Ji, S Kale, P Netrapalli, GI Shamir
Advances in Neural Information Processing Systems 35, 18022-18033, 2022
20 · 2022
Riemannian perspective on matrix factorization
K Ahn, F Suarez
arXiv preprint arXiv:2102.00937, 2021
17 · 2021
Understanding Nesterov's Acceleration via Proximal Point Method
K Ahn, S Sra
Symposium on Simplicity in Algorithms (SOSA), 117-130, 2022
16 · 2022
The crucial role of normalization in sharpness-aware minimization
Y Dai, K Ahn, S Sra
Advances in Neural Information Processing Systems (NeurIPS) 36, 2024
14 · 2024
A Unified Approach to Controlling Implicit Regularization via Mirror Descent
H Sun, K Gatmiry, K Ahn, N Azizan
arXiv preprint arXiv:2306.13853, 2023
10 · 2023
One-pass learning via bridging orthogonal gradient descent and recursive least-squares
Y Min, K Ahn, N Azizan
2022 IEEE 61st Conference on Decision and Control (CDC), 4720-4725, 2022
10 · 2022
Understanding Adam optimizer via online learning of updates: Adam is FTRL in disguise
K Ahn, Z Zhang, Y Kook, Y Dai
International Conference on Machine Learning (ICML), 2024
9 · 2024