Google Scholar link
Itemized publications
Max-Margin Token Selection in Attention Mechanism, NeurIPS spotlight (~3% acceptance), 2023
FedNest: Federated Bilevel, Minimax, and Compositional Optimization, ICML oral (~2% acceptance), 2022
Revisiting Ho-Kalman based system identification: robustness and finite-sample analysis, IEEE TAC 2021
Label-Imbalanced and Group-Sensitive Classification under Overparameterization, NeurIPS 2021
Generalization Guarantees for Neural Architecture Search with Train-Validation Split, ICML 2021
Provable Benefits of Overparameterization in Model Compression: From Double Descent to Pruning Neural Networks, AAAI 2021
Towards moderate overparameterization: global convergence guarantees for training shallow neural networks, IEEE Journal on Selected Areas in Information Theory 2020
Gradient descent with early stopping is provably robust to label noise for overparameterized neural networks, AISTATS 2020
Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?, ICML 2019
Sharp time–data tradeoffs for linear inverse problems, IEEE Transactions on Information Theory 2018
Universality laws for randomized dimension reduction, with applications, Information & Inference 2018
Regularized linear regression: A precise analysis of the estimation error, COLT 2015
Simultaneously structured models with application to sparse and low-rank matrices, IEEE Transactions on Information Theory 2015
The squared-error of generalized LASSO: A precise analysis, Allerton 2013