## Publications

## Preprints/In submission

**SGD and nonlinear systems:** “Stochastic Gradient Descent Learns State Equations with Nonlinear Activations”, to appear at COLT 2019.
*TL;DR:* The state equations of recurrent neural nets have nonlinear activations. We show that these hidden nonlinear dynamics can be learned from data using SGD.
**Fundamentals of overparameterized optimization:** with Mahdi, “Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?”, to appear at ICML 2019.
*TL;DR:* We prove that gradient descent finds the (nearly) closest solution among all global minima. This leads to new insights for many ML problems, including matrix factorization and neural nets.
**Noise robust neural nets:** with Mingchen and Mahdi, “Gradient Descent is Provably Robust to Label Noise for Overparameterized Neural Networks”, preprint, March 2019.

**Representation learning with ReLU:** with Salman, “Exactly decoding a vector through ReLU activation”, to appear at IEEE ICASSP 2019.

**Finding global minima of neural nets:** with Mahdi, “Towards moderate overparameterization: Global convergence guarantees for training neural networks”, preprint, Jan 2019.

**High-dimensional estimation:** with Yahya, “Quickly finding the best linear model in high dimensions”, in submission, April 2019.
**Non-asymptotic system ID:** with Necmiye, “Non-asymptotic Identification of LTI Systems from a Single Trajectory,” preprint, June 2018.
*TL;DR:* We study the problem of learning a linear time-invariant dynamical system from its input-output samples. Given a single input-output trajectory, we provide a finite-time analysis for learning the system's Markov parameters, from which a balanced realization is obtained using the classical Ho-Kalman algorithm.
**Data Enrichment:** with Amir, Arindam, and Kevin, “High Dimensional Data Enrichment: Interpretable, Fast, and Data-Efficient”, preprint, May 2018.
*TL;DR:* We consider groups of observations arising from shared and per-group individual parameters, each with its own structure such as sparsity or group sparsity. We provide data- and computation-efficient estimators to tackle this problem while allowing an arbitrary number of groups.
**End-to-end Deep Learning:** with Mahdi, “End-to-end Learning of a Convolutional Neural Network via Deep Tensor Decomposition,” preprint, May 2018.
*TL;DR:* We propose data-efficient tensor decomposition algorithms for training convolutional neural nets (CNNs). We rigorously establish a connection between low-rank tensors and CNNs and propose a multistage approach that can learn the convolutional kernels of all layers simultaneously.
**Learning with Binned Regression:** with Mehrdad and Jiasi, “Learning Feature Nonlinearities with Non-Convex Regularized Binned Regression,” to appear at ISIT 2019.
*TL;DR:* What should we do when the dependence between feature and response is nonlinear? We propose a principled algorithm for learning additive models based on feature binning. Borrowing techniques from high-dimensional statistics, we show that such models can be learned with linear convergence using a near-optimal amount of data. Our findings naturally highlight the role of model complexity.
## Journals

**Time-Data Tradeoffs:** with Ben and Mahdi, “Sharp Time-Data Tradeoffs for Linear Inverse Problems,” IEEE Transactions on Information Theory, June 2018.
*TL;DR:* We present a unified convergence analysis of the gradient projection algorithm applied to inverse problems. We **very** accurately characterize the convergence rate associated with a wide variety of random measurement ensembles in terms of the amount of data and the structural complexity of the model.
**Sketching sets with RIP matrices:** with Ben and Mahdi, “Isometric sketching of any set via the Restricted Isometry Property,” Information & Inference, March 2018.
*TL;DR:* We show that, for the purposes of dimensionality reduction, a certain class of structured random matrices behaves similarly to random Gaussian matrices. This class includes several matrices for which the matrix-vector multiply can be computed in log-linear time, providing efficient dimensionality reduction of general sets.
**Universal Laws for Sketching:** with Joel, “Universality Laws for Randomized Dimension Reduction, with Applications,” Information & Inference, 2017.
*TL;DR:* The Gaussianity assumption plays a central role in various data science problems. In this work, we show that a large class of random matrices behaves in an identical fashion to Gaussians for various optimization and learning problems.
**Fast nonlinear estimation:** with Mahdi, “Fast and Reliable Parameter Estimation from Nonlinear Observations,” SIAM Journal on Optimization, 2017.
*TL;DR:* We show that it is possible to quickly learn nonlinear problems even if the relation between input and output (i.e. the link function) is not known. The key idea is linearly approximating the unknown nonlinearities.
**Sparse Phase Retrieval:** with Kishore and Babak, “Sparse Phase Retrieval: Uniqueness Guarantees and Recovery Algorithms,” IEEE Transactions on Signal Processing, May 2017.
**Proximal denoising:** with Babak, “Sharp MSE bounds for proximal denoising,” Foundations of Computational Mathematics, Oct 2015.
**Simultaneously structured models:** with Amin, Maryam, Yonina, and Babak, “Simultaneously Structured Models with Application to Sparse and Low-rank Matrices,” IEEE Trans. on Info. Theory, 2015.
*TL;DR:* In several applications, the model of interest has several structures at the same time. Examples include quadratic compressed sensing, sparse PCA, low-rank tensor completion, and estimating sparse and smooth signals (fused lasso). We prove a negative result for multi-objective convex penalizations that aim to learn/estimate such signals. Our results apply under weak assumptions and demonstrate an inherent limitation of intuitive convex relaxations.
## Conferences

**Compact Neural Nets:** “Learning Compact Neural Networks with Regularization”, ICML 2018.
*TL;DR:* Proper regularization is critical for speeding up training, improving generalization performance, and learning cost-efficient models. This work proposes and analyzes regularized gradient descent algorithms for learning shallow networks. We introduce the covering dimension and show that the problem becomes well conditioned and local linear convergence occurs once the amount of data exceeds the covering dimension. We also establish generalization bounds and consider convolutional and sparsity constraints as applications.
**Fast Binary Embedding:** “Near-Optimal Sample Complexity Bounds for Circulant Binary Embedding”, a shorter version appeared at IEEE ICASSP 2017.
*TL;DR:* A common strategy for fast document and image retrieval is to create a binary signature of the data by embedding it into a Hamming cube. Ideally, we want a fast and space-efficient embedding. We show that the subsampled Fourier transform indeed provides an almost-optimal embedding strategy even after {+1, -1} quantization.
**Regularized linear regression:** with Chris and Babak, “Regularized linear regression: A precise analysis of the estimation error”, COLT 2015.
*TL;DR:* We develop a general framework for precisely analyzing the tradeoffs between dataset size, model complexity, and measurement noise. The proposed framework covers sparse and low-rank priors, among others, and the resulting performance bounds exactly coincide with numerical experiments.
**S. Oymak** and B. Hassibi, “The proportional mean decomposition: a bridge between the Gaussian and Bernoulli ensembles”, IEEE ICASSP 2015.

X. Pan, D. Papailiopoulos, **S. Oymak**, B. Recht, K. Ramchandran, and M. I. Jordan, “Parallel Correlation Clustering on Big Graphs,” NeurIPS 2015.
**Robust Graph Clustering:** with Ramya and Babak, “Graph Clustering With Missing Data: Convex Algorithms and Analysis”, NeurIPS 2014.
*TL;DR:* The graph clustering problem arises in community detection, social networking, and complex networks. Given access to noisy and incomplete connectivity information of the graph, can we identify the communities that matter (i.e. clusters)? We develop simple but precise performance bounds for convex optimization based clustering techniques. These bounds reveal intriguing phase transitions as a function of the model parameters.
**S. Oymak** and B. Hassibi, “A Case for Orthogonal Measurements in Linear Inverse Problems,” IEEE ISIT 2014.

C. Thrampoulidis, **S. Oymak**, and B. Hassibi, “Simple Error Bounds for Regularized Noisy Linear Inverse Problems,” IEEE ISIT 2014.

R. Korlakai Vinayak, **S. Oymak**, and B. Hassibi, “Sharp Performance Bounds for Graph Clustering via Convex Optimization,” IEEE ICASSP 2014.
K. Jaganathan, **S. Oymak**, and B. Hassibi, “Sparse Phase Retrieval: Convex Algorithms and Limitations,” IEEE ISIT 2013.

**S. Oymak**, C. Thrampoulidis, and B. Hassibi, “The Squared-Error of Generalized LASSO: A Precise Analysis,” short version published at Allerton 2013.

**S. Oymak**, A. Jalali, M. Fazel, and B. Hassibi, “Noisy estimation of simultaneously structured models: Limitations of convex relaxation,” CDC 2013.
**S. Oymak** and B. Hassibi, “On a relation between the minimax risk and the phase transitions of compressed recovery,” Allerton 2012.

**S. Oymak**, A. Khajehnejad, and B. Hassibi, “Recovery Threshold for Optimal Weight L1 Minimization,” IEEE ISIT 2012.

K. Jaganathan, **S. Oymak**, and B. Hassibi, “On Robust Phase Retrieval for Sparse Signals,” Allerton 2012.

K. Jaganathan, **S. Oymak**, and B. Hassibi, “Recovery of Sparse 1-D Signals from the Magnitudes of their Fourier Transform,” IEEE ISIT 2012.

K. Jaganathan, **S. Oymak**, and B. Hassibi, “Phase Retrieval for Sparse Signals Using Rank Minimization,” IEEE ICASSP 2012.

C. T. Li, **S. Oymak**, and B. Hassibi, “Deterministic Phase Guarantees for Robust Recovery in Incoherent Dictionaries,” IEEE ICASSP 2012.

A. K. Krishnaswamy, **S. Oymak**, and B. Hassibi, “A Simpler Approach to Weighted L1 Minimization,” IEEE ICASSP 2012.
**S. Oymak** and B. Hassibi, “Tight recovery thresholds and robustness analysis for nuclear norm minimization,” IEEE ISIT 2011.

**S. Oymak**, A. Khajehnejad, and B. Hassibi, “Subspace Expanders and Matrix Rank Minimization,” IEEE ISIT 2011.

**S. Oymak**, K. Mohan, M. Fazel, and B. Hassibi, “A Simplified Approach to Recovery Conditions for Low Rank Matrices,” IEEE ISIT 2011.

A. Khajehnejad, **S. Oymak**, and B. Hassibi, “Subspace Expanders and Fast Recovery of Low Rank Matrices,” International Conference on Sampling Theory and Applications, 2011.

**S. Oymak**, A. Khajehnejad, and B. Hassibi, “Improved Thresholds for Rank Minimization,” IEEE ICASSP 2011.

M. Chowdhury, **S. Oymak**, A. Khajehnejad, and B. Hassibi, “Robustness Analysis of a List Decoding Algorithm for Compressed Sensing,” IEEE ICASSP 2011.

**S. Oymak**, A. Khajehnejad, and B. Hassibi, “Weighted Compressed Sensing and Rank Minimization,” IEEE ICASSP 2011.
X. Liu, **S. Oymak**, A. Petropulu, and K. R. Dandekar, “Collision Resolution Based on Pulse Shape Diversity,” Signal Processing Advances in Wireless Communications (SPAWC), 2009.
## Book chapter

C. Thrampoulidis, **S. Oymak**, and B. Hassibi, “Recovering structured signals in noise: least-squares meets compressed sensing,” Compressed Sensing and its Applications, 2015.
## Technical reports

**S. Oymak**, C. Thrampoulidis, and B. Hassibi, “Simple Bounds for Noisy Linear Inverse Problems with Exact Side Information”, 2013.
**S. Oymak** and B. Hassibi, “Finding Dense Clusters via Low Rank + Sparse Decomposition”, 2011.
**S. Oymak** and B. Recht, “Near-optimal bounds for binary embeddings of arbitrary sets”, 2015.
C. Thrampoulidis, **S. Oymak**, and B. Hassibi, “The Gaussian min-max theorem in the presence of convexity”, 2015.
**S. Oymak** and B. Hassibi, “New Null Space Results and Recovery Thresholds for Matrix Rank Minimization,” a short version appeared at IEEE ISIT 2011.
## PhD Thesis

Samet Oymak, “Convex Relaxation for Low-Dimensional Representation: Phase Transitions and Limitations”, California Institute of Technology, 2015.