
Hi! I'm a final-year graduate student at MIT EECS and CSAIL. I am very fortunate to be co-advised by Stefanie Jegelka and Jonathan Kelner. My research focuses on deep learning, sampling, and optimization.
In Fall 2023, I was a research intern at Google NYC in the BigML group, working under the wise supervision of Sashank Reddi and Sobhan Miryouseffi, where I explored methods to improve the training time of the BERT model. Last summer I was a research intern at Microsoft in the Foundations of Machine Learning group, exploring the surprising capabilities of small language models for building an embedding system for short stories. At Microsoft, I was fortunate to be advised by Ronen Eldan, Adil Salim, and Yi Zhang.
CV | Google Scholar | gatmiry@mit.edu
Publications (by topic)
- On the Role of Depth and Looping for In-Context Learning with Task Diversity Preprint
- Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning? ICML 2024 [paper]
- Rethinking Invariance in In-context Learning ICML Workshop 2024 [paper]
- Simplicity Bias via Global Convergence of Sharpness Minimization ICML 2024 [paper]
- The Inductive Bias of Flatness Regularization for Deep Matrix Factorization NeurIPS 2023 [paper]
- Adaptive Generalization and Optimization of Three-Layer Neural Networks ICLR 2022 [paper]
- On the Generalization of Learning Algorithms that Do Not Converge NeurIPS 2022 [paper]
- Learning Mixture of Gaussians Using Diffusion Models Preprint [paper]
- What Does Guidance Do? A Fine-grained Analysis in a Simple Setting NeurIPS 2024 [paper]
- Sampling Polytopes with Riemannian HMC: Faster Mixing via the Lewis Weights Barrier COLT 2024 [paper]
- Convergence of the Riemannian Langevin Algorithm To appear in JMLR [paper]
- When does Metropolized Hamiltonian Monte Carlo provably outperform Metropolis-adjusted Langevin algorithm? Preprint [paper]
- Adversarial Online Learning with Temporal Feedback Graphs COLT 2024 [paper]
- Computing Optimal Regularizers for Online Linear Optimization Preprint [paper]
- Projection-Free Online Convex Optimization via Efficient Newton Iterations NeurIPS 2023 [paper]
- Quasi-Newton Steps for Efficient Online Exp-Concave Optimization COLT 2023 [paper]
- Bandit Algorithms for Prophet Inequality and Pandora's Box SODA 2024 [paper]
- EM for Mixture of Linear Regression with Clustered Data AISTATS 2024 [paper]
- Testing Determinantal Point Processes NeurIPS 2020 [paper]
- A Unified Approach to Controlling Implicit Regularization via Mirror Descent JMLR 2023 [paper]
- Optimal algorithms for group distributionally robust optimization and beyond Preprint [paper]
- Information Theoretic Bounds on Optimal Worst-case Error in Binary Mixture Identification Preprint [paper]
- The Network Visibility Problem TOIS 2021 [paper]