Hi! I'm a final-year graduate student at MIT in EECS and CSAIL. I am very fortunate to be co-advised by Stefanie Jegelka and Jonathan Kelner. My research focuses on deep learning, sampling, and optimization.

In Fall 2023, I was a research intern at Google NYC in the BigML group, working under the wise supervision of Sashank Reddi and Sobhan Miryouseffi, where I explored methods to reduce the training time of the BERT model. Last summer I was a research intern at Microsoft in the Foundation of Machine Learning group, exploring the strange capabilities of small language models for building an embedding system for short stories. At Microsoft, I was fortunate to be advised by Ronen Eldan, Adil Salim, and Yi Zhang.

CV | Google Scholar | gatmiry@mit.edu

Publications (by topic)

    Transformers

  • On the Role of Depth and Looping for In-Context Learning with Task Diversity Khashayar Gatmiry, Zhiyuan Li, Sashank J. Reddi, Stefanie Jegelka Preprint
  • Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning? Khashayar Gatmiry, Nikunj Saunshi, Sashank J Reddi, Stefanie Jegelka, Sanjiv Kumar ICML 2024 [paper]
  • Rethinking Invariance in In-context Learning Lizhe Fang, Yifei Wang, Khashayar Gatmiry, Lei Fang, Yisen Wang ICML Workshop 2024 [paper]
    Implicit Bias

  • Simplicity Bias via Global Convergence of Sharpness Minimization Khashayar Gatmiry, Zhiyuan Li, Sashank J. Reddi, Stefanie Jegelka ICML 2024 [paper]
  • The Inductive Bias of Flatness Regularization for Deep Matrix Factorization Khashayar Gatmiry, Zhiyuan Li, Ching-Yao Chuang, Sashank Reddi, Tengyu Ma, Stefanie Jegelka NeurIPS 2023 [paper]
    Algorithmic Generalization and Stability

  • Adaptive Generalization and Optimization of Three-Layer Neural Networks Khashayar Gatmiry, Stefanie Jegelka, Jonathan Kelner ICLR 2022 [paper]
  • On the generalization of learning algorithms that do not converge Nisha Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka NeurIPS 2022. [paper]
    Diffusion Models

  • Learning Mixture of Gaussians Using Diffusion Models Khashayar Gatmiry, Jonathan Kelner, Holden Lee Preprint [paper]
  • What does guidance do? A fine-grained analysis in a simple setting Muthu Chidambaram, Khashayar Gatmiry, Sitan Chen, Holden Lee, Jianfeng Lu NeurIPS 2024 [paper]
    MCMC Sampling

  • Sampling Polytopes with Riemannian HMC: Faster Mixing via the Lewis Weights Barrier Khashayar Gatmiry, Jonathan Kelner, Santosh S. Vempala COLT 2024. [paper]
  • Convergence of the Riemannian Langevin Algorithm Khashayar Gatmiry, Jonathan Kelner, Santosh S. Vempala To appear in JMLR [paper]
  • When does Metropolized Hamiltonian Monte Carlo provably outperform Metropolis-adjusted Langevin algorithm? Yuansi Chen, Khashayar Gatmiry Preprint [paper]
    Online Learning/Optimization

  • Adversarial Online Learning with Temporal Feedback Graphs Khashayar Gatmiry, Jon Schneider COLT 2024 [paper]
  • Computing Optimal Regularizers for Online Linear Optimization Khashayar Gatmiry, Jon Schneider, Stefanie Jegelka Preprint [paper]
  • Projection-Free Online Convex Optimization via Efficient Newton Iterations Khashayar Gatmiry, Zakaria Mhammedi NeurIPS 2023 [paper]
  • Quasi-Newton Steps for Efficient Online Exp-Concave Optimization Zakaria Mhammedi, Khashayar Gatmiry COLT 2023 [paper]
  • Bandit Algorithms for Prophet Inequality and Pandora's Box Khashayar Gatmiry, Thomas Kesselheim, Sahil Singla, Yifan Wang SODA 2024 [paper]
    Other Testing/Learning

  • EM for Mixture of Linear Regression with Clustered Data Amirhossein Reisizadeh, Khashayar Gatmiry, Asuman Ozdaglar AISTATS 2024 [paper]
  • Testing Determinantal Point Processes Khashayar Gatmiry, Maryam Aliakbarpour, Stefanie Jegelka NeurIPS 2020 [paper]
  • A Unified Approach to Controlling Implicit Regularization via Mirror Descent Haoyuan Sun, Khashayar Gatmiry, Kwangjun Ahn, Navid Azizan JMLR 2023 [paper]
  • Optimal algorithms for group distributionally robust optimization and beyond Tasuku Soma, Khashayar Gatmiry, Stefanie Jegelka Preprint [paper]
  • Information Theoretic Bounds on Optimal Worst-case Error in Binary Mixture Identification Khashayar Gatmiry, Seyed Abolfazl Motahari Preprint [paper]
    Submodularity/Learning

  • The Network Visibility Problem Khashayar Gatmiry, Manuel Gomez-Rodriguez TOIS 2021 [paper]