I’m Jyo Pari, a PhD student at MIT, studying how models can continually learn through advances in architecture and optimization.
▸ Alongside writing educational posts, I’ll also share more refined brainstorming articles as part of an open-science effort to encourage collaboration. If any idea resonates with you and you’d like to explore it further, feel free to reach out!
▸ Also, check out Scale-ML, a student-led MIT organization focused on scaling in deep learning.

Recurrent Networks and Test Time Training (TTT)

Notes on papers that study how recurrent models perform a form of TTT...

Feb 1, 2025 · 23 min · Author: Jyo Pari | Editor: N/A

Towards Self-Editing Models: Part 1

Models that self-update, refine, and grow in complexity over time introduce unique possibilities and challenges....

Jan 19, 2025 · 23 min · Author: Jyo Pari | Editor: N/A

Model Merging

Techniques and challenges in merging multiple machine learning models into a cohesive system...

Mixture of Experts (MoE)

Notes on some papers that study MoEs...

Fall, 2023 · 23 min · Author: Jyo Pari | Editor: Minyoung (Jacob) Huh

Distances Between Subspaces

Grassmann Metric

Symmetries in Neural Networks

Understanding how symmetry in networks can improve optimization...

Discrete Optimal Transport

An exploration of discrete optimal transport methods and their applications in machine learning...

Euler-Lagrange Equation

The Euler-Lagrange equation and its significance in the calculus of variations...

Gumbel Softmax

An overview of the Gumbel-Softmax distribution and its utility in differentiable sampling...

Langevin Sampling

Exploring Langevin dynamics as a method for sampling from complex distributions...

Centered Kernel Alignment

How can we compare representations between networks...