Singular Learning Theory: How phase transitions in the loss landscape can unlock deeper understanding of model dynamics and robustness.
Developmental Interpretability: Advancing and applying tools to provide further insight into model behavior.
Model Biology & Feature Representations: The Linear Representation Hypothesis and feature representations across layers.