Modern AI systems represent a transformative technological advance, but many of their most important behaviors still lack simple organizing principles. In this talk, I will argue that physics offers a useful language for building effective theories of learning: identifying the right observables, isolating minimal solvable models, and understanding which features of data and learning dynamics are universal.
I will first discuss the structure of natural data. I will describe how tools from statistical physics and random matrix theory reveal universal structure in complex datasets. I will then show how diffusion models can be used as probes of hierarchical compositional structure: by partially noising and denoising data, one can expose latent features at different depths, observe a semantic phase transition, and begin to reconstruct the organization of the data itself.

I will next turn to scaling laws and reasoning. I will describe simple models in which test-time inference scaling can be understood through the lens of the effective difficulty of the questions being asked.
Finally, I will discuss how these ideas connect to AI for physics, namely ensuring that AI models are truly learning underlying physical laws rather than pattern-matching to heuristics.
Itay M Bloch