This is the homepage of a seminar on Singular Learning Theory (SLT), a theory founded by Sumio Watanabe that applies algebraic geometry to statistical learning theory. The seminar takes place at metauni. For applications of SLT to alignment, see the SLT Alignment Plan and the 2023 workshop.

The canonical references are Watanabe’s two textbooks:

- **The gray book:** S. Watanabe, “Algebraic geometry and statistical learning theory”, 2009.
- **The green book:** S. Watanabe, “Mathematical theory of Bayesian statistics”, 2018.

Some other introductory references:

- Liam Carroll’s sequence: Distilling Singular Learning Theory.
- Jesse Hoogland’s blog post: general intro to SLT.
- Matt Farrugia-Roberts’ MSc thesis, October 2022, Structural Degeneracy in Neural Networks.
- Spencer Wong’s MSc thesis, May 2022, From Analytic to Algebraic: The Algebraic Geometry of Two Layer Neural Networks.
- Liam Carroll’s MSc thesis, October 2021, Phase transitions in neural networks.
- Tom Waring’s MSc thesis, October 2021, Geometric Perspectives on Program Synthesis and Semantics.
- S. Wei, D. Murfet, M. Gong, H. Li, J. Gell-Redman, T. Quella, “Deep learning is singular, and that’s good”, 2022.
- Edmund Lau’s blog Probably Singular.
- Shaowei Lin’s PhD thesis, 2011, Algebraic Methods for Evaluating Integrals in Bayesian Statistics.

## S2 2023

- **20-7-23** (*Dan Murfet*): The Research Agenda (video)
- **27-7-23** (*Dan Murfet*): Intro to Developmental Biology (video)
- **10-8-23** (*Ben Gerraty*): Hamiltonian Monte Carlo and the SLT Hamiltonian
- **17-8-23** (*Dan Murfet*): In-context learning and implicit Bayesian inference (paper, video)
- **24-8-23** (*Jesse Hoogland*): “Saddle-to-saddle dynamics in deep linear networks” A. Jacot et al 2021.
- **31-8-23** (*Edmund Lau*): Quantifying degeneracy in singular models via the learning coefficient
- **6-9-23** (*Dan Murfet*): Research updates
- **14-9-23** (*Arthur Conmy*): Automated circuit discovery (paper)
- **21-9-23** (*Nisch*): “A mathematical theory of semantic development in deep neural networks” A. Saxe, J. McClelland, S. Ganguli 2019.
- **28-9-23** (*Alok Singh*): “Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit” B. Barak et al 2022.
- **5-10-23** (*Matt Farrugia-Roberts*): “You and your TPU”

Unscheduled:

- Acquisition of chess knowledge in AlphaZero (paper)

## Previous seminars

### S1 2023

- **19-1-23** (*Russell Goyder*): Physical entropy vs information-theoretic entropy Pt 2 (video, pocket, transcript, notes)
- **26-1-23** (*Dan Murfet*): Towards in-context learning in SLT Pt 2 (video, pocket, transcript)
- **2-2-23** (*Dan Murfet*): Towards in-context learning in SLT Pt 3 (video, pocket, transcript)
- **9-2-23** (*Dan Murfet*): Solid state physics and SLT Pt 1 (video, notes, pocket, transcript)
- **16-2-23** (*Dan Murfet*): Solid state physics and SLT Pt 2 (video, notes, pocket, transcript)
- **23-2-23** (*Edmund Lau*): Variational Bayesian posterior and learning resolutions (video, pocket, transcript)
- **2-3-23** (*Samuel Jolly*): Toy models of superposition and SLT (video, pocket, transcript)
- **9-3-23** (*Dan Murfet*): SLT and Alignment Pt 1 (video, pocket, transcript)
- **16-3-23** (*Edmund Lau*): Occam’s razor following Balasubramanian (video, pocket, transcript)
- **23-3-23** (*Ben Gerraty*): Toy models of superposition Pt 2 (video, pocket, transcript)
- **30-3-23** (*Dan Murfet*): SLT and Alignment Pt 2 (video, notes, pocket, transcript)
- **6-4-23** (*Rohan Hitchcock*): Induction heads (video, pocket, transcript)
- **13-4-23** (*Eric Michaud*): The Quantization Model of Neural Scaling (paper)
- **20-4-23** (*Zhongtian Chen*): Jet schemes of monomial ideals (video)
- **27-4-23** (*Dan Murfet*): Primer planning session (video, pocket)
- **25-5-23** (*Neel Nanda*): Mechanistic Interpretability (video)

### 2022

Below you can find the seminars for 2022, with videos and pocket links (which take you to the virtual world where the talk took place, with the blackboards just as we left them at the end of the talk).

- **13-1-22** (*Dan Murfet*): What is learning? Singularities and pendulums (video, transcript).
- **13-1-22** (*Edmund Lau*): The Fisher information matrix (video, transcript).
- **20-1-22** (*Edmund Lau*): Fisher information, KL-divergence and singular models (video, transcript).
- **20-1-22** (*Liam Carroll*): Markov Chain Monte Carlo (video, transcript).
- **27-1-22** (*Liam Carroll*): Neural networks and the Bayesian posterior (video, transcript).
- **27-1-22** (*Spencer Wong*): Rings, ideals and the Hilbert basis theorem (video, transcript).
- **3-2-22** (*Spencer Wong*): From analytic to algebraic I (video, transcript).
- **3-2-22** (*Ken Chan*): Resolution of singularities (video, transcript).
- **10-2-22** (*Dan Murfet*): Introduction to density of states (video, notes, transcript).
- **10-2-22** (*Spencer Wong*): Polynomial division (video, transcript).
- **17-2-22** (*Spencer Wong*): From analytic to algebraic II (video, transcript).
- **17-2-22**: Working session 1 (video, transcript).
- **24-2-22** (*Edmund Lau*): Free energy asymptotics (video, transcript).
- **24-2-22**: Working session 2 (video, transcript).
- **3-3-22** (*Spencer Wong*): From analytic to algebraic III (video, transcript).
- **3-3-22**: Working session 3 (video, transcript).
- **10-3-22** (*Tom Waring*): Regularly parametrised models (video, transcript).
- **17-3-22** (*Edmund Lau*): Bounding the partition function (video, transcript).
- **24-3-22** (*Edmund Lau*): The influence of sampling (video, transcript).
- **7-4-22** (*Edmund Lau*): Main Theorem 1 (video, transcript).
- **14-4-22** (*Edmund Lau*): Main Theorem 2 (video, transcript).
- **8-9-22** (*Matt Farrugia-Roberts*): Complexity of rank estimation (video, pocket).
- **15-9-22** (*Matt Farrugia-Roberts*): Piecewise-linear paths in equivalent networks (video, pocket).
- **22-9-22** (*various*): A minimal introduction to the geometry of tanh networks (video, pocket, transcript).
- **29-9-22** (*Dan Murfet*): Information theory I - entropy and KL divergence (video, pocket, transcript).
- **6-10-22** (*Zhongtian Chen*): The Kraft-McMillan theorem (video, pocket, transcript).
- **13-10-22** (*Edmund Lau*): Asymptotic learning curve and renormalizable condition in statistical learning theory (video, pocket, transcript).
- **13-10-22** (*Dan Murfet*): Intro to blowing up (cross-posted from the Abstraction seminar; video, pocket).
- **20-10-22** (*Dan Murfet*): State of scaling laws 2022 (video, pocket, transcript).
- **27-10-22** (*Dan Murfet*): In-context learning (video, pocket, transcript).
- **3-11-22** (*Dan Murfet*): Open problems (video, pocket, transcript).
- **10-11-22** (*Edmund Lau*): Newton diagrams in singular learning theory (video, pocket, transcript).
- **17-11-22** (*Matt Farrugia-Roberts*): Overview of MSc thesis (video, pocket).
- **24-11-22** (*Dan Murfet*): Jet schemes I (video, pocket, transcript).
- **1-12-22** (*Matt Farrugia-Roberts*): Overview of MSc thesis Pt 2 (video, pocket).
- **8-12-22** (*Dan Murfet*): Jet schemes II (video, pocket, transcript).
- **15-12-22** (*Matt Farrugia-Roberts*): Overview of MSc thesis Pt 3 (video, pocket).
- **22-12-22** (*Russell Goyder*): Physical entropy vs information-theoretic entropy (video, pocket, transcript, notes).

## Background

- A. Karpathy on Transformers (on data distribution).

Some rough handwritten notes:

- Deep Learning Theory 1: Why deep learning theory?
- Deep Learning Theory 2: Thermodynamics of Singular Learning Theory
- Deep Learning Theory 3: Phase transitions
- Singular Learning Theory 4: Local RLCT
- Singular Learning Theory 5: Symmetry and RLCT
- Singular Learning Theory 6: Generalisation and Power Laws
- Singular Learning Theory 8: Calculations for feedforward networks
- Singular Learning Theory 12: Density of states
- Singular Learning Theory 13: Asymptotics of the free energy