
AI safety reading group

Weekly discussions of readings on technical and philosophical topics in AI safety.

AI safety is the field trying to figure out how to stop AI systems from breaking the world, and in particular, trying to do so before they break the world. Readings will range from potential issues arising from future advanced AI systems, to technical topics in AI control, to present-day issues.

Seminar information:

Directions for joining discussions:

  1. New to metauni? Follow these instructions (part 2) to join the metauni Discord server, and introduce yourself in the channel #ai-safety.
  2. Metauni talks take place in Roblox using in-game voice chat. Follow these instructions (part 1) to create a Roblox account, complete “age verification” (unfortunately, this involves sharing ID with Roblox), and then enable Roblox “voice chat”.
  3. At the scheduled discussion time, launch the Roblox experience The Rising Sea and then walk over to the discussion area as depicted in this picture (or, just follow the people).
  4. If lost, ask for help in the Discord server, #ai-safety channel.


Completing the weekly reading is recommended. We sometimes briefly summarise the paper, but usually we dive straight into discussing particular credits, concerns, or confusions.

Upcoming readings and discussions:


Past readings and discussions (most recent first):

Topics brainstorm

AI safety is political philosophy complete:

Alex Turner’s work on power-seeking AI:

On modelling tasks with reward functions:

On impact regularisation:

Other topics:

Sources of readings (clearly with much mutual overlap):