AI safety reading group
Weekly discussions of readings on technical and philosophical topics in AI safety.
AI safety is the field trying to figure out how to stop AI systems from breaking the world, and, in particular, trying to do so before they break the world. Readings will span from potential issues arising from future advanced AI systems, to technical topics in AI control, to present-day issues.
- Organisers: Matthew Farrugia-Roberts and Dan Murfet.
- Time: Thursday evenings, 9pm AEST, most weeks (see home page for most up-to-date schedule).
- Venue: The Rising Sea.
Directions for joining discussions:
- New to metauni? Follow these instructions (part 2) to join the metauni Discord server, and introduce yourself in the channel.
- Metauni talks take place in Roblox using in-game voice chat. Follow these instructions (part 1) to create a Roblox account, complete “age verification” (unfortunately, this involves sharing ID with Roblox), and then enable Roblox “voice chat”.
- At the scheduled discussion time, launch the Roblox experience The Rising Sea and then step into matomatical’s portal (bottom-right corner of stack, see picture), or use the menu: “Pockets” > “Go to pocket” > type address “Gemini Pulsar 1”.
Completing weekly readings is recommended, but ultimately optional. The discussion sessions begin with a summary of the reading, led by Matt (unless otherwise noted).
Upcoming readings and discussions:
2022.09.22: break week
2022.09.29: Eliezer Yudkowsky, 2013, “Intelligence explosion microeconomics”, MIRI technical report.
Past readings and discussions:
2022.06.09: Norbert Wiener, 1960, “Some moral and technical consequences of automation”, Science.
2022.06.16: Stephen M. Omohundro, 2008, “The basic AI drives”, Proceedings of the 2008 conference on Artificial General Intelligence.
2022.06.23: Nick Bostrom, 2012, “The superintelligent will: Motivation and instrumental rationality in advanced artificial agents”, Minds and Machines.
2022.06.30: Rachel Thomas and Louisa Bartolo, 2022, “AI harms are societal, not just individual”, fast.ai blog. Discussion led by Dan.
2022.07.21: Tobias Wängberg et al., 2017, “A game-theoretic analysis of the off-switch game”, AGI 2017.
2022.08.04: Scott Garrabrant et al., 2017, “Logical induction”, arXiv. Discussion led by Dan. Note: there is an updated 2020 version on arXiv.
2022.08.25: an original presentation by Matt about compression and learning in models of computation embedded in the real world.
2022.09.08: discussion of reading group direction.
2022.09.15: Nate Soares, 2022, “On how various plans miss the hard bits of the alignment challenge”, LessWrong.
AI safety is political philosophy complete
- Externalities correspond to market alignment failures. How do we handle them? Overcome them? Do we face risk from them? Would these risks be exacerbated by advanced AI?
- How can we live in the midst of complex systems we don’t understand, and can’t fully control, like civilisation, capitalism, etc.?
- What other literatures could help us here?
- AI governance
- Is there literature on technology and society?
- Specific safety proposals
Sources of readings (clearly with much mutual overlap):