Main topics
Program description
We'll start by exploring what AI systems look like today (foundation models) and what they might look like in the future. We will then investigate fundamental problems in alignment, such as misspecified objectives and objectives that generalize poorly, along with examples of how these can lead to unintended or even catastrophic results.
The next half of the course covers four techniques that attempt to prevent misalignment, along with the limitations of these techniques. This is followed by investigations that attempt to understand machine learning systems at a deeper level, including interpretability and foundation agents.
Finally, we will cover two topics at a high level: AI governance and careers in alignment.