AGI Safety Fundamentals

  • Online & around TUD campus

This program dives deep into the risks posed by advanced artificial intelligence. We talk about current progress in AI, the problems that need to be solved to make sure AI systems are safe, and how to align AI with human values!

We cover questions such as: How can we teach AI to behave ethically? How do we make sure AI follows the intent of its creators? How can you test whether an AI is safe to deploy? What is the state of the art in AI, and how will it progress in the next few years?

Together we will go through the curriculum created by AI alignment researcher Richard Ngo: https://www.agisafetyfundamentals.com/ai-alignment-curriculum. The first seven weeks are split into 1.5 hours of reading about the problem and 1.5 hours of discussing the contents with other interested students. In the remaining four weeks, you get to pick your own mini-project to develop your skills and knowledge in the field.

Week 1: Artificial General Intelligence

Week 2: Reward misspecification and foundation models

Week 3: Goal misgeneralization and instrumental convergence

Week 4: Inverse Reinforcement Learning and Iterated Amplification

Week 5: Debate and unrestricted adversarial training

Week 6: Interpretability

Week 7: Agent foundations, AI governance, and careers in alignment

Weeks 8-11: Your Project

After completing this program, you will have a deep understanding of the area and will be able to apply it to help solve one of the world's most pressing problems!

No programming knowledge is required. However, if you are less familiar with machine learning concepts, you can prepare with Week 0: Introduction to Machine Learning.

Applications are now open!
