Alignment Research Fellowship (ARF)

As AI systems become increasingly capable, there's a growing need for technical talent in AI safety research. Many aspiring researchers find themselves asking:

"What's the next step after learning about AI safety fundamentals?"

This program, based on the ARENA (Alignment Research Engineer Accelerator) curriculum, provides structured training in technical AI safety through hands-on implementation of key machine learning concepts and alignment research techniques.

Program Details:

We are excited to announce our online technical upskilling pilot program, run in collaboration with the European Network for AI Safety (ENAIS) and AI Safety Hungary. You'll master key technical concepts in AI safety through structured co-working sessions and hands-on implementation, with most of your time spent under the guidance of experienced Teaching Assistants (TAs).

The program runs online, starting in the week of the 22nd of March 2025, and includes:

Two 3-hour co-working sessions per week with guidance from experienced TAs
4 hours of structured independent study per week
Small group sessions with a dedicated TA
Ongoing support via Slack
Total commitment: 10 hours per week for 14 weeks from March to June 2025

The program is completely free to attend. We provide:

Teaching Assistants
Compute credits for projects
Small group collaboration opportunities

Curriculum Structure

Based on the ARENA materials, you'll work through:
Building a transformer from scratch and understanding attention mechanisms
Deep learning fundamentals and PyTorch implementation
Mechanistic interpretability of transformer models
Introduction to Reinforcement Learning and RLHF
LLM evaluations: designing benchmarks and building evaluation frameworks to assess model capabilities

Express your interest!

FAQ

Who should apply?

We're looking for participants who are new to mechanistic interpretability and eager to develop technical skills in AI safety. Applicants should have basic familiarity with:

Python programming
Linear algebra and probability
AI safety concepts (e.g., completion of AI Safety Fundamentals or equivalent)

If you don't meet all of these requirements but are passionate about AI safety and have a technical background, we still encourage you to apply.

What are the dates?

The program starts in the week of the 22nd of March 2025 and runs for 14 weeks.

What should I do if I don't have enough ML or AI Safety experience?

We encourage you to check out our resources page, which lists materials to help you upskill in a range of areas!