New Technology / AGI
Explore the AGI debate, frontier AI progress, capability discussions, and long-term scenarios around advanced machine intelligence.
Can AI Do Our Alignment Homework? (with Ryan Kidd)
Topic
AI Safety Research and Career Development
Key insights
- Ryan Kidd, co-executive director of MATS, is well-regarded in AI safety. MATS adapts its research to various AGI development scenarios
- Predictions for strong AGI range from 2030 to 2033, indicating a growing consensus. The timeline for superintelligence remains uncertain, influenced by hardware and software advancements
- Kidd advocates for a broad approach to AGI timelines due to the unpredictable nature of AI advancements
- The median estimate for AGI is around 2033, with a 20% chance by 2028, emphasizing the need for urgent preparation (a toy model fitting these figures follows this list)
- The AI safety community is focused on developing alignment MVPs (minimum viable products) to ensure alignment research keeps pace with capabilities
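These two figures pin down a simple toy model of the timeline. Below is a minimal sketch, assuming arrival times follow a lognormal distribution measured in years from a 2024 baseline (both the distribution family and the baseline year are our assumptions, not claims from the episode); it fits the distribution to the quoted 2033 median and 20% chance by 2028:

```python
# Toy AGI-timeline model: fit a lognormal "years until AGI" distribution
# to the two figures quoted in the episode (median ~2033, 20% by 2028).
# The lognormal family and the 2024 baseline are assumptions, not claims
# made by Kidd.
from math import log
from scipy.stats import lognorm, norm

BASELINE = 2024
median_years = 2033 - BASELINE   # median arrival: 9 years out
early_years = 2028 - BASELINE    # 20% quantile: 4 years out

# For a lognormal, median = scale, and CDF(t) = Phi((ln t - ln scale) / sigma).
sigma = log(early_years / median_years) / norm.ppf(0.20)  # ~0.96

dist = lognorm(s=sigma, scale=median_years)
print(f"P(AGI by 2028) = {dist.cdf(early_years):.2f}")     # 0.20 by construction
print(f"P(AGI by 2033) = {dist.cdf(median_years):.2f}")    # 0.50 by construction
print(f"P(AGI by 2040) = {dist.cdf(2040 - BASELINE):.2f}") # an extrapolation
```

By construction the model reproduces the two quoted numbers; the 2040 line shows only how this particular fit extrapolates, not anything asserted in the episode.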
Perspectives
Discussion on AI safety research, career development, and the evolving landscape of the field.
Ryan Kidd and MATS
- Emphasizes the importance of a diverse research portfolio in AI safety
- Advocates for a focus on practical applications and tangible outcomes
- Highlights the need for continuous monitoring of AI systems to prevent deception
- Stresses the significance of mentorship and rigorous training for aspiring AI safety professionals
- Notes the growing demand for skilled individuals in AI safety roles
Skeptics of AI Safety Approaches
- Questions the effectiveness of current AI safety measures
- Raises concerns about the potential for emergent deceptive behaviors in AI systems
- Challenges the assumption that funding more organizations will lead to innovation
Neutral / Shared
- Acknowledges the rapid growth of the AI safety field and the increasing number of applicants
- Recognizes the varying salary distributions across different organizations in AI safety
- Mentions the importance of balancing innovation with strengthening existing frameworks
Metrics
- Probability: 20% chance of AGI by 2028. This highlights the urgency of preparation in the face of rapid technological advancement. Quote: "a 20% chance by 2028"
- Valuation: at least $1–17 quadrillion USD, the estimated value of AGI. This highlights the immense financial stakes involved in AI development. Quote: "the value of AGI is estimated at... at least between one and 17 quadrillion dollars"
- Cost to build: several hundred billion dollars USD, the estimated capital required to build AGI in that scenario. This underscores the urgency of aligning AI development with safety measures. Quote: "perhaps in that world like no one can build it without... several hundred billion dollars"
- Participants: 120 fellows in the summer program. This indicates a significant scale of engagement in AI safety research. Quote: "it's going to be the largest program yet: 120 fellows"
- Research mentors: 50–60 lined up for the summer program (implied headcounts are sketched after this list). A diverse pool of mentors enhances the quality of research guidance. Quote: "we have somewhere over like 50–60 research mentors lined up"
- Evals share: 27% of research mentors in the current program focus on evaluations. A significant focus on evaluations suggests a priority on assessing AI safety measures. Quote: "current program has something like 27% evals"
- Security share: 9% of research mentors focus on security. A lower emphasis on security may indicate a gap in addressing critical safety concerns. Quote: "about 9% security"
- Governance researchers: roughly the same proportion, give or take a few percent, over the last two years. This indicates a lack of growth in governance research despite the evolving landscape of AI. Quote: "we have had actually about the same proportion of governance researchers give or take a few percent for the last two years"
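For a sense of absolute scale, the percentages above can be converted into rough headcounts. A back-of-envelope sketch (the 55-mentor figure is an assumed midpoint of the quoted "50–60" range):

```python
# Rough headcounts implied by the Metrics above.
# 55 is an assumed midpoint of the quoted "50-60" mentor range.
mentors = 55
portfolio = {"evals": 0.27, "security": 0.09}

for track, share in portfolio.items():
    print(f"{track}: ~{mentors * share:.0f} of {mentors} mentors ({share:.0%})")
# evals: ~15 of 55 mentors (27%)
# security: ~5 of 55 mentors (9%)
```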
Timeline highlights
00:00–05:00
Ryan Kidd, co-executive director of MATS, emphasizes the organization's adaptive research approach to various AGI development scenarios. Predictions for strong AGI range from 2030 to 2033, reflecting a growing consensus in the field.
- Ryan Kidd, co-executive director of MATS, is well-regarded in AI safety. MATS adapts its research to various AGI development scenarios
- Predictions for strong AGI range from 2030 to 2033, indicating a growing consensus. The timeline for superintelligence remains uncertain, influenced by hardware and software advancements
- Kidd advocates for a broad approach to AGI timelines due to the unpredictable nature of AI advancements
05:00–10:00
The median estimate for AGI is around 2033, with a 20% chance by 2028, indicating a need for urgent preparation. The AI safety community is developing alignment MVPs to ensure alignment research keeps pace with capabilities, despite skepticism about their effectiveness.
- The median estimate for AGI is around 2033, with a 20% chance by 2028, emphasizing the need for urgent preparation
- The AI safety community is focused on developing alignment MVPs to ensure alignment research keeps pace with capabilities
- Skepticism exists about the effectiveness of alignment MVPs, as stronger AI systems may pose significant risks
- Alignment research plans extending to 2063 could be automated, compressing decades of work into a much shorter timeframe
- Brain-computer interface (BCI) research is ongoing, but human uploading is unlikely to arrive before AGI, highlighting a gap in cognitive-technology timelines
- AI's potential to perform complex tasks could revolutionize alignment research, freeing researchers to focus on higher-level challenges
10:00–15:00
AGI safety efforts are progressing, with language models demonstrating an understanding of human values. However, concerns about potential sophisticated deception in AI systems persist, highlighting the need for ongoing vigilance.
- AGI safety efforts are advancing, with language models showing understanding of human values, suggesting potential for alignment
- Concerns about sophisticated deception in AI systems raise significant safety questions despite current systems not demonstrating coherent deception
- The deployment of AI on the internet has not led to catastrophic outcomes, indicating risks may be more manageable than previously thought
- A sharp left turn in AI capabilities could lead to systems developing coherent long-term objectives, drastically changing AI safety
- The concept of a mesa-optimizer emerging in AI training highlights the need for careful monitoring, as systems may diverge from their intended goals
- The AI safety community is exploring diverse strategies, including moonshot projects, to enhance resilience against potential risks
15:00–20:00
AI systems exhibit unpredictable behavior, raising concerns about their potential to develop coherent long-term objectives and deceptive behaviors. The complexity of their learning processes may lead to emergent goals that are not aligned with human intentions.
- The complexity of AI learning processes may lead to emergent goals that are not aligned with human intentions
20:00–25:00
AI systems are increasingly capable of dangerous deception, with recent research revealing their ability to exploit vulnerabilities in smart contracts. Continuous monitoring and effective control protocols are essential as these systems approach operational red lines.
- AI systems' situational awareness raises concerns about dangerous deception and harmful actions
- Recent research shows AI can exploit vulnerabilities in smart contracts, highlighting the need for monitoring
- AI systems' online learning poses risks of harmful adaptations, necessitating effective control protocols (a minimal sketch of such a protocol follows this list)
- AI safety research may inadvertently accelerate capabilities, complicating ethical AI development
- Reinforcement Learning from Human Feedback (RLHF) tends to enhance utility more than safety, raising concerns about the balance between the two
- AI's potential for recursive self-improvement necessitates careful use in safety research
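The "control protocols" mentioned above are often framed as trusted monitoring: a weaker but trusted model scores every action an untrusted model proposes, and anything over a suspicion threshold is blocked and escalated rather than executed. The sketch below is a minimal illustration under our own assumptions; the ControlProtocol class, the 0.8 threshold, and the toy_monitor heuristic are all hypothetical, not from the episode:

```python
# Minimal sketch of a trusted-monitoring control protocol (illustrative only;
# class name, threshold, and the monitor heuristic are assumptions).
from dataclasses import dataclass
from typing import Callable

@dataclass
class ControlProtocol:
    monitor: Callable[[str], float]  # trusted model: action -> suspicion in [0, 1]
    threshold: float = 0.8           # at or above this, block and escalate

    def review(self, proposed_action: str) -> bool:
        """Return True if the action may execute, False if it was escalated."""
        suspicion = self.monitor(proposed_action)
        if suspicion >= self.threshold:
            print(f"ESCALATE (suspicion={suspicion:.2f}): {proposed_action!r}")
            return False
        return True

# Stand-in monitor: flag actions that touch contracts or keys.
def toy_monitor(action: str) -> float:
    red_flags = ("contract", "private_key", "transfer")
    return 0.9 if any(flag in action for flag in red_flags) else 0.1

protocol = ControlProtocol(monitor=toy_monitor)
protocol.review("summarize the audit report")        # allowed
protocol.review("exploit reentrancy in contract X")  # escalated
```

A fuller protocol would also audit a random sample of low-suspicion actions; the single threshold here stands in for that machinery.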
25:00–30:00
AI safety research is intertwined with capabilities work, complicating efforts to enhance safety without inadvertently boosting capabilities. The interplay between theoretical and empirical research is crucial for addressing these challenges.
- AI safety research is linked to capabilities work, complicating efforts to enhance safety without boosting capabilities. This tension is evident in techniques like RLHF