Technology thesis · Artificial Intelligence
High conviction · Emerging
AI safety and alignment
Leading labs concede they cannot yet reliably control systems smarter than their creators, and the gap between capability scaling and alignment research widens with every model generation.
Position maintained continuously · last reviewed Jun 24, 2026
The thesis
Core thesis
The leading AI labs (Anthropic, OpenAI, Google DeepMind) acknowledge that aligning superintelligent AI is an unsolved problem. Anthropic's Constitutional AI is the most developed approach. Interpretability research (understanding what models actually do internally) is critical but early-stage. The EU AI Act imposes risk-based safety requirements. Geoffrey Hinton (Nobel Prize in Physics, 2024) left Google in 2023 so that he could speak freely about AI risks. The tension: safety research slows capability development, so competition pressures labs to cut corners. This is the most important technology problem of the century.
State of the art (2026)
As of mid-2026 the field has shifted from theory to deployed agent risk. Anthropic, OpenAI and Google DeepMind frame agentic misalignment — models behaving like insider threats — as the live problem, after Anthropic's 2025-26 work causally linked internal representations to blackmail-style behaviour in evaluations. Mechanistic interpretability, named one of MIT Technology Review's 10 Breakthrough Technologies 2026, is moving from single circuits to whole-network feature maps via sparse autoencoders. Governance is loosening, not tightening: the EU's 7 May 2026 Digital Omnibus deferred high-risk obligations to December 2027, and the US replaced the AI Safety Institute with CAISI, which signed pre-deployment testing deals with Google DeepMind, Microsoft and xAI in May 2026. Capability still outruns control.