AI Safety in 2025: Do We Need a Pivot?

The global AI landscape has just been shaken by the emergence of DeepSeek-R1. The breakthrough has triggered an arms race, with Western AI companies now scrambling to accelerate development at any cost, AI safety be damned. OpenAI, Microsoft, and Google DeepMind, once seen as the vanguard of responsible AI development, are now desperately playing catch-up.

The fallout has been swift. AI safety researchers, the very people tasked with keeping this technology aligned with human values, are abandoning ship. Former OpenAI researcher Jan Leike, who co-led the company’s now-defunct Superalignment team, resigned in May 2024, warning that OpenAI had de-prioritised safety in favour of “shiny products.” More recently, Steven Adler, another safety researcher, left OpenAI and called the company’s pace of AI development “terrifying.”

The underlying message? The AI safety conversation is being drowned out by a technological arms race. Western firms, once vocal about AI alignment, are now fixated on speed and market dominance—reacting not to ethical concerns but to geopolitical competition.

The Global AI Safety Landscape Is a Fractured System

In theory, AI safety alignment is a priority for governments, researchers, and tech leaders. In practice, it has become a scattered, reactive effort, with no global consensus or enforcement mechanisms. Here’s a snapshot of the current landscape:

1. Major AI Safety Players (And Their Growing Irrelevance)

OpenAI, DeepMind, Anthropic, and ARC (the Alignment Research Center) have led the charge on AI safety, but internal turmoil suggests their influence is waning. Anthropic appears to be the most safety-conscious of the group, given its focus on mechanistic interpretability.

OpenAI’s Superalignment initiative was supposed to tackle existential AI risks but was quietly disbanded in mid-2024. OpenAI has since deployed deliberative alignment in its o-series models so that outputs align more closely with human values, but this approach seems shallow.

The "alignment tax"—the idea that safety slows down progress—is now seen as a competitive disadvantage.

2. Key AI Safety Frameworks: Underfunded and Ignored

Reinforcement Learning from Human Feedback (RLHF) was once the gold standard for AI alignment, but it is increasingly seen as insufficient for steering advanced models that show stronger, more independent reasoning, which is a major cause for concern. Cooperative AI and Constitutional AI offer more robust approaches, but their real-world implementation is years behind capabilities research.
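To make the limitation concrete, here is a minimal, illustrative sketch of the reward-modelling step at the heart of RLHF: a Bradley-Terry preference loss that rewards a model for scoring the human-preferred response above the rejected one. The function and scores below are made up for illustration, not any lab’s actual implementation.

```python
import math

def preference_loss(chosen_score: float, rejected_score: float) -> float:
    """Negative log-likelihood that the preferred response outranks the
    rejected one under a Bradley-Terry model: -log(sigmoid(chosen - rejected))."""
    return -math.log(1.0 / (1.0 + math.exp(-(chosen_score - rejected_score))))

# Hypothetical reward-model scores for (chosen, rejected) response pairs.
# In a real pipeline these would come from a neural reward model scoring
# full responses that human labellers have ranked.
pairs = [(2.1, 1.9), (0.3, -1.2), (1.0, 1.4)]

for chosen, rejected in pairs:
    print(f"chosen={chosen:+.1f}  rejected={rejected:+.1f}  "
          f"loss={preference_loss(chosen, rejected):.3f}")
```

The alignment signal here is only as good as the human rankings behind it, which is exactly why critics argue RLHF struggles once models reason in ways labellers cannot easily evaluate.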

3. Government and International Efforts: Too Slow to Matter

The EU AI Act is the most comprehensive attempt to regulate AI, yet its provisions on high-risk AI systems won’t take effect until at least 2026. Another issue is that it applies only within the EU, while the companies competing for the top spot fall outside its jurisdiction.

The US AI Executive Orders were largely voluntary, and companies could opt out of meaningful safety measures in the name of “innovation.” Since Trump returned to office, the executive order has been rescinded, doubling down on a pro-innovation approach.

China’s AI regulations, despite their strictness, are primarily focused on political control rather than alignment with human values.

The UK’s AI Opportunities Action Plan outlined a “pro-innovation” approach, relying on existing frameworks and relationships to guide its AI development. While this seems passive, achieving any real “AI sovereignty” would demand more active governance.

4. Ethics vs. Market Forces: A Losing Battle

Philosophers and ethicists have spent years debating AI alignment, but the discussion is now being sidelined by raw competition. The fundamental question remains unanswered: Who decides what values AI should align with? We’ve already seen OpenAI’s move away from politically unbiased models.

5. Industry Collaborations and Open-Source AI: A Double-Edged Sword

The Partnership on AI, once a promising multi-stakeholder initiative, has been largely ineffective in curbing the AI race. Open-source AI models, while democratising AI development, risk making unaligned models more accessible to bad actors.

AI Safety is Losing the War

Despite the billions poured into AI safety research, there are three major challenges preventing real progress:

  1. The illusion of control: AI labs like OpenAI and DeepMind once claimed they would slow down if safety risks became too high. That has not happened. Instead, they are ramping up development due to competition from DeepSeek.

  2. Geopolitics and the AI cold war: AI is no longer just about innovation—it’s about power. Governments see AI dominance as a matter of national security, making safety concerns secondary.

  3. The absence of a global AI watchdog: Unlike nuclear weapons, AI has no international oversight body with real enforcement power. Regulators are reacting to AI’s progress, not shaping it.

How Can AI Safety Be Saved?

Despite the bleak outlook, some measures could still prevent AI alignment from becoming an abandoned afterthought:

  • A global AI alignment treaty: Governments must treat AI like nuclear technology, creating binding international agreements on safety standards and deployment controls.

  • Mandatory AI safety research in the West: AI companies should be required to invest a fixed percentage of revenue into independent safety research, with enforceable transparency measures.

  • Real AI development slowdowns: AI labs could set hard limits on training compute for new models until alignment research catches up with capabilities, an approach echoed in Anthropic’s Responsible Scaling Policy.

  • Public awareness and pressure: AI safety must become a mainstream political issue, forcing governments to prioritise oversight over profit-driven acceleration.

The Clock is Ticking

The narrative around AI safety has dramatically shifted. Once seen as a guiding principle, it is now little more than a PR exercise for companies racing to maintain their lead. The DeepSeek panic has shattered the illusion that AI alignment would naturally keep pace with AI capabilities.

We now face a critical choice: Do we continue down a path where AI safety is dictated by market forces, or do we demand a new paradigm—one that prioritises long-term alignment over short-term profit?

The answer to this question may define the future of intelligence—both human and artificial.