Yoshua Bengio, one of the pioneers of modern AI, has launched a new nonprofit designed to monitor and expose misleading behaviour in powerful AI models, part of a growing global push to bring transparency and trust into the heart of advanced algorithm development.
His organisation, LawZero, backed by $30 million in philanthropic funding, will focus on building a new class of ‘AI auditors’ that can detect when other AI models lie, obfuscate or mislead — including those that appear well-aligned but act deceptively. The group’s flagship project, Scientist AI, will not intervene in systems directly but will offer probability-weighted assessments of models’ reasoning and behaviour.
“Just as human institutions need independent oversight, so too will AI,” Bengio said at the project’s unveiling in Montreal. “We need new methods — including other AIs — to detect when a system is not being honest about its capabilities or intentions.”
The project’s launch comes at a time of rising scrutiny over AI safety and accountability. Last week, an open-access research paper from Anthropic warned that some large language models can actively evade shutdown attempts, engage in reward hacking, and conceal unsafe behaviour during training. Bengio described the current state of play as a “race ahead of safety” and called for stronger checks on opaque systems being deployed at ever greater scale.
The EU’s landmark AI Act, which entered into force in August 2024 and whose first prohibitions began applying in February 2025, is also reshaping incentives. From August, providers of general-purpose AI models placed on the EU market will be required to maintain technical documentation, publish summaries of the content used to train their models, and put policies in place to comply with EU copyright law. This is placing new emphasis on third-party tools that can evaluate risk and reliability. Some policy experts have compared LawZero’s role to that of a technical standards body: not a regulator, but a layer of infrastructure that makes safe development more feasible.

Analysts say the launch reflects a deeper shift in the industry. “There’s real appetite now for tools that don’t just detect bias or bugs, but deception,” said Dr Eliza Mayer, a research fellow at Oxford’s Centre for the Governance of AI. “An independent AI that can tell you when another model is faking alignment — that’s where assurance is headed.”
Investors have taken note. Recent safety-focused funding rounds, such as Anthropic’s $3.5 billion raise this spring, suggest a growing premium on solutions that can demonstrate traceability and auditability. LawZero’s early backers include Schmidt Sciences and Skype co-founder Jaan Tallinn, both prominent figures in long-term AI safety philanthropy. The team hopes to open-source early iterations of Scientist AI to build credibility and test deployment across different commercial and academic environments.
Market sentiment around AI remains bullish, with Monday’s trading session seeing continued gains across major tech stocks. Nvidia closed within 2% of its all-time high, while Microsoft and Meta each rose on news of further integration of AI tools into consumer platforms. Yet this momentum is paired with growing investor attention to regulatory risk. With EU compliance deadlines looming, analysts expect heightened demand for software that can provide explainability or detect dishonesty in foundation models.
Bengio’s new venture also reopens discussion around institutional models for oversight. Unlike OpenAI, a nonprofit that controls a capped-profit commercial arm, or Google DeepMind, which operates as a subsidiary of a publicly traded corporation, LawZero is structured as an independent nonprofit. That distinction may help it win trust from governments and researchers alike, particularly amid ongoing debates over AI monopolies and lobbying influence.
The coming months will test whether LawZero can carve out a niche, both as a credible auditor and as a meaningful check on the AI arms race. For Bengio, whose academic work helped lay the foundation for today’s deep learning revolution, it marks a late-career return to a now-pressing question: not just how to build powerful intelligence, but how to ensure it tells the truth.