‘Godfather’ of AI Yoshua Bengio says latest models lie to users

One of the “godfathers” of artificial intelligence has attacked the multibillion-dollar race to develop the cutting-edge technology, arguing the latest models are displaying dangerous characteristics such as lying to users.

Yoshua Bengio, a Canadian academic whose work has informed techniques used by top AI groups such as OpenAI and Google, said: “There’s unfortunately a very competitive race between the leading labs, which pushes them towards focusing on capability to make the AI more and more intelligent, but not necessarily put enough emphasis and investment on research on safety.”

The Turing Award winner issued his warning in an interview with the Financial Times, while launching a new non-profit called LawZero. He said the group would focus on building safer systems, vowing to “insulate our research from those commercial pressures”.

LawZero has so far raised nearly $30mn in philanthropic contributions from donors including Skype founding engineer Jaan Tallinn, former Google chief Eric Schmidt’s philanthropic initiative, as well as Open Philanthropy and the Future of Life Institute.

Many of Bengio’s funders subscribe to the “effective altruism” movement, whose supporters tend to focus on catastrophic risks surrounding AI models. Critics argue the movement highlights hypothetical scenarios while ignoring current harms, such as bias and inaccuracies.

Bengio said his not-for-profit group was founded in response to growing evidence over the past six months that today’s leading models are developing dangerous capabilities. This includes showing “evidence of deception, cheating, lying and self-preservation”, he said.

Anthropic’s Claude Opus model blackmailed engineers in a fictitious scenario where it was at risk of being replaced by another system. Research from AI testers Palisade last month showed that OpenAI’s o3 model refused explicit instructions to shut down.

Bengio said such incidents were “very scary, because we don’t want to create a competitor to human beings on this planet, especially if they’re smarter than us”.

The AI pioneer added: “Right now, these are controlled experiments [but] my concern is that any time in the future, the next version might be strategically intelligent enough to see us coming from far away and defeat us with deceptions that we don’t anticipate. So I think we’re playing with fire right now.”

The ability for systems to assist in building “extremely dangerous bioweapons” could be a reality as soon as next year, he added.

Based in Montreal, LawZero currently employs 15 people and aims to hire more technical talent to build the next generation of AI systems designed for safety.

Bengio, a professor of computer science at the University of Montreal, will step down as scientific director at Mila, the Quebec Artificial Intelligence Institute, to focus on the new organisation.

It aims to develop an AI system that will give truthful answers based on transparent reasoning instead of being trained to please a user, while also providing a robust assessment of whether an output is good or safe. Bengio hopes to create a model that can monitor and improve existing offerings from leading AI groups, preventing them from acting against human interests.

“The worst-case scenario is human extinction,” he said. “If we build AIs that are smarter than us and are not aligned with us and compete with us, then we’re basically cooked.”

Bengio’s move to establish LawZero comes as OpenAI aims to move further away from its charitable roots by converting into a for-profit company. That push has provoked concerns from AI experts and triggered a lawsuit from co-founder Elon Musk who is attempting to block the transaction.

Critics argue that OpenAI was founded to ensure AI was developed for humanity’s benefit, and the new structure eliminates legal recourse if the company prioritises profit over this goal. OpenAI argues it needs to raise capital under a more conventional structure to compete in the sector, while its broader mission remains central.

Bengio said he did not have confidence that OpenAI would adhere to its mission, stressing that non-profits do not have the “misaligned incentive that you do in the current way companies are structured”.

“To grow very fast, you need to convince people to invest a lot of money, and they want to see a return on their money. That’s how our market-based system works,” he added.