ZDNET’s key takeaways
- OpenAI says AI hallucination stems from flawed evaluation methods.
- Models are trained to guess rather than admit ignorance.
- The company suggests revising how models are trained.
Even the biggest and most advanced generative AI models occasionally hallucinate, or generate inaccurate information presented as fact. Now, OpenAI claims to understand why — while offering a possible solution.
In a research paper published last week, a team of researchers from the company argued that hallucination stems not from the quality of a model’s training data, but from flawed evaluation incentives: the scoring schemes widely used throughout the industry reward guessing over the admission of uncertainty.
Also: Your favorite AI chatbot is full of lies
“Language models are optimized to be good test-takers, and guessing when uncertain improves test performance,” the authors write in the paper.
Models are trained to identify subtle mathematical patterns in an enormous corpus of training data, which they then use as a framework for generating responses to user queries. The prevailing evaluation paradigm applies a simple, binary grading metric: full credit for accurate responses, none for inaccurate ones. Under this scheme, admitting ignorance scores the same zero as a wrong answer, which pushes models toward generating what OpenAI describes as “overconfident, plausible falsehoods”: hallucinations, in other words.
(Disclosure: Ziff Davis, ZDNET’s parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
If asked to guess a user’s birthday, for example, a model might take a wild stab rather than simply saying, “I don’t know.” The guess has a one-in-365 chance of being correct: long odds, but better than admitting ignorance, which under current evaluation metrics guarantees zero points. Because models are judged on their average performance across millions of outputs, that small edge adds up to a steady statistical pressure toward guesswork. Ask the model to guess enough birthdays and it will be right a tiny fraction of the time; better to roll the dice and collect those points than to admit ignorance and never score at all.
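As a rough illustration, not code from the paper, the arithmetic behind that incentive fits in a few lines of Python. The hypothetical `binary_grade` function below mirrors accuracy-only scoring, under which a wild guess has a higher expected score than an honest abstention.

```python
# A minimal sketch (not OpenAI's code) of why accuracy-only grading
# favors guessing over abstaining.

def binary_grade(is_correct: bool) -> float:
    """Accuracy-only scoring: full credit for a correct answer, zero for
    anything else, including an honest 'I don't know'."""
    return 1.0 if is_correct else 0.0

p_correct_guess = 1 / 365  # chance a random birthday guess happens to be right

# Expected score from guessing vs. abstaining, averaged over many questions.
expected_guess = (p_correct_guess * binary_grade(True)
                  + (1 - p_correct_guess) * binary_grade(False))
expected_abstain = binary_grade(False)  # "I don't know" always scores zero

print(f"guessing:   {expected_guess:.4f}")   # ~0.0027
print(f"abstaining: {expected_abstain:.4f}")  # 0.0000
```

However tiny the payoff from guessing, it is still greater than zero, so a model optimized against this metric learns never to abstain.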
Also: DeepSeek may be about to shake up the AI world again – what we know
“Strategically guessing when uncertain improves accuracy but increases errors and hallucinations,” OpenAI wrote in an accompanying blog post about its findings.
Because this "accuracy-only" approach currently pervades the industry and determines which models dominate leaderboards, developers are incentivized to keep building models that prioritize guessing over admitting uncertainty, which leads to more hallucinations.
How to fix hallucinations
The solution, according to OpenAI, is therefore not to feed models more accurate information, but to adjust how their performance is assessed.
Since a binary system of grading a model’s output as either right or wrong is supposedly fueling hallucination, the OpenAI researchers say that the AI industry must instead start rewarding models when they express uncertainty.
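One way to picture that change, as a hypothetical sketch rather than the paper’s actual scoring formula, is a grader that docks points for wrong answers while treating an explicit “I don’t know” as neutral. Rerunning the birthday arithmetic under that rule flips the incentive.

```python
# A hypothetical, uncertainty-aware grader: a sketch of the general idea the
# researchers describe, not the paper's exact scoring rule. Wrong answers
# now cost points, while an explicit abstention is scored as neutral.

def uncertainty_aware_grade(abstained: bool, is_correct: bool) -> float:
    if abstained:
        return 0.0          # admitting ignorance is no longer the worst outcome
    return 1.0 if is_correct else -1.0  # confident errors are penalized

p_correct_guess = 1 / 365  # same birthday example as above

expected_guess = (p_correct_guess * uncertainty_aware_grade(False, True)
                  + (1 - p_correct_guess) * uncertainty_aware_grade(False, False))
expected_abstain = uncertainty_aware_grade(True, False)

print(f"guessing:   {expected_guess:.4f}")   # ~ -0.9945
print(f"abstaining: {expected_abstain:.4f}")  # 0.0000
```

Under a rule like this, a model tuned to maximize its evaluation score does better by admitting uncertainty than by bluffing.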
After all, truth does not exist in black-and-white in the real world, so why should AI be trained as if it does? Running a model through millions of examples of the proper arrangement of subjects, verbs, and predicates will make it more fluent in its use of natural language, but as any living human being knows, reality is open to interpretation. To function in the world, we routinely have to say, “I don’t know.”
Also: Chatbots are distorting news – even for paid users
Similarly, the OpenAI researchers argue that models will continue to hallucinate so long as they’re rewarded for guessing when they should be admitting ignorance. “Simple modifications of mainstream evaluations can realign incentives, rewarding appropriate expressions of uncertainty rather than penalizing them,” they write in the new paper. “This can remove barriers to the suppression of hallucinations, and open the door to future work on nuanced language models with richer pragmatic competence.”