Artificial intelligence models are starting to succeed in science. In the past two years, they have demonstrated that they can analyse data, design experiments and even come up with new hypotheses. The pace of progress has some researchers convinced that artificial intelligence (AI) could compete with science’s greatest minds in the next few decades.
In 2016, Hiroaki Kitano, a biologist and chief executive at Sony AI, challenged researchers to accomplish just that: to develop an AI system so advanced that it could make a discovery worthy of a Nobel prize. Calling it the Nobel Turing Challenge, Kitano presented the endeavour as the grand challenge for AI in science. A machine wins if it can achieve a discovery on a par with top-level human research.
That’s not something current models can do. But by 2050, the Nobel Turing Challenge envisions an AI system that, without human intervention, combines the skills of hypothesis generation, experimental planning and data analysis to make a breakthrough worthy of a Nobel prize.
It might not even take until 2050. Ross King, a chemical-engineering researcher at the University of Cambridge, UK, and an organizer of the challenge, thinks such an ‘AI scientist’ might rise to laureate status even sooner. “I think it’s almost certain that AI systems will get good enough to win Nobel prizes,” he says. “The question is if it will take 50 years or 10.”
But many researchers don’t see how current AI systems, which are trained to generate strings of words and ideas on the basis of humankind’s existing pool of knowledge, could contribute fresh insights. Accomplishing such a feat might demand drastic changes in how researchers develop AI and what AI funding goes towards. “If tomorrow, you saw a government programme invest a billion dollars in fundamental research, I think it would advance much faster,” says Yolanda Gil, an AI researcher at the University of Southern California in Los Angeles.
Others warn of looming risks in introducing AI into the research pipeline.
Prize-worthy discoveries
The Nobel prizes were created to honour those who “have conferred the greatest benefit” to humankind, as their namesake, Alfred Nobel, wrote in his will. For the science prizes, Bengt Nordén, a chemist and former chair of the Nobel Committee for Chemistry, considers three criteria: a Nobel discovery must be useful, be rich with impact and open a door to further scientific understanding, he says.
Although only living people, organizations and institutions are currently eligible for the prizes, AI has had previous encounters with the Nobel committee. In 2024, the Nobel Prize in Physics went to machine-learning pioneers who laid the groundwork for artificial neural networks. That same year, half of the chemistry prize recognized the researchers behind AlphaFold, an AI system from Google DeepMind in London that predicts the 3D structures of proteins from their amino-acid sequence. But these awards were for the scientific strides behind AI systems — not for discoveries made by AI.
[Image: Demis Hassabis (left) and John Jumper (middle) won a Nobel prize for the AI model AlphaFold. Jonathan Nackstrand/AFP via Getty]
For an AI scientist to claim its own discovery, the research would need to be performed “fully or highly autonomously”, according to the Nobel Turing Challenge. The AI scientist would need to oversee the scientific process from beginning to end, deciding on questions to answer, experiments to run and data to analyse, according to Gil.
Gil says that she has already seen AI tools assisting scientists in almost every step of the discovery process, which “makes the field very exciting”. Researchers have demonstrated that AI can help to decode the speech of animals, hypothesize about the origins of life in the Universe and predict when spiralling stars might collide. It can forecast lethal dust storms and help to optimize the assembly of future quantum computers.
AI is also beginning to perform experiments by itself. Gabe Gomes, a chemist at Carnegie Mellon University in Pittsburgh, Pennsylvania, and his colleagues designed a system called Coscientist that relies on large language models (LLMs), the kind behind ChatGPT and similar systems, to plan and execute complex chemical reactions using robotic laboratory equipment. And an unreleased version of Coscientist can do computational chemistry with remarkable speed, says Gomes.
One of Gomes’s students once complained that the software took half an hour to work out a transition state for a reaction. “The problem took me over a year as a graduate student,” he says.
The Tokyo-based company Sakana AI is using LLMs in an attempt to automate machine-learning research. At the same time, researchers at Google and elsewhere are exploring how chatbots might work in teams to generate scientific ideas.
Most scientists who are using AI turn to it as an assistant or collaborator of sorts, often assigned to specific tasks. This is the first of three waves of AI in science, says Sam Rodriques, chief executive of FutureHouse — a research lab in San Francisco, California, that debuted an LLM designed to do chemistry tasks earlier this year. That model, like other ‘reasoning models’, learns to mimic step-wise logical thought through a trial-and-error process that involves training on correct examples.
The existing models are helpful collaborators that can make predictions on the basis of data and accelerate otherwise painstaking computations. But they tend to need a human in the loop during at least one stage.
Next, says Rodriques, AI will get better at developing and evaluating its own hypotheses by searching through literature and analysing data. James Zou, a biomedical data scientist at Stanford University in California, has begun moving into this realm. He and his colleagues recently showed that a system built on LLMs can scour biological data to find insights that researchers miss. For instance, when given a published paper and a data set of RNA sequences associated with it, the system found that certain immune cells in individuals with COVID-19 are more likely to swell up as they die, an idea that hadn’t been explored by the paper’s authors. It’s showing “that the AI agent is beginning to autonomously find new things,” Zou says.
He’s also helping to organize a virtual gathering called Agents4Science later this month, which he describes as the first AI-only scientific conference. All papers will be written and reviewed by AI agents working alongside human collaborators. And the one-day meeting will include invited talks and panel discussions (from humans) on the future of AI-generated research. Zou says he hopes that the meeting will help researchers to assess how capable AI is at doing and reviewing innovative research.
There are known challenges to such efforts, including the hallucinations that often plague LLMs, Zou says. But he says these issues could be mostly remedied with human feedback.
Rodriques says that the final stage of AI in science, and what FutureHouse is aiming for, is models that can ask their own questions and design and perform their own experiments — no human necessary. He sees this as inevitable, and says that AI could make a discovery worthy of a Nobel “by 2030 at the latest”.
The most promising areas for a breakthrough — by an AI scientist or otherwise — are in materials science or in treating diseases such as Parkinson’s or Alzheimer’s, he says, because these are areas with big open challenges and an unmet need.
Thinking about thinking
Many researchers are wary of such claims, seeing much larger hurdles. Doug Downey, a researcher at the Allen Institute for AI in Seattle, Washington, says he and his colleagues have found that their LLM agents fall flat when attempting to complete a research project from beginning to end. In one study of 57 AI agents, the team found that although the agents can fully complete specific science-related tasks about 70% of the time, that figure drops to just 1% when they attempt to generate an idea, plan and execute an experiment, and analyse data for a full report (see go.nature.com/4ntxs6q). “End-to-end automated scientific discovery remains a formidable challenge,” Downey and the other authors write.
Although AI seems to have a lot of potential to advance science, it isn’t without limitations, says Downey. “I think it’s not clear how long it will take to overcome that.”
Even when today’s AI systems make sound predictions in a certain subfield, they don’t necessarily learn the larger underlying principles. One recent study, for instance, found that although an AI model could predict how a planet orbits a star, it couldn’t replicate the fundamental laws of physics that govern these bodies. It wasn’t learning a scientific principle so much as mimicking the results of that principle. In another study, an AI tool couldn’t conjure an accurate map of New York City’s streets, despite learning how to navigate through the city.
Subbarao Kambhampati, a computer scientist at Arizona State University in Tempe, says such pitfalls demonstrate how the lived experience of a human researcher is important for working out basic scientific principles. By contrast, AI systems experience the world only vicariously through the data sets that they are fed. Some researchers are exploring a melding of AI and robots that would give these systems more experience navigating the world.
A lack of real-world experience will make it difficult for AI models to pose fresh, creative questions and offer new insights into the human world, says Kambhampati. “I’m very supportive of claims that AI can accelerate science,” he says. But “to say that you don’t need human scientists and that this machine will just make some Nobel-worthy discovery” sounds like nothing more than hype.
For Gil, developing an AI scientist capable of a Nobel-worthy discovery would require investing more effort in AI tools with a wider range of capabilities, including meta-reasoning. Researchers will need to find ways to imbue AI with the ability to evaluate and adjust its own reasoning processes — to think about its own thinking. That shift could enable models to weigh up which types of experiment will produce the best results and to revise their scientific theories on the basis of new findings.
Gil has long worked on fundamental research that could grant AI such abilities, but she says that LLMs have taken over the spotlight. If that continues, she expects Nobel-worthy discoveries to be a distant prospect. “There are so many exciting results that you can get with generative AI techniques,” says Gil. “But there’s a lot of other areas to pay attention to.”
King agrees that there are obstacles ahead. LLMs don’t understand the human world well, or what they’re contributing to it, he says: “It doesn’t even know what it’s doing is science.”
Many discussions at meetings held by the Nobel Turing Challenge focus on what advances AI has yet to make and how it can get there. Does an AI scientist need to achieve artificial general intelligence, for instance, being as knowledgeable and adaptable as a human? Will an AI scientist behave like a human scientist, or will the path to discovery differ? What are the legal and ethical implications of AI-automated discovery? And how might a prize for AI scientists be funded?
Knowing what AI can achieve might come only with time. “The only way to get these answers is to test them — like we do with any hypothesis,” says Gil.
Other researchers wonder whether the scientific community should be pushing for such a discovery at all. In a 2024 article, Lisa Messeri, an anthropologist at Yale University in New Haven, Connecticut, and Molly Crockett, a psychologist at Princeton University in New Jersey, argue that over-reliance on AI has already begun to introduce errors into scientific research. They also note that AI could crowd out alternative approaches and reduce innovation, with scientists beginning to “produce more but understand less”.
It’s possible that automated discovery could come with serious downsides for science — and scientists. AI is performing tasks that decrease opportunities for junior scientists, who might never gain the necessary skills to earn their own Nobel prizes down the line, Messeri says. “While this isn’t a zero-sum game, given the current shrinking of research and university budgets, we are at a concerning moment for evaluating the pros and cons of this future,” she says.
This article is reproduced with permission and was first published on October 6, 2025.