When the pioneering artificial intelligence (AI) firm Google DeepMind announced almost two years ago that it had used a deep-learning AI technique to discover 2.2 million new crystalline materials, it seemed to herald a thrilling new era of accelerated materials research1.
Composed of elements from across the periodic table, the materials in the trove included 52,000 simulations of layered compounds similar to the wonder material graphene, 528 potential lithium-ion conductors that might be used to improve rechargeable batteries, and much more.
But the effort — and similar ones that followed, involving the technology firms Microsoft and Meta — quickly came under fire from researchers who say that some of the compounds that the AI systems dreamed up were unoriginal, unfeasible or not fit for purpose.
“We found quite a lot of things that were ridiculous,” says materials scientist Anthony Cheetham at the University of California, Santa Barbara (UCSB), after looking through DeepMind’s list of hypothetical crystals. He and his UCSB colleague Ram Seshadri note that more than 18,000 of the compounds predicted by the project include extremely scarce radioactive elements such as promethium and protactinium, which they doubt could ever be useful materials2. “It’s one thing to discover a compound, and a totally different thing to discover a new functional material,” says Cheetham.
Work involving Meta suggested more than 100 materials that might capture carbon dioxide directly from the air and help to reduce global warming3. However, these suggestions provoked similar criticism. Computational chemist Berend Smit at the Swiss Federal Institute of Technology in Lausanne (EPFL) says that the candidates are not viable for that purpose. He suggests that the AI tool used in the work seemed so exciting that the authors were “a little bit blinded to the reality”.
So will AI really revolutionize materials discovery, or is it drowning in its own hype? Since the initial criticisms, materials scientists have examined the results from these firms in more detail to assess the true potential of AI. The teams behind the work have responded, in some cases toning down the initial claims or proposing workarounds. Many researchers conclude that AI holds great promise in materials science, but that more collaboration with experimental chemists — and some humility about the current limitations of these systems — will be crucial for realizing their full potential.
Crystal balls
From the mixture of copper and tin that sparked the Bronze Age to the invention of stainless steel, the discovery of materials has driven innovation throughout human history. In the past decade, the use of AI in materials science has taken off (see ‘AI’s material growth’). Many of the latest efforts, which use AI to speed up the discovery of materials, focus on crystalline inorganic solids, a subset of chemical compounds that are essential components of countless technologies, from semiconductors to lasers.
[Chart: ‘AI’s material growth’. Source: Scopus]
The properties of crystalline inorganic solids are determined not only by the atoms they contain, but also by how those atoms are arranged in repeating patterns. So when scientists plan to make new inorganic crystals, they don’t just come up with fresh combinations of atoms — they often try to predict what structure those atoms might adopt.
Before the advent of AI, researchers used more-conventional computational methods to do so. One of the most powerful methods is density functional theory (DFT), a way to approximate the complicated mathematics that describes how electrons behave in materials. For a hypothetical inorganic compound, this can reveal which structure is the most stable — and therefore most likely to exist — as well as the compound’s properties.
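In essence, the stability test that DFT enables is geometric: a hypothetical compound is predicted to exist only if no mixture of competing phases has a lower energy at the same composition, that is, if it sits on the lower ‘convex hull’ of energies. A toy sketch of that check for a two-element system, with illustrative numbers standing in for real DFT energies:

```python
# Toy convex-hull stability check for a binary A-B system.
# Each phase: (fraction of element B, formation energy per atom in eV).
# Illustrative numbers only -- real workflows use DFT-computed energies.

def energy_above_hull(phases, x, energy):
    """Distance (eV/atom) of a candidate above the lower convex hull
    of competing phases; a value <= 0 means the candidate is stable."""
    # Include the pure elements A (x=0) and B (x=1) at zero energy.
    pts = sorted(set(phases) | {(0.0, 0.0), (1.0, 0.0)})
    # Build the lower convex hull with a simple monotone-chain sweep.
    hull = [pts[0]]
    for p in pts[1:]:
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            # Pop hull points that lie on or above the chord to p.
            if (e2 - e1) * (p[0] - x1) >= (p[1] - e1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(p)
    # Interpolate the hull energy at the candidate's composition.
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            e_hull = e1 + (e2 - e1) * (x - x1) / (x2 - x1)
            return energy - e_hull
    raise ValueError("composition must lie in [0, 1]")

known = [(0.5, -0.30), (0.75, -0.20)]        # stable competing phases
print(energy_above_hull(known, 0.6, -0.25))  # candidate near the hull
```

Real tools do the same bookkeeping in many compositional dimensions at once, for every element in a candidate formula.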
Scientists have used DFT to predict new materials that have spectacular properties and that went on to be made in the lab — including super-strong magnets and ‘superconductors’ that transmit electricity without resistance but, unlike most superconducting materials, don’t require extremely cold temperatures4. The Materials Project, at Lawrence Berkeley National Laboratory (LBNL) in California, has recorded DFT-calculated structures for roughly 200,000 crystals in an open-access database5.
But DFT is computationally hungry. Most academic labs can access enough computing power to run DFT calculations on a handful of compounds, but surveying millions at a time would be unfeasibly expensive.
That’s where the high-profile AI efforts come in. In the case of DeepMind, rather than relying solely on intensive DFT calculations, the London firm fed a machine-learning algorithm the results of calculations that had already been recorded by, for example, the Materials Project. The algorithm, which the team called graph networks for materials exploration, or GNoME, learnt from these examples how to predict the stability of randomly generated crystal structures, and did so much faster than conventional DFT does. The system then checked the most promising of these predictions using DFT and poured the results back into GNoME to improve its performance. That ultimately enabled GNoME to dream up an enormous collection of compounds that it expects to be stable1.
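The feedback loop at the heart of such systems can be caricatured in a few lines of code. In this toy sketch, a candidate material is just a number, ‘DFT’ is a cheap stand-in function and the surrogate is a nearest-neighbour lookup; the names and numbers are illustrative and bear no relation to DeepMind’s actual implementation:

```python
import random

# Toy sketch of GNoME-style active learning: a cheap surrogate,
# trained on prior "DFT" results, screens random candidates; the
# most promising few are verified with the expensive calculation
# and the results are fed back to sharpen the surrogate.

def dft_energy(x):                  # expensive ground truth (toy)
    return (x - 0.3) ** 2 - 0.1

def surrogate(x, training):         # 1-nearest-neighbour predictor
    return min(training, key=lambda t: abs(t[0] - x))[1]

random.seed(0)
training = [(x, dft_energy(x)) for x in (0.0, 0.5, 1.0)]  # seed data

for generation in range(3):
    candidates = [random.random() for _ in range(200)]
    # Rank candidates by predicted stability (lower energy = better).
    candidates.sort(key=lambda x: surrogate(x, training))
    # Verify only the top few with the costly calculation, then
    # pour the results back into the training set.
    for x in candidates[:5]:
        training.append((x, dft_energy(x)))

best = min(training, key=lambda t: t[1])
print(f"best candidate {best[0]:.2f}, energy {best[1]:.3f}")
```

The point of the loop is economy: the expensive calculation runs only 15 times here, yet the search concentrates on the promising region the surrogate keeps rediscovering.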
“I’m completely convinced that if you’re not using these kinds of method within the next couple of years, you’ll be behind,” says materials scientist Kristin Persson at LBNL and the University of California, Berkeley, who is the director of the Materials Project.
In another effort involving DeepMind researchers, AI has also been used to help synthesize materials. Persson co-wrote a paper6, published alongside the GNoME results, which described the robotic ‘A-Lab’. The system was fed tens of thousands of published papers describing how to make various inorganic compounds. It learnt to devise recipes for synthesizing a list of target compounds that had not been made before, but whose structures had been predicted by DFT and logged by the Materials Project. A-Lab then deployed physical robots to make those compounds and analyse the products to check that they matched the targets, tweaking the recipes if necessary.
Shortly after the GNoME and A-Lab teams published their papers, Microsoft unveiled its own AI tool for materials discovery7. Like GNoME, MatterGen is a machine-learning model that has been trained to generate stable crystal structures. But MatterGen was designed to be more targeted than GNoME: it is able to suggest hypothetical materials that have specific properties. “You can directly generate the crystals that satisfy your design criteria,” says Tian Xie, a researcher at Microsoft Research AI for Science in Cambridge, UK, who led the effort. “This is much more efficient than using brute force to create millions of candidates.”
The project involving Meta is even more targeted. The firm’s Fundamental AI Research team worked with scientists at the Georgia Institute of Technology, Atlanta, to identify porous materials called metal–organic frameworks (MOFs) that might efficiently suck CO2 directly from the air.
The researchers used DFT to calculate the ability of more than 8,000 experimentally reported MOFs to bind to CO2. Then, they used those results to train an AI model to perform the same task, and showed that it offered similar accuracy and was much faster than DFT. In a May 2024 paper3, the researchers predicted that more than 100 of these MOFs contained regions that would strongly bind to CO2, offering proof of principle that AI tools could accelerate the development of MOFs for direct air capture.
Out of order
But all of these forays have engendered controversy. When solid-state chemist Robert Palgrave at University College London looked at the A-Lab results, he quickly concluded that the project had mischaracterized some of the 41 inorganic compounds that it claimed to have produced, and in some cases had synthesized materials that had already been made. Palgrave has since produced a more extensive critique of A-Lab’s work, in collaboration with Leslie Schoop at Princeton University in New Jersey and others, in which they detail shortfalls in the characterization of the products and conclude that no new materials were discovered in the A-Lab paper8.
They also identify a more fundamental problem, rooted in the limitations of the DFT technique that supplied A-Lab with its target structures. Palgrave notes that the DFT method usually predicts highly ordered crystal structures, which might be stable only if temperatures could plunge to the limit of absolute zero (–273 °C). But in reality, the arrangements of atoms in crystalline materials are often much messier. Although many of the ordered DFT structures that A-Lab was told to make seemed new, they had, in fact, been made before as disordered structures — and it was those known, disordered forms that A-Lab eventually made, says Palgrave.
Gerbrand Ceder, who is at LBNL and the University of California, Berkeley, and co-led the A-Lab work, disagrees. He says that a detailed reanalysis by researchers showed that A-Lab’s characterizations were reliable. “A-Lab made the compounds that it claimed it made, and for which it had no synthesis information,” he says. “Making disordered versions of predicted ordered compounds is typically characterized as a success, and the standard in comparing theory predictions and experiments,” he adds.
The A-Lab project deployed robots to make new compounds using recipes devised by AI. Credit: Marilyn Sargent/Berkeley Lab
The disorder issue also affects AI-based DFT surrogates such as GNoME, says Johannes Margraf, a computational chemist at the University of Bayreuth in Germany. Together with colleagues, he trained a machine-learning system on crystal structures that have been determined by experimental measurement, rather than DFT. The model learnt to forecast whether a compound is likely to be disordered owing to similar elements swapping places in a crystal9. It suggested that of about 380,000 stable compounds that the DeepMind team highlighted as promising targets for synthesis — all with apparently ordered crystal structures — 80–84% would be disordered in real life.
This finding implies that many of GNoME’s suggestions are unlikely to be realized in the lab, at least in their ordered forms, and might have different properties from those predicted. AI models trained on DFT data can also miss potentially useful properties that arise from a structure’s disorder, which the models don’t account for, says Margraf. “If you ignore the presence of disorder, you can have both false negatives and false positives,” he says. “It’s not a small detail.”
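The intuition behind such disorder screens can be illustrated with a toy heuristic: if two cations in a predicted structure carry the same charge and have similar ionic radii, they are good candidates to swap sites in the real crystal. This sketch is not Margraf's model, which is a machine-learning system trained on experimental structures; the radii below are approximate Shannon values, listed here only for illustration:

```python
# Toy heuristic in the spirit of disorder screening: flag an
# ordered structure as disorder-prone when two of its cations are
# so alike (same charge, similar size) that they are likely to
# swap places in a real crystal. Radii are approximate Shannon
# ionic radii in angstroms for six-fold coordination.

RADII = {  # (element, charge): ionic radius
    ("Mg", 2): 0.72, ("Ni", 2): 0.69, ("Zn", 2): 0.74,
    ("Ca", 2): 1.00, ("Ba", 2): 1.35,
    ("Li", 1): 0.76, ("Na", 1): 1.02, ("K", 1): 1.38,
}

def disorder_prone(cations, tol=0.1):
    """True if any two distinct cations share a charge and differ
    in ionic radius by less than `tol` angstroms."""
    for i, a in enumerate(cations):
        for b in cations[i + 1:]:
            if a != b and a[1] == b[1] and abs(RADII[a] - RADII[b]) < tol:
                return True
    return False

print(disorder_prone([("Mg", 2), ("Ni", 2)]))  # similar: likely to mix
print(disorder_prone([("Ca", 2), ("Ba", 2)]))  # dissimilar: likely ordered
```

A real model weighs many more signals than radius and charge, but the underlying question is the same one this function asks.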
Materials scientist Ekin Dogus Cubuk, one of the lead authors of the GNoME paper1 who has now left DeepMind to found the start-up company Periodic Labs in California, accepts that many of the ordered structures predicted by GNoME will probably turn out to be disordered. He says that the tool’s main purpose is to provide a signpost towards promising compounds that require further investigation. “It’s not like somebody can just simulate a material and it just becomes an incredible product.”
Some, however, were riled by DeepMind’s suggestion in its paper1 that the team had achieved “an order-of-magnitude expansion in stable materials known to humanity”, which sounded too good to be true. “It was kind of a red rag to a bull,” says Cheetham. “Our hackles were raised.”
Machine-learning engineer Jonathan Godwin, who worked for DeepMind before leaving in 2022 to found his own AI-materials firm, Orbital Materials in London, agrees: “It’s pretty implausible to say that 2.2 million things you haven’t synthesized are new materials.”
A DeepMind spokesperson points out that more than 700 of the compounds GNoME predicted were independently made by other researchers, and that GNoME structures helped to guide the synthesis of several previously unknown caesium-based compounds that might be of interest for applications such as optoelectronics and energy storage10.