AI agents can help with literature reviews, handling data sets and writing code. Credit: sanjeri/Getty
Millions of people consult chatbots every day. But artificial intelligence (AI) advocates are betting that ‘AI agents’ will be the application of this technology with the biggest impact on society.
Agentic AI involves using a large language model (LLM) to carry out multi-step tasks by connecting it to external tools such as internet browsers or coding suites. The hope is to create AI assistants that simplify real-world tasks. In science, some think that AI agents — perhaps even several working together — will not just save time, but also eventually run their own experiments and generate knowledge.
But this dream is not yet a reality. Although access to AI agents is already being sold by technology firms, many such agents are either limited in scope or exist in beta versions that require significant human oversight. Because they are based on LLMs, which are, at heart, statistical prediction machines, they are prone to making mistakes known as hallucinations. In a trial earlier this year by Anthropic in San Francisco, California, to see whether its agent Claudius could run a vending-machine-based shop, the agent conjured up fake bank account details and sold some items at a loss.
Nature spoke to researchers developing, evaluating and using AI agents to find out how scientists can make use of the bots and mitigate the risks.
What is an AI agent?
Researchers already use automated tools, for example, citation managers that organize and format references, and workflow packages that process and analyse data. But AI agents are different. Rather than follow prescribed instructions for each task, agents use LLMs to make and refine plans on the fly for a variety of multi-step goals. Unlike lone LLMs, they also harness tools to take actions in the real world — for instance, to write and run code or navigate databases — with some interacting with each other and using working memory to remember user preferences and previous actions.
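In code terms, that plan–act loop can be pictured with the toy sketch below, in which a stubbed function stands in for the LLM and two made-up tools stand in for real ones such as a search engine or code runner. Every name here is illustrative, not any vendor's API: a real agent would call a hosted model at each step instead.

```python
# Toy illustration of an agent loop: a 'model' (stubbed here) repeatedly
# chooses a tool, the agent executes it, and the observation is stored
# in working memory that informs the next choice.

def stub_model(goal, memory):
    """Stand-in for an LLM: returns the next (tool, argument) step."""
    if not memory:                       # nothing done yet: plan a search
        return ("search", goal)
    if len(memory) == 1:                 # one observation: summarize it
        return ("summarize", memory[-1])
    return ("finish", memory[-1])        # enough context: stop

TOOLS = {
    "search": lambda query: f"3 papers found on '{query}'",
    "summarize": lambda text: f"Summary: {text}",
}

def run_agent(goal, max_steps=5):
    memory = []                          # working memory of observations
    for _ in range(max_steps):
        tool, arg = stub_model(goal, memory)
        if tool == "finish":
            return arg
        memory.append(TOOLS[tool](arg))  # act, then record the result
    return memory[-1]                    # step budget exhausted

print(run_agent("dapagliflozin and Alzheimer's"))
```

The key difference from a scripted workflow is that the model, not the programmer, decides which tool to call next at each step — which is also why agents inherit the LLM's tendency to hallucinate.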
What can they do?
Streamlining everyday research tasks is one goal. “In my group, every PhD student now has their own AI agent that effectively serves as a research assistant,” says Marinka Zitnik, a researcher in biomedical informatics at Harvard University in Boston, Massachusetts. These home-made agents help Zitnik’s team to perform low-stakes tasks, such as curating data sets, turning text into tables and writing certain pieces of code, she says.
One appealing application of agents lies in using them to emulate the collaboration of several researchers with different expertise. An example is the AI ‘tumour board’ being developed by Microsoft. In this case, agents, each with access to different data sets and training, interact to mimic the deliberations of the multidisciplinary team that determines an individual treatment plan for a person with cancer. Because tumour boards are usually formed only for patients with the most complicated cases, using health-care agents to assist clinicians could allow personalized care to be provided for more people, says Ece Kamar, who leads the AI Frontiers laboratory at Microsoft Research, based in Redmond, Washington. (In a statement in May, Microsoft said that its health-care AI models were intended for research use and were not to be deployed in clinical settings “as-is”.)
Can AI agents help to make discoveries?
The idea excites researchers, but the answer remains unclear. Google and many other firms and academic groups have developed ‘co-scientist’ agents, which generate hypotheses by looking for hidden insights in existing data. A co-scientist uses multiple agents to, for example, evolve and improve ideas, or test them against each other.
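The generate-and-test pattern behind such systems can be pictured with the toy sketch below. It is only an illustration of the shape of the loop: in a real co-scientist, the critic and the refinement step would both be LLM calls, whereas here they are trivial stand-ins (a word-count 'score' and a string extension), and all names are hypothetical.

```python
# Toy generate-critique-evolve loop in the spirit of 'co-scientist'
# systems: candidate hypotheses compete, and the winners are refined.

def critic(hypothesis):
    """Stub reviewer: scores a hypothesis by its word count."""
    return len(hypothesis.split())

def evolve(hypothesis):
    """Stub refinement: an LLM would rewrite the idea; we just extend it."""
    return hypothesis + " in a specific patient subgroup"

def tournament(candidates, rounds=2):
    for _ in range(rounds):
        candidates.sort(key=critic, reverse=True)        # rank by score
        survivors = candidates[: len(candidates) // 2 or 1]
        candidates = survivors + [evolve(h) for h in survivors]
    return max(candidates, key=critic)                   # best survivor

ideas = ["drug A protects neurons", "drug B lowers inflammation"]
print(tournament(ideas))
```

Pitting refined candidates against their parents in each round is what lets ideas improve iteratively rather than being generated once and accepted.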
Zitnik and her colleagues are also exploiting agents’ ability to access live data and ‘reason’ on the basis of those data, this time in drug discovery. In unpublished work, they used an AI agent to tap into and analyse clinical-trial data, adverse-effect reports and regulatory documents, looking for drugs that have protective effects against diseases they were not prescribed for. They found, for example, that people with diabetes who were given dapagliflozin had a lower incidence of Alzheimer’s disease later in life than did those who were not prescribed it. The team is also running in silico ‘clinical trials’ using electronic health records to test such hypotheses, she says.
What kind of expertise do you need?
For simple uses such as literature reviews, agentic AI already exists as packages that “anybody can use”, says Doug Downey, an AI researcher at the Allen Institute for Artificial Intelligence (Ai2) in Seattle, Washington. Although more advanced systems require machine-learning expertise, some researchers are trying to democratize access. Zitnik and her colleagues are developing ToolUniverse, an open online environment that allows researchers to connect LLMs to commonly used tools in different scientific domains, using only natural language commands. This should “make AI agents more broadly accessible to other fields and scientists who do not write code”, she says.
How well do agents work?
The ultimate agent — one that can get anything done autonomously in a reliable way — is “almost an artificial general intelligence problem”, says Kamar. “We are far from having those agents.” But researchers are attempting to benchmark how well agents perform now.