Fresh off unveiling MolmoAct and securing a U.S. government contract to build a new, fully open suite of advanced models, Ai2 is testing the boundaries of scientific discovery with the introduction of Asta, its new AI toolkit for scientific research. True to the Allen Institute for AI’s mission, this collection is built to give researchers transparency and verifiable evidence as they work to solve real-world problems.
“AI can be transformative for science, but only if it’s held to the same standards as science itself,” Ali Farhadi, Ai2’s chief executive, says in a statement. “With Asta, we’re not just building an assistant, but an ecosystem built on transparency, reproducibility, and scientific rigor.”
At launch, this new offering has three main functions:
- Find relevant research papers through its LLM-powered search experience (“Google Scholar on steroids”). It will reformulate queries, track down citations, and explain why a paper is relevant.
- Summarize literature by translating complex research questions into structured, comprehensive summaries, complete with clickable citations and an inline excerpt.
- Provide data analysis by transforming natural language questions into structured, reproducible analyses. It will parse through datasets, generate hypotheses, conduct statistical tests, and provide results. This feature is available in beta for select partners.
Ai2 hopes Asta will evolve to become a platform featuring advanced AI capabilities operating under a single interface, a one-stop intelligent shop for researchers to work faster and stay focused on the science. Experiment replication, hypothesis generation, and scientific programming are skills that Ai2 hints it could soon bring to Asta.
The Asta Framework
Asta isn’t a standalone tool. It’s made up of several components serving the entire scientific AI development lifecycle.

It starts with the open-source namesake AI agent that helps scientists navigate literature, synthesize findings, and analyze data. There are some similarities with ScholarQA, another research agent that launched in January. Ai2 claims Asta is fully transparent, provides needed citations, and is designed to integrate into the researcher’s real-world workflow.
Another component is AstaBench, a benchmark suite to help scientists gauge AI agent performance across complex, multi-step research tasks, from literature comprehension to code execution and end-to-end discovery. To start, it will evaluate agents on 2,400 problems across 11 benchmarks. It will also have 16 leaderboards spanning agent performance across all benchmarking categories, four subcategories, and an overall ranking that includes performance and cost efficiency.
Ai2 boasts that AstaBench helps scientists figure out which AI agent best supports their needs. Developers get a reproducible, evidence-based environment in which to test and compare agents.
In its first evaluation of 57 agents across 22 different architectures, AstaBench found that only 18 handled all the benchmarks, and even those earned just “modest scores.” Ai2’s domain-specific Asta v0, an “experimental agent” that is not part of Asta’s production version today, outperformed all other agents with a score of 53 percent. Although its score was 10 points higher than that of its closest competitor, React-gpt-5, it comes with a higher engineering and runtime cost. Even so, this initial testing reveals that many agents continue to struggle with complex tasks, underscoring the challenges developers face in creating purpose-built scientific agents.
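To make the trade-off between score and cost concrete, here is a minimal Python sketch of one common way to compare agents on both axes at once: keeping only the agents that no other agent beats on score and cost together (a Pareto frontier). This is an illustration only. Ai2 does not describe AstaBench's ranking method in this article, and the agent names, scores, and costs below are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentResult:
    name: str
    score: float  # benchmark score, higher is better
    cost: float   # cost per run, lower is better


def pareto_frontier(results):
    """Return the agents that are not dominated by any other agent.

    Agent b dominates agent a when b is at least as good on both axes
    and strictly better on at least one of them.
    """
    frontier = []
    for a in results:
        dominated = any(
            b.score >= a.score
            and b.cost <= a.cost
            and (b.score > a.score or b.cost < a.cost)
            for b in results
        )
        if not dominated:
            frontier.append(a)
    # Cheapest first, so the list reads as "pay more, score more".
    return sorted(frontier, key=lambda r: r.cost)


# Hypothetical results, not taken from any real leaderboard.
results = [
    AgentResult("agent-a", score=0.50, cost=3.0),
    AgentResult("agent-b", score=0.40, cost=1.0),
    AgentResult("agent-c", score=0.35, cost=2.0),
]

frontier_names = [r.name for r in pareto_frontier(results)]
print(frontier_names)
```

In this toy data, agent-c drops out because agent-b scores higher and costs less, while agent-a stays on the frontier despite its higher cost because nothing else matches its score.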
The final element is Asta Resources, a developer toolkit that includes open-source agents, APIs, post-trained science language models, and access to the Scientific Corpus Tool. This is an MCP extension of Ai2’s Semantic Scholar API that provides agents access to over 225 million science papers from more than 80 million authors. When connected, Asta agents can execute sparse and dense full-text semantic searches across its open-access papers. All in all, this appears to be everything science developers may need to build trustworthy scientific agents.
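For readers unfamiliar with the terms, "sparse" search matches on the words themselves, while "dense" search compares learned vector representations of meaning. The Python sketch below is a toy illustration of blending the two, not the Scientific Corpus Tool's actual implementation. The function names, the hand-written two-dimensional vectors, and the `alpha` weighting are all assumptions made for the example; a real system would use an inverted index and a trained embedding model.

```python
import math


def sparse_score(query, text):
    """Lexical match: fraction of query words that appear in the text."""
    query_terms = query.lower().split()
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return sum(1 for t in query_terms if t in text_terms) / len(query_terms)


def cosine(u, v):
    """Dense match: cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hybrid_rank(query, query_vec, papers, alpha=0.5):
    """Rank paper titles by a weighted blend of sparse and dense scores."""
    scored = [
        (
            alpha * sparse_score(query, p["title"])
            + (1 - alpha) * cosine(query_vec, p["vec"]),
            p["title"],
        )
        for p in papers
    ]
    return [title for _, title in sorted(scored, reverse=True)]


# Hypothetical papers with made-up 2-D "embeddings".
papers = [
    {"title": "protein folding with deep learning", "vec": [0.9, 0.1]},
    {"title": "galaxy formation surveys", "vec": [0.1, 0.9]},
    {"title": "predicting molecular structure", "vec": [0.8, 0.2]},
]

ranked = hybrid_rank("protein folding", [1.0, 0.0], papers)
print(ranked)
```

The third paper shares no words with the query, so sparse search alone would miss it. The dense score still places it above the unrelated astronomy paper, which is the usual argument for combining both signals.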
“When building Asta, we focused on problems that we faced as researchers,” Ai2’s Chief Scientist, Dan Weld, says. “We needed AI tools that could really save us time by executing complex multi-step plans, explaining their thinking, and staying grounded in evidence. That’s what Asta delivers. It’s not just another assistant, but a collaborator designed to think like a scientist.”
Making the Case For Asta
Developing science use cases for AI is one of Ai2’s core focuses under Farhadi. Asta is another effort by the non-profit lab to help researchers parse through a growing body of literature, experiments, and data to iterate quickly and hopefully accelerate discoveries. Asta is built to provide a trusted and open standard to help scientists identify which language models are best suited to their demanding scientific reasoning tasks.
“Our goal is to augment—never replace—scientists across diverse fields with AI collaborators they can trust,” Ai2 writes in a blog post. “Asta is also designed to advance the science of AI. While many AI agents perform well in constrained settings—where goals are predefined, options are limited, and outcomes are easy to verify, such as in customer support—scientific research demands a different kind of intelligence, one capable of building on prior knowledge while introducing ideas that are novel, verifiable, and impactful.”
Asta is already in use at 194 institutions worldwide, including the University of Chicago and Ai2’s hometown partner, the University of Washington. It’s reportedly already helping produce real-world discoveries, from identifying therapeutic targets to investigating new areas of inquiry.
Featured Image: Credit: Ai2