Ai2 is expanding its Asta toolkit with a new addition aimed at helping scientists navigate complex data. Asta DataVoyager, an open-source data analysis agent, is meant to make analyzing structured data of any size more accessible. Researchers upload a dataset and ask their questions in natural language. Asta DataVoyager then returns responses that Ai2 says are reliable, reproducible, scientifically sound, and easy to use.
“AI can only accelerate science if it is as rigorous and transparent as science itself,” Ali Farhadi, the chief executive of the Allen Institute for AI, says in a release. “With Asta DataVoyager, we are giving researchers a trusted partner that puts powerful analytical tools directly in their hands while preserving the standards of reproducibility and security that the scientific community depends on.”
This new agent promises to be intuitive, meaning anyone should be able to use it, no matter their data analysis skill level. It’s capable of accepting multiple data formats—CSV, Excel, JSON, HDF5, TSV, or Parquet—and will not only generate clear scientific answers, but also the code to reproduce the analysis, any visualizations to help interpret the findings, and a methods section containing any assumptions, reasoning, and statistical tests.
If needed, scientists can provide follow-up prompts or submit new data to refine results. Ai2 asserts that provenance will be maintained “much like a Python notebook,” meaning that every step of the analysis will be recorded. That record should reassure both the researchers’ collaborators and the publications that review their studies.
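To make the notebook comparison concrete, here is a minimal sketch of step-by-step provenance recording. It is not Ai2's implementation, which the release does not describe. The names `Step`, `ProvenanceLog`, and `record`, the field layout, and the toy dataset are all hypothetical and chosen only for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Step:
    """One recorded analysis step (hypothetical layout)."""
    index: int        # 1-based position in the analysis
    prompt: str       # the researcher's natural-language question
    code: str         # the code that was run to answer it
    result: str       # a short textual summary of the result
    data_sha256: str  # fingerprint of the dataset the step ran against


@dataclass
class ProvenanceLog:
    """Append-only record of every step, in order."""
    steps: list = field(default_factory=list)

    def record(self, prompt: str, code: str, result: str, data: bytes) -> Step:
        step = Step(
            index=len(self.steps) + 1,
            prompt=prompt,
            code=code,
            result=result,
            data_sha256=hashlib.sha256(data).hexdigest(),
        )
        self.steps.append(step)
        return step

    def to_json(self) -> str:
        # Serialize the full history so a collaborator or reviewer can replay it.
        return json.dumps([asdict(s) for s in self.steps], indent=2)


# Toy usage with made-up data: an initial question, then a follow-up prompt.
data = b"dose,response\n1,0.2\n2,0.5\n"
log = ProvenanceLog()
log.record("What is the mean response?", "df['response'].mean()", "0.35", data)
log.record("And the maximum?", "df['response'].max()", "0.5", data)
```

Storing a hash of the input next to each step means a reader can tell whether a follow-up ran against the same data or a newly submitted file.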
One of the first groups to trial Asta DataVoyager is the Cancer AI Alliance (CAIA). It’s an organization Ai2 is familiar with—the two, along with Google Cloud, are collaborating to develop AI to combat cancer. Founded in 2024, CAIA is a consortium of four renowned U.S. cancer centers—the Dana-Farber Cancer Institute, Fred Hutch Cancer Center, Memorial Sloan Kettering Cancer Center, and The Sidney Kimmel Comprehensive Cancer Center and Whiting School of Engineering at Johns Hopkins—and Amazon Web Services, Deloitte, Microsoft, and Nvidia.
CAIA is using the new AI agent to speed its research into multiple cancer types. Asta DataVoyager is analyzing de-identified patient records (all personally identifiable information has been stripped out) to generate cross-institutional insights. Because the records never leave their home institutions, patient privacy is preserved.
Clinicians at the Paul G. Allen Research Center at Washington’s Swedish Cancer Institute are also piloting Ai2’s Asta DataVoyager. In this case, they don’t have the data-science resources on staff to process the massive volume of structured health data. So, they turned to this new AI agent to make data analysis accessible to the institute’s physicians.
Ai2 says that Asta DataVoyager is made for “sensitive, high-stakes scientific environments.” The organization claims that teams will retain complete control over their data and that several deployment options are available, including hosted portals, secure on-premises setups, and private cloud infrastructure. Uploaded datasets can be deleted at any time.
“We wanted to build a system that meets scientists where they are,” Bodhisattwa Prasad Majumder, one of Ai2’s research scientists, explains. “Instead of asking researchers to become programmers, Asta DataVoyager lets them ask questions about their data in their own words and receive answers they can trust, complete with code, visuals, and documentation. Our goal is to shorten the distance between a researcher’s idea and a reproducible scientific result.”
Featured Image: Credit: Ai2