Drug discovery increasingly depends on the ability to move from an idea to a workable synthesis route with minimal delay. Yet for many chemists, identifying reliable experimental procedures remains a slow and fragmented process, scattered across decades of publications and supplementary data. A research team led by Victor Batista, the John Gamble Kirkwood Professor of Chemistry at Yale University, has introduced a new platform designed to address this bottleneck by organizing chemical synthesis knowledge into an AI-assisted system called MOSAIC.
Li, H., Sarkar, S., Lu, W., Loftus, P. O., Qiu, T., Shee, Y., Cuomo, A. E., Webster, J.-P., Kelly, H. R., Manee, V., Sreekumar, S., Buono, F. G., Crabtree, R. H., Newhouse, T. R., & Batista, V. S. (2026). Collective intelligence for AI-assisted chemical synthesis. Nature. https://doi.org/10.1038/s41586-026-10131-4
The work, developed at Yale in collaboration with researchers from Boehringer Ingelheim Pharmaceuticals in Connecticut, focuses on a familiar but often overlooked aspect of chemistry: protocols. While reaction mechanisms and predictive models receive much of the attention, real-world synthesis still depends on practical details such as temperatures, solvents, catalysts, and workup steps. These details are abundant in the literature, but they are difficult to access in a unified and usable form.
Victor Batista, Professor of Chemistry at Yale University, stated,
“Chemistry has evolved from books to databases, and now to AI-guided navigation. At a high level, MOSAIC functions like a smart cookbook for new recipes and Google Maps for navigating chemical synthesis. It helps chemists turn vast knowledge into detailed, reproducible procedures for synthesis with an indication of how likely they are to work.”
MOSAIC, a collective intelligence framework for chemical synthesis, is designed to compile this dispersed knowledge and translate it into step-by-step experimental procedures. Rather than relying on a single large language model, the platform draws on thousands of specialized AI “experts,” each trained on a distinct area of chemical practice. The system selects and combines relevant expertise depending on the synthesis problem at hand, producing protocols tailored to specific targets, including molecules that have not previously been reported.
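The idea of selecting relevant experts for a given problem can be illustrated with a minimal sketch. This is not MOSAIC's implementation: the expert names, the three-dimensional vectors, and the function `select_experts` are all hypothetical, and the sketch assumes each expert is summarized by a single vector that can be compared with a query by cosine similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def select_experts(query, experts, k=2):
    """Return the names of the k experts most similar to the query vector."""
    ranked = sorted(experts, key=lambda name: cosine(query, experts[name]), reverse=True)
    return ranked[:k]

# Hypothetical experts, each summarized by a toy 3-D vector (an assumption
# made for illustration; real systems would use learned embeddings).
EXPERTS = {
    "amide_coupling": [1.0, 0.0, 0.0],
    "cross_coupling": [0.0, 1.0, 0.0],
    "hydrogenation": [0.0, 0.0, 1.0],
}

# A toy query that mostly resembles amide coupling, with some cross-coupling character.
query = [0.9, 0.4, 0.1]
print(select_experts(query, EXPERTS))  # ['amide_coupling', 'cross_coupling']
```

In this toy version the outputs of the selected experts would then be combined into one protocol; how MOSAIC performs that combination is not described here.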
Batista and his colleagues describe the approach as a way to turn information overload into practical guidance. Chemistry, they note, has accumulated millions of documented reactions, yet much of this information remains underused because it is difficult to search, compare, and adapt. By structuring knowledge around protocol-level decisions, MOSAIC aims to bridge the gap between prediction and execution.
The project also reflects a broader shift in how AI is applied in laboratory science. Many existing tools focus on predicting reaction outcomes or molecular properties. MOSAIC instead emphasizes procedural reasoning, treating synthesis as a sequence of decisions informed by experience. Timothy Newhouse, a professor of chemistry at Yale and co-corresponding author on the study, has compared the process to following a recipe, where success depends as much on technique as on ingredients.
The platform was tested across a wide range of chemical spaces, including pharmaceutical compounds, catalysts, materials, agrochemicals, and cosmetic ingredients. In reported demonstrations, MOSAIC-generated protocols enabled the successful synthesis of more than 35 compounds that had not previously appeared in the literature. The system also provides uncertainty estimates that indicate how closely a proposed synthesis aligns with existing expertise, allowing chemists to judge which suggestions are most reliable.
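One simple way to picture an uncertainty estimate of this kind is a score that falls as a proposed synthesis moves away from known examples. The sketch below is an illustration under stated assumptions, not the method used in the study: the vectors, the `scale` parameter, and the function `confidence` are hypothetical, and Euclidean distance to the nearest known example stands in for whatever measure of alignment the platform actually uses.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def confidence(query, known, scale=1.0):
    """Map the distance to the nearest known example to a score in (0, 1].

    A score of 1.0 means the query coincides with a known example;
    the score decays exponentially as the query moves away.
    """
    nearest = min(distance(query, example) for example in known)
    return math.exp(-nearest / scale)

# Hypothetical vectors standing in for previously documented reactions.
KNOWN = [[0.0, 0.0], [1.0, 1.0]]

close_score = confidence([0.1, 0.0], KNOWN)  # near a known example
far_score = confidence([5.0, 5.0], KNOWN)    # far from every known example
print(close_score > far_score)  # True
```

A chemist could use such a score to rank suggestions, trying the proposals closest to established practice first.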
First authors Haote Li and Sumon Sarkar emphasize that this feature is particularly important for laboratory planning. Rather than presenting AI-generated procedures as definitive answers, MOSAIC frames them as informed recommendations, with transparent signals about confidence and risk. This design reflects a growing recognition that AI tools in chemistry must support, rather than replace, experimental judgment.
The decision to make MOSAIC fully open-source further signals the team’s intent for broad adoption. By remaining compatible with future models and datasets, the framework is positioned as an evolving resource rather than a fixed product. This openness also aligns with trends in academic and industrial chemistry toward shared infrastructure, where tools gain value through community use and feedback.
From an engineering perspective, the implications extend beyond faster molecule design. Standardized, AI-assisted protocols could improve reproducibility, reduce redundant experimentation, and lower barriers for smaller laboratories entering complex areas of synthesis. For industrial partners, the ability to rapidly assess feasible synthetic routes could shorten development cycles and reduce early-stage costs.
As chemical research continues to scale in both volume and complexity, systems like MOSAIC highlight a shift toward navigation rather than accumulation of knowledge. Instead of asking chemists to read more papers, the platform seeks to guide them through what is already known, translating collective experience into actionable steps. Whether this approach becomes a standard part of the laboratory workflow will depend on adoption and validation, but it reflects a practical response to one of modern chemistry’s most persistent challenges.

Adrian graduated with a master’s degree (1st Class Honours) in Chemical Engineering from Chester University along with Harris. His master’s research aimed to develop a standardised clean water oxygen transfer procedure for testing bubble diffusers currently used in the commercial wastewater industry. He has also undertaken placements in both the US and China, primarily within R&D departments, and is an associate member of the Institution of Chemical Engineers (IChemE).

