New computational tool narrows the search for viable metal–organic frameworks

January 12, 2026

A research team led by Adji Bousso Dieng, Assistant Professor of Computer Science and affiliated faculty member at the Princeton Materials Institute, has developed a new machine learning–based tool that helps identify which metal–organic frameworks are likely to be stable and synthesizable. The approach addresses a major bottleneck in materials discovery by allowing researchers to rapidly screen large numbers of candidate structures before committing to costly simulations or experiments.

Niyongabo Rubungo, A., Fajardo-Rojas, F., Gómez-Gualdrón, D. A., & Dieng, A. B. (2025). Highly Accurate and Fast Prediction of MOF Free Energy via Machine Learning. Journal of the American Chemical Society, 147(52), 48035–48045. https://doi.org/10.1021/jacs.5c13960

Metal–organic frameworks, or MOFs, are crystalline materials composed of metal ions connected by organic linkers. Their defining feature is an internal network of pores that creates a very large surface area relative to volume. This property has made MOFs a focus of research for tasks such as gas storage, molecular separation, catalysis, and electrochemical energy storage. However, the same modularity that makes MOFs attractive also creates a major bottleneck: the number of theoretically possible combinations of metals and linkers is extremely large, far exceeding what can be tested experimentally or even simulated using conventional computational methods.

Dieng stated:

“We are lifting the problem where now you can compute the sequence representation itself very quickly, very cheaply. This technology allows researchers to focus resources on promising candidates for practical applications in carbon capture, energy storage, catalysis and gas separation.”

Traditionally, researchers rely on molecular simulations to evaluate whether a hypothetical MOF structure is stable enough to exist under real conditions. These simulations, which often calculate thermodynamic quantities such as free energy, are computationally expensive. Evaluating a single structure can take hours or days, making it impractical to explore large chemical design spaces.

The new work addresses this challenge by reframing MOF screening as a prediction problem. Instead of running full molecular simulations for every candidate, the team trained a machine learning model to estimate the free energy of MOF structures directly.

A key step in the project was developing a way to represent MOFs in a form that a machine learning model could interpret. MOFs are complex, with repeating units and long-range order, which makes them difficult to encode using standard descriptors. The researchers converted the relevant physical and chemical features of each structure into a sequence-based representation, similar in spirit to how language models process text. This representation preserves information related to bonding, topology, and energetic contributions within the framework.

Using this method, the team generated representations for approximately one million hypothetical MOFs. They then trained a custom language model on a subset of structures with known free energy values. To make training feasible, the model was calibrated using a simpler property closely correlated with free energy, before being tested against a dataset of about 65,000 MOFs for which high-quality reference data were available. The resulting predictions matched the reference values with an accuracy of about 97 percent.

An important aspect of the work is its connection to experimental feasibility. Previous research by collaborators, including Diego Gómez-Gualdrón at the Colorado School of Mines, established a practical free energy threshold below which a MOF is considered stable enough to be synthesized in the laboratory. By predicting free energy values, the new tool can therefore provide a direct indication of whether a proposed structure is likely to be experimentally realizable.

From an engineering standpoint, this capability changes how MOF discovery workflows can be organized. Instead of using simulations as a primary filter, researchers can now apply the machine learning model as a fast screening step, narrowing millions of candidates down to a manageable set for detailed analysis and experimental validation. Predictions that once required hours of computation can now be produced in seconds.
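The screening step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's implementation: the function names, the candidate identifiers, the stand-in predictor, and the numeric threshold are all hypothetical, and the article does not give the actual threshold value or its units.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical stability threshold. The real value and its units come from
# the collaborators' earlier work and are not stated in the article.
STABILITY_THRESHOLD = 4.0


def screen_candidates(
    candidates: Iterable[str],
    predict_free_energy: Callable[[str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Keep candidates whose predicted free energy is below the threshold.

    Returns (candidate, predicted_free_energy) pairs sorted from lowest
    to highest predicted free energy.
    """
    scored = [(c, predict_free_energy(c)) for c in candidates]
    kept = [(c, e) for c, e in scored if e < threshold]
    return sorted(kept, key=lambda pair: pair[1])


def toy_predictor(mof_id: str) -> float:
    """Stand-in for a trained model; returns made-up values for illustration."""
    made_up_values = {"mof_a": 3.2, "mof_b": 5.1, "mof_c": 3.9}
    return made_up_values[mof_id]


# Only the shortlisted structures would go on to full simulation or synthesis.
shortlist = screen_candidates(
    ["mof_a", "mof_b", "mof_c"], toy_predictor, STABILITY_THRESHOLD
)
print(shortlist)
```

In a real workflow the stand-in predictor would be replaced by the trained model, and only the shortlist would be passed to expensive molecular simulations and experimental validation.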

The team is continuing to refine the approach by simplifying the sequence representations and reducing computational overhead for especially complex structures. They are also integrating search functionality into the system, allowing users to query the model for MOFs that meet specific stability criteria or application-driven constraints.

Beyond MOFs, the study illustrates a broader trend in materials engineering toward combining domain knowledge with data-driven models. By embedding physical meaning into machine-readable representations, machine learning tools can complement traditional simulations rather than replace them. In this case, the result is a practical way to navigate a design space that would otherwise be inaccessible, accelerating progress in areas such as carbon capture, gas separation, catalysis, and energy storage.

The work demonstrates that carefully designed machine learning models can play a central role in materials discovery, not by eliminating physics-based understanding, but by making it usable at scales relevant to modern engineering challenges.
