Machine-learning models can speed up the discovery of new materials by making predictions and suggesting experiments. But most models today consider only a few specific kinds of data or variables. Compare that with human scientists, who work in a collaborative environment and consider experimental results, the broader scientific literature, imaging and structural analysis, personal experience or intuition, and input from colleagues and peer reviewers.
Now, MIT researchers have developed an approach for optimizing materials recipes and planning experiments that incorporates information from diverse sources, like insights from the literature, chemical compositions, microstructural images, and more. The approach is part of a new platform, named Copilot for Real-world Experimental Scientists (CRESt), that also uses robotic equipment for high-throughput materials testing, the results of which are fed back into large multimodal models to further optimize materials recipes.
Human researchers can converse with the system in natural language, with no coding required, and the system makes its own observations and hypotheses along the way. Cameras and vision language models also allow the system to monitor experiments, detect issues, and suggest corrections.
“In the field of AI for science, the key is designing new experiments,” says Ju Li, the School of Engineering Carl Richard Soderberg Professor of Power Engineering. “We use multimodal feedback — for example, information from previous literature on how palladium behaved in fuel cells at this temperature, and human feedback — to complement experimental data and design new experiments. We also use robots to synthesize and characterize the material’s structure and to test performance.”
The system is described in a newly published paper. The researchers used CRESt to explore more than 900 chemistries and conduct 3,500 electrochemical tests, leading to the discovery of a catalyst material that delivered record power density in a fuel cell that runs on formate salt to produce electricity.
Joining Li on the paper as first authors are PhD student Zhen Zhang, Zhichu Ren PhD ’24, PhD student Chia-Wei Hsu, and postdoc Weibin Chen. Their coauthors are MIT Assistant Professor Iwnetim Abate; Associate Professor Pulkit Agrawal; JR East Professor of Engineering Yang Shao-Horn; MIT.nano researcher Aubrey Penn; Zhang-Wei Hong PhD ’25; Hongbin Xu PhD ’25; Daniel Zheng PhD ’25; MIT graduate students Shuhan Miao and Hugh Smith; MIT postdocs Yimeng Huang, Weiyin Chen, Yungsheng Tian, Yifan Gao, and Yaoshen Niu; former MIT postdoc Sipei Li; and collaborators including Chi-Feng Lee, Yu-Cheng Shao, Hsiao-Tsu Wang, and Ying-Rui Lu.
A better system
Materials science experiments can be time-consuming and expensive. They require researchers to carefully design workflows, make new material, and run a series of tests and analyses to understand what happened. Those results are then used to decide how to improve the material.
To improve the process, some researchers have turned to a machine-learning strategy known as active learning to make efficient use of previous experimental data points and to explore or exploit those data. When paired with a statistical technique known as Bayesian optimization (BO), active learning has helped researchers identify new materials for things like batteries and advanced semiconductors.
“Bayesian optimization is like Netflix recommending the next movie to watch based on your viewing history, except instead it recommends the next experiment to do,” Li explains. “But basic Bayesian optimization is too simplistic. It uses a boxed-in design space, so if I say I’m going to use platinum, palladium, and iron, it only changes the ratio of those elements in this small space. But real materials have a lot more dependencies, and BO often gets lost.”
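To make the idea concrete, here is a minimal sketch of the basic Bayesian-optimization loop Li describes, written in Python with scikit-learn. A Gaussian-process surrogate scores candidate composition ratios inside a fixed ("boxed-in") three-element design space and proposes the one with the highest expected improvement as the next experiment. The objective function and design space are hypothetical stand-ins for a real wet-lab test, not CRESt's actual code.

```python
# A toy Bayesian-optimization loop over composition ratios (illustrative only).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def run_experiment(ratios):
    """Hypothetical stand-in for a lab test: returns a noisy performance score."""
    return -np.sum((ratios - np.array([0.5, 0.3, 0.2])) ** 2) + 0.01 * rng.normal()

def sample_simplex(n, d=3):
    """Random candidate ratios of d elements that sum to 1 (the fixed design space)."""
    return rng.dirichlet(np.ones(d), size=n)

# Seed data: a few random recipes and their measured performance.
X = sample_simplex(5)
y = np.array([run_experiment(x) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for step in range(10):
    gp.fit(X, y)
    candidates = sample_simplex(500)
    mu, sigma = gp.predict(candidates, return_std=True)
    # Expected improvement: balance exploiting good predictions vs. exploring uncertainty.
    best = y.max()
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)]   # the recommended "next experiment"
    y_next = run_experiment(x_next)
    X = np.vstack([X, x_next])
    y = np.append(y, y_next)

print("Best recipe found:", X[np.argmax(y)], "score:", y.max())
```

Note how the loop only ever adjusts the ratio of the three chosen elements; that is the limitation Li points to, since real recipes involve many more dependencies than a small fixed simplex can express.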
Most active learning approaches also rely on single data streams that don’t capture everything that goes on in an experiment. To equip computational systems with more human-like knowledge, while still taking advantage of the speed and control of automated systems, Li and his collaborators built CRESt.
CRESt’s robotic equipment includes a liquid-handling robot, a carbothermal shock system to rapidly synthesize materials, an automated electrochemical workstation for testing, characterization equipment including automated electron microscopy and optical microscopy, and auxiliary devices such as pumps and gas valves, all of which can be remotely controlled. Many processing parameters can also be tuned.
Through the user interface, researchers can chat with CRESt and tell it to use active learning to find promising materials recipes for various projects. CRESt can incorporate up to 20 precursor molecules and substrates into its recipes. To guide material designs, CRESt’s models search through scientific papers for descriptions of elements or precursor molecules that might be useful. When human researchers tell CRESt to pursue new recipes, it kicks off a robotic symphony of sample preparation, characterization, and testing. The researchers can also ask CRESt to perform image analysis on scanning electron microscopy images, X-ray diffraction data, and other sources.
Information from those processes is used to train the active learning models, which use both literature knowledge and current experimental results to suggest further experiments and speed up materials discovery.
“For every recipe, we use previous literature text or databases, and it creates these huge representations of each recipe based on the prior knowledge base before even doing the experiment,” says Li. “We perform principal component analysis in this data embedding space to get a reduced search space that captures most of the performance variability. Then we use Bayesian optimization in this reduced space to design the new experiment. After the new experiment, we feed the newly acquired multimodal experimental data and human feedback into a large language model to improve the knowledge base and redefine the reduced search space, which gives us a big boost in active learning efficiency.”
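The workflow Li outlines, literature-derived embeddings compressed by principal component analysis into a reduced search space where Bayesian optimization then operates, can be sketched roughly as follows. This is an illustrative Python outline under stated assumptions (random placeholder embeddings and placeholder experimental results), not the paper's implementation.

```python
# Sketch: PCA-reduced embedding space + Bayesian optimization (illustrative only).
import numpy as np
from scipy.stats import norm
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)

n_recipes, embed_dim = 200, 512
# Placeholder for literature-derived knowledge-base vectors, one per candidate recipe.
embeddings = rng.normal(size=(n_recipes, embed_dim))

# Compress the embedding space to the few directions that capture most of the variance.
pca = PCA(n_components=8)
Z = pca.fit_transform(embeddings)

# Suppose 15 recipes have been tested so far, with measured performance y (placeholders).
tested = rng.choice(n_recipes, size=15, replace=False)
y = rng.normal(size=15)

# Fit the surrogate in the reduced space, then score every candidate recipe.
gp = GaussianProcessRegressor(normalize_y=True).fit(Z[tested], y)
mu, sigma = gp.predict(Z, return_std=True)

# Expected improvement over untested recipes; the argmax is the next experiment.
best = y.max()
z = (mu - best) / np.maximum(sigma, 1e-9)
ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
ei[tested] = -np.inf
print("Next recipe to try:", int(np.argmax(ei)))
```

In CRESt, per Li's description, the embeddings and the reduced space are not static: after each experiment, new multimodal data and human feedback update the knowledge base, and the reduced search space is redefined before the next optimization round.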
Materials science experiments can also face reproducibility challenges. To address the problem, CRESt monitors its experiments with cameras, looking for potential problems and suggesting solutions via text and voice to human researchers.
The researchers used CRESt to develop an electrode material for an advanced type of high-density fuel cell known as a direct formate fuel cell. After exploring more than 900 chemistries over three months, CRESt discovered a catalyst material made from eight elements that achieved a 9.3-fold improvement in power density per dollar over pure palladium, an expensive precious metal. In further tests, CRESt’s material delivered a record power density in a working direct formate fuel cell, even though the cell contained just one-fourth of the precious metals of previous devices.
The results show the potential for CRESt to find solutions to real-world energy problems that have plagued the materials science and engineering community for decades.
“A big challenge for fuel-cell catalysts is the use of precious metals,” says Zhang. “For fuel cells, researchers have used various precious metals like palladium and platinum. We used a multielement catalyst that also incorporates many other low-cost elements to create the optimal coordination environment for catalytic activity and resistance to poisoning species such as carbon monoxide and adsorbed hydrogen atoms. People have been searching for low-cost options for many years. This approach greatly accelerated our search for these catalysts.”
A helpful assistant
Early on, poor reproducibility emerged as a significant problem that limited the researchers’ ability to apply their new active learning technique to experimental datasets. Material properties can be influenced by the way the precursors are mixed and processed, and any number of problems can subtly alter experimental conditions, requiring careful inspection to correct.
To partially automate the process, the researchers coupled computer vision and vision language models with domain knowledge from the scientific literature, which allowed the system to hypothesize sources of irreproducibility and propose solutions. For instance, the models can notice when there’s a millimeter-sized deviation in a sample’s shape or when a pipette moves something out of place. The researchers incorporated some of the models’ suggestions, leading to improved consistency, a sign that the models already make good experimental assistants.
The researchers noted that humans still performed most of the debugging in their experiments.
“CRESt is an assistant, not a replacement, for human researchers,” Li says. “Human researchers are still indispensable. In fact, we use natural language so the system can explain what it’s doing and present observations and hypotheses. But this is a step toward more flexible, self-driving labs.”