MIT scientists have released a powerful, open-source AI model, called Boltz-1, that could significantly accelerate biomedical research and drug development.
Developed by a team of researchers in the MIT Jameel Clinic for Machine Learning in Health, Boltz-1 is the first fully open-source model that achieves state-of-the-art performance at the level of AlphaFold3, the model from Google DeepMind that predicts the 3D structures of proteins and other biological molecules.
MIT graduate students Jeremy Wohlwend and Gabriele Corso were the lead developers of Boltz-1, along with MIT Jameel Clinic Research Affiliate Saro Passaro and MIT professors of electrical engineering and computer science Regina Barzilay and Tommi Jaakkola. Wohlwend and Corso presented the model at a Dec. 5 event at MIT’s Stata Center, where they said their ultimate goal is to foster global collaboration, accelerate discoveries, and provide a robust platform for advancing biomolecular modeling.
“We hope for this to be a starting point for the community,” Corso said. “There’s a reason we call it Boltz-1 and not Boltz. This isn’t the end of the line. We want as much contribution from the community as we can get.”
Proteins play an essential role in nearly all biological processes. A protein’s shape is closely connected with its function, so understanding a protein’s structure is critical for designing new drugs or engineering new proteins with specific functionalities. But because of the extremely complex process by which a protein’s long chain of amino acids folds into a 3D structure, accurately predicting that structure has been a major challenge for decades.
DeepMind’s AlphaFold2, which earned Demis Hassabis and John Jumper the 2024 Nobel Prize in Chemistry, uses machine learning to rapidly predict 3D protein structures that are so accurate they’re indistinguishable from those experimentally derived by scientists. This open-source model has been used by academic and commercial research teams around the world, spurring many advancements in drug development.
AlphaFold3 improves upon its predecessors by incorporating a generative AI model, known as a diffusion model, which can better handle the amount of uncertainty involved in predicting extremely complex protein structures. Unlike AlphaFold2, however, AlphaFold3 isn’t fully open source, nor is it available for commercial use, which prompted criticism from the scientific community and kicked off a global race to build a commercially available version of the model.
For their work on Boltz-1, the MIT researchers followed the same initial approach as AlphaFold3, but after studying the underlying diffusion model, they explored potential improvements. They incorporated those that boosted the model’s accuracy the most, such as new algorithms that improve prediction efficiency.
Along with the model itself, they open-sourced their entire pipeline for training and fine-tuning so other scientists can build upon Boltz-1.
“I’m immensely proud of Jeremy, Gabriele, Saro, and the rest of the Jameel Clinic team for making this release happen. This project took many days and nights of work, with unwavering determination to get to this point. There are many exciting ideas for further improvements and we look forward to sharing them in the coming months,” Barzilay says.
It took the MIT team four months of work, and many experiments, to develop Boltz-1. One of their biggest challenges was overcoming the ambiguity and heterogeneity contained in the Protein Data Bank, a collection of all biomolecular structures that thousands of biologists have solved in the past 70 years.
“I had a lot of long nights wrestling with these data. A lot of it is pure domain knowledge that one just has to acquire. There are no shortcuts,” Wohlwend says.
Ultimately, their experiments show that Boltz-1 attains the same level of accuracy as AlphaFold3 on a diverse set of complex biomolecular structure predictions.
“What Jeremy, Gabriele, and Saro have achieved is nothing short of remarkable. Their hard work and persistence on this project has made biomolecular structure prediction more accessible to the broader community and will revolutionize advancements in molecular sciences,” says Jaakkola.
The researchers plan to continue improving the performance of Boltz-1 and reduce the amount of time it takes to make predictions. They also invite researchers to try Boltz-1 on their GitHub repository and connect with fellow users of Boltz-1 on their Slack channel.
“We think there is still many, many years of work to improve these models. We’re very eager to collaborate with others and see what the community does with this tool,” Wohlwend adds.
Mathai Mammen, CEO and president of Parabilis Medicines, calls Boltz-1 a “breakthrough” model. “By open sourcing this advance, the MIT Jameel Clinic and collaborators are democratizing access to cutting-edge structural biology tools,” he says. “This landmark effort will accelerate the creation of life-changing medicines. Thank you to the Boltz-1 team for driving this profound breakthrough!”
“Boltz-1 will be enormously enabling, for my lab and the whole community,” adds Jonathan Weissman, an MIT professor of biology and member of the Whitehead Institute for Biomedical Research who was not involved in the study. “We’ll see a whole wave of discoveries made possible by democratizing this powerful tool.” Weissman adds that he anticipates that the open-source nature of Boltz-1 will lead to a vast array of creative new applications.
This work was also supported by a U.S. National Science Foundation Expeditions grant; the Jameel Clinic; the U.S. Defense Threat Reduction Agency Discovery of Medical Countermeasures Against New and Emerging (DOMANE) Threats program; and the MATCHMAKERS project supported by the Cancer Grand Challenges partnership financed by Cancer Research UK and the U.S. National Cancer Institute.