Helping scientists run complex data analyses without writing code

As costs for diagnostic and sequencing technologies have plummeted in recent years, researchers have collected an unprecedented amount of data about disease and biology. Unfortunately, scientists hoping to go from data to new cures often require help from someone with experience in software engineering.

Now, Watershed Bio helps scientists and bioinformaticians run experiments and get insights with a platform that lets users analyze complex datasets regardless of their computational skills. The cloud-based platform provides workflow templates and a customizable interface to help users explore and share all types of data, including whole-genome sequencing, transcriptomics, proteomics, metabolomics, high-content imaging, protein folding, and more.

“Scientists need to learn about the software and data science parts of the field, but they don’t need to become software engineers writing code just to understand their data,” co-founder and CEO Jonathan Wang ’13, SM ’15 says. “With Watershed, they don’t need to.”

Watershed is being used by large and small research teams across industry and academia to drive discovery and decision-making. When new advanced analytic techniques are described in scientific journals, they can be added to Watershed’s platform immediately as templates, making cutting-edge tools more accessible and collaborative for researchers of all backgrounds.

“The data in biology is growing exponentially, and the sequencing technologies generating this data are only getting better and cheaper,” Wang says. “Coming from MIT, this issue was right in my wheelhouse: It’s a difficult technical problem. It’s also a meaningful problem because these people are working to treat diseases. They know all this data has value, but they struggle to make use of it. We want to help them unlock more insights faster.”

No code discovery

Wang expected to major in biology at MIT, but he quickly got excited by the possibilities of building solutions with computer science that scaled to hundreds of thousands of people. He ended up earning both his bachelor’s and master’s degrees from the Department of Electrical Engineering and Computer Science (EECS). Wang also interned at a biology lab at MIT, where he was surprised by how slow and labor-intensive experiments were.

“I saw the difference between biology and computer science, where you had these dynamic environments [in computer science] that let you get feedback immediately,” Wang says. “Even as a single person writing code, you have so much at your fingertips to play with.”

While working on machine learning and high-performance computing at MIT, Wang also co-founded a high-frequency trading firm with some classmates. His team hired researchers with PhD backgrounds in areas like math and physics to develop new trading strategies, but they quickly saw a bottleneck in their process.

“Things were moving slowly because the researchers were used to building prototypes,” Wang says. “These were small approximations of models they could run locally on their machines. To put those approaches into production, they needed engineers to make them work in a high-throughput way on a computing cluster. But the engineers didn’t understand the nature of the research, so there was a lot of back and forth. It meant ideas you thought could have been implemented in a day took weeks.”

To solve the problem, Wang’s team developed a software layer that made building production-ready models as easy as building prototypes on a laptop. Then, a few years after graduating from MIT, Wang noticed technologies like DNA sequencing had become cheap and ubiquitous.

“The bottleneck wasn’t sequencing anymore, so people said, ‘Let’s sequence everything,’” Wang recalls. “The limiting factor became computation. People didn’t know what to do with all the data being generated. Biologists were waiting for data scientists and bioinformaticians to help them, but those people didn’t always understand the biology at a deep enough level.”

The situation looked familiar to Wang.

“It was exactly like what we saw in finance, where researchers were trying to work with engineers, but the engineers never fully understood, and you had all this inefficiency with people waiting on the engineers,” Wang says. “Meanwhile, I learned the biologists are hungry to run these experiments, but there’s such a big gap that they felt they had to become a software engineer or just focus on the science.”

Wang officially founded Watershed in 2019 with physician Mark Kalinich ’13, a former classmate at MIT who is no longer involved in day-to-day operations of the company.

Wang has since heard from biotech and pharmaceutical executives about the growing complexity of biology research. Unlocking new insights increasingly involves analyzing data from entire genomes, population studies, RNA sequencing, mass spectrometry, and more. Developing personalized treatments or selecting patient populations for a clinical study can also require huge datasets, and new ways to analyze data are being published in scientific journals all the time.

Today, companies can run large-scale analyses on Watershed without having to set up their own servers or cloud computing accounts. Researchers can use ready-made templates that work with all of the most common data types to speed up their work. Popular AI-based tools like AlphaFold and Geneformer are also available, and Watershed’s platform makes sharing workflows and digging deeper into results easy.

“The platform hits a sweet spot of usability and customizability for people of all backgrounds,” Wang says. “No science is ever truly the same. I avoid the word product because that implies you deploy something and then you just run it at scale forever. Research isn’t like that. Research is about coming up with an idea, testing it, and using the outcome to come up with another idea. The faster you can design, implement, and execute experiments, the faster you can move on to the next one.”

Accelerating biology

Wang believes Watershed is helping biologists keep up with the latest advances in biology and accelerating scientific discovery in the process.

“If you can help scientists unlock insights not just a little bit faster, but 10 or 20 times faster, it can really make a difference,” Wang says.

Watershed is being used by researchers in academia and in companies of all sizes. Executives at biotech and pharmaceutical companies also use Watershed to make decisions about new experiments and drug candidates.

“We’ve seen success in all those areas, and the common thread is people understanding research but not being an expert in computer science or software engineering,” Wang says. “It’s exciting to see this industry develop. For me, it’s great being from MIT and now to be back in Kendall Square where Watershed is based. This is where so much of the cutting-edge progress is happening. We’re trying to do our part to enable the future of biology.”
