
Gamifying medical data labeling to advance AI


When Erik Duhaime PhD ’19 was working on his thesis in MIT’s Center for Collective Intelligence, he noticed his wife, then a medical student, spending hours studying on apps that offered flash cards and quizzes. His research had shown that, as a group, medical students could classify skin lesions more accurately than skilled dermatologists; the trick was to repeatedly measure each student’s performance on cases with known answers, throw out the opinions of people who were bad at the task, and intelligently pool the opinions of those who were good.
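In rough terms, that quality-control loop can be sketched in a few lines of Python. The example below is a minimal illustration, not Centaur's actual code; the data, the accuracy threshold, and the function names are all hypothetical. It scores each labeler against cases with known answers, drops the low scorers, and takes a majority vote among the rest.

from collections import Counter

def accuracy_on_gold(labels, gold):
    # Fraction of the known-answer ("gold standard") cases this labeler got right.
    return sum(labels.get(case) == answer for case, answer in gold.items()) / len(gold)

def pooled_opinion(all_labels, gold, case, min_accuracy=0.8):
    # Majority vote over labelers who pass the gold-standard filter.
    votes = [labels[case]
             for labels in all_labels.values()
             if case in labels and accuracy_on_gold(labels, gold) >= min_accuracy]
    return Counter(votes).most_common(1)[0][0] if votes else None

# Hypothetical opinions: labeler -> {case_id: label}
all_labels = {
    "student_a": {"gold1": "melanoma", "gold2": "benign", "new1": "melanoma"},
    "student_b": {"gold1": "benign",   "gold2": "benign", "new1": "benign"},
    "student_c": {"gold1": "melanoma", "gold2": "benign", "new1": "melanoma"},
}
gold = {"gold1": "melanoma", "gold2": "benign"}
print(pooled_opinion(all_labels, gold, "new1"))  # prints "melanoma"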

Combining his wife’s studying habits with his research, Duhaime founded Centaur Labs, a company that created a mobile app called DiagnosUs to collect the opinions of medical experts on real-world scientific and biomedical data. Through the app, users review everything from images of potentially cancerous skin lesions to audio clips of heart and lung sounds that could indicate a problem. If the users are accurate, Centaur uses their opinions and awards them small cash prizes. Those opinions, in turn, help medical AI companies train and improve their algorithms.

The approach combines medical professionals’ need to hone their skills with the pressing need for well-labeled medical data among companies using AI in biotech, pharmaceutical development, and medical devices.

“I realized my wife’s studying could be productive work for AI developers,” Duhaime recalls. “Today we have tens of thousands of people using our app, and about half are medical students who are blown away that they can win money in the process of studying. So we have this gamified platform where people compete with one another to label data, winning money if they’re good and improving their skills at the same time. And by doing that, they’re labeling data for teams building lifesaving AI.”

Gamifying medical labeling

Duhaime completed his PhD under Thomas Malone, the Patrick J. McGovern Professor of Management and founding director of the Center for Collective Intelligence.

“What interested me was the wisdom of crowds phenomenon,” Duhaime says. “Ask a bunch of people how many jelly beans are in a jar, and the average of everyone’s answers is pretty close. I was interested in how you navigate that problem in a task that requires skill or expertise. Obviously you don’t just want to ask a bunch of random people whether you have cancer, but at the same time, we know that second opinions in health care can be extremely valuable. You can think of our platform as a supercharged way of getting a second opinion.”

Duhaime began exploring ways to leverage collective intelligence to improve medical diagnoses. In one experiment, he trained groups of lay people and medical school students, whom he describes as “semiexperts,” to classify skin conditions, finding that by combining the opinions of the highest performers he could outperform skilled dermatologists. He also found that combining algorithms trained to detect skin cancer with the opinions of experts could outperform either method on its own.

“The core insight was that you do two things,” Duhaime explains. “The first thing is to measure people’s performance, which sounds obvious, but even in the medical domain it isn’t done much. If you ask a dermatologist whether they’re good, they say, ‘Yeah, of course, I’m a dermatologist.’ They don’t necessarily know how good they are at specific tasks. The second thing is that when you get multiple opinions, you need to identify complementarities between different people. You need to recognize that expertise is multidimensional, so it’s a little more like putting together the optimal trivia team than it is getting the five people who are all the best at the same thing. For instance, one dermatologist might be better at identifying melanoma, whereas another might be better at classifying the severity of psoriasis.”
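One way to picture that second step is to weight each opinion by how well the labeler has performed on that particular kind of case, rather than by a single overall score. The sketch below is again purely illustrative (the names and accuracy figures are made up, not drawn from Centaur’s system), but it shows how a vote from a dermatologist with a strong melanoma track record can count for more on a melanoma case than a vote from a colleague whose strength lies elsewhere.

from collections import defaultdict

def weighted_vote(votes, skill, category):
    # votes: {labeler: label}; skill: {labeler: {category: measured accuracy}}.
    # Each vote is weighted by the labeler's track record on this category of case.
    tally = defaultdict(float)
    for labeler, label in votes.items():
        tally[label] += skill.get(labeler, {}).get(category, 0.0)
    return max(tally, key=tally.get) if tally else None

# Hypothetical per-category track records
skill = {
    "derm_1": {"melanoma": 0.95, "psoriasis_severity": 0.60},
    "derm_2": {"melanoma": 0.70, "psoriasis_severity": 0.92},
}
votes = {"derm_1": "melanoma", "derm_2": "benign nevus"}
print(weighted_vote(votes, skill, "melanoma"))  # derm_1's stronger melanoma record wins: "melanoma"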

While still pursuing his PhD, Duhaime founded Centaur and began using MIT’s entrepreneurial ecosystem to further develop the concept. He received funding from MIT’s Sandbox Innovation Fund in 2017 and took part in the delta v startup accelerator run by the Martin Trust Center for MIT Entrepreneurship over the summer of 2018. The experience helped him get into the prestigious Y Combinator accelerator later that year.

The DiagnosUs app, which Duhaime developed with Centaur co-founders Zach Rausnitz and Tom Gellatly, is designed to help users test and improve their skills. Duhaime says about half of users are medical school students and the other half are mostly doctors, nurses, and other medical professionals.

“It’s better than studying for exams, where you have multiple-choice questions,” Duhaime says. “They get to see actual cases and practice.”

Centaur gathers hundreds of thousands of opinions each week from tens of thousands of people around the globe. Duhaime says most people earn coffee money, although the person who has earned the most from the platform is a physician in eastern Europe who has made around $10,000.

“People can do it on the couch; they can do it on the T,” Duhaime says. “It doesn’t feel like work. It’s fun.”

The approach stands in sharp contrast to traditional data labeling and AI content moderation, which are typically outsourced to low-resource countries.

Centaur’s approach produces accurate results, too. In a paper with researchers from Brigham and Women’s Hospital, Massachusetts General Hospital (MGH), and Eindhoven University of Technology, Centaur showed its crowdsourced opinions labeled lung ultrasounds as reliably as experts did. Another study with researchers at Memorial Sloan Kettering showed that crowdsourced labeling of dermoscopic images was more accurate than that of highly experienced dermatologists. Beyond images, Centaur’s platform also works with video, audio, text from sources like research papers or anonymized conversations between doctors and patients, and waveforms from electroencephalograms (EEGs) and electrocardiograms (ECGs).

Finding the experts

Centaur has found that the best performers come from surprising places. In 2021, to gather expert opinions on EEG patterns, researchers held a contest through the DiagnosUs app at a conference featuring about 50 epileptologists, each with more than 10 years of experience. The organizers made a custom shirt to give to the contest’s winner, whom they assumed would be in attendance at the conference.

But when the results came in, a pair of medical students in Ghana, Jeffery Danquah and Andrews Gyabaah, had beaten everyone in attendance. The highest-ranked conference attendee had come in ninth.

“I started out doing it for the money, but I realized it actually started helping me quite a bit,” Gyabaah told Centaur’s team later. “There were times in the clinic where I realized I was doing better than others because of what I’d learned on the DiagnosUs app.”

As AI continues to change the nature of work, Duhaime believes Centaur Labs will serve as an ongoing check on AI models.

“Right now, we’re primarily helping people train algorithms, but increasingly I think we’ll be used for monitoring algorithms and working in conjunction with algorithms, basically serving as the humans in the loop for a range of tasks,” Duhaime says. “You can think of us less as a way to train AI and more as part of the full life cycle, where we’re providing feedback on models’ outputs or monitoring the model.”

Duhaime sees the work of humans and AI algorithms becoming increasingly integrated and believes Centaur Labs has an important role to play in that future.

“It’s not just train the algorithm, then deploy the algorithm,” Duhaime says. “Instead, there will be these digital assembly lines all throughout the economy, and you need on-demand expert human judgment infused at different points along the value chain.”
