Helping machines understand visual content with AI


Data should drive every decision a modern business makes. But most businesses have an enormous blind spot: They don’t know what’s happening in their visual data.

Coactive is working to change that. The company, founded by Cody Coleman ’13, MEng ’15 and William Gaviria Rojas ’13, has created an artificial intelligence-powered platform that can make sense of data like images, audio, and video to unlock new insights.

Coactive’s platform can instantly search, organize, and analyze unstructured visual content to help businesses make faster, better decisions.

“In the first big data revolution, businesses got better at getting value out of their structured data,” Coleman says, referring to data from tables and spreadsheets. “But now, roughly 80 to 90 percent of the data in the world is unstructured. In the next chapter of big data, companies will have to process data like images, video, and audio at scale, and AI is a key piece of unlocking that capability.”

Coactive is already working with several large media and retail companies to help them understand their visual content without relying on manual sorting and tagging. That’s helping them get the right content to users faster, remove explicit content from their platforms, and uncover how specific content influences user behavior.

More broadly, the founders believe Coactive serves as an example of how AI can empower humans to work more efficiently and solve new problems.

“The word coactive means to work together simultaneously, and that’s our grand vision: helping humans and machines work together,” Coleman says. “We believe that vision is more important now than ever because AI can either pull us apart or bring us together. We want Coactive to be an agent that brings us together and gives human beings a new set of superpowers.”

Giving computers vision

Coleman met Gaviria Rojas in the summer before their first year through the MIT Interphase EDGE program. Both would go on to major in electrical engineering and computer science and to work on bringing MIT OpenCourseWare content to Mexican universities, among other projects.

“That was a great example of entrepreneurship,” Coleman recalls of the OpenCourseWare project. “It was really empowering to be in charge of the business and the software development. It led me to start my own small web-development businesses later on, and to take [the MIT course] Founder’s Journey.”

Coleman first explored the power of AI at MIT while working as a graduate researcher with the Office of Digital Learning (now MIT Open Learning), where he used machine learning to study how humans learn on MITx, which hosts massive open online courses created by MIT faculty and instructors.

“It was really amazing to me that you could democratize this transformational journey that I went through at MIT with digital learning, and that you could apply AI and machine learning to create adaptive systems that not only help us understand how humans learn, but also deliver more personalized learning experiences to people all over the world,” Coleman says of MITx. “That was also the first time I got to explore video content and apply AI to it.”

After MIT, Coleman went to Stanford University for his PhD, where he worked on lowering barriers to using AI. The research led him to work with companies like Pinterest and Meta on AI and machine-learning applications.

“That’s where I was able to see around the corner into the future of what people wanted to do with AI and their content,” Coleman recalls. “I was seeing how leading companies were using AI to drive business value, and that’s where the initial spark for Coactive came from. I thought, ‘What if we create an enterprise-grade operating system for content and multimodal AI to make that easy?’”

Meanwhile, Gaviria Rojas moved to the Bay Area in 2020 and began working as a data scientist at eBay. As part of the move, he needed help transporting his couch, and Coleman was the lucky friend he called.

“On the car ride, we realized we both saw an explosion happening around data and AI,” Gaviria Rojas says. “At MIT, we got a front-row seat to the big data revolution, and we saw people inventing technologies to unlock value from that data at scale. Cody and I realized we had another powder keg about to explode, with enterprises collecting tremendous amounts of data, but this time it was multimodal data like images, video, audio, and text. There was a missing technology to unlock it at scale. That was AI.”

The platform the founders went on to build, which Coleman describes as an “AI operating system,” is model agnostic, meaning the company can swap out the AI systems under the hood as models continue to improve. Coactive’s platform includes prebuilt applications that business customers can use to search through their content, generate metadata, and conduct analytics to extract insights.
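A model-agnostic design like the one described can be pictured as a thin interface separating the prebuilt applications from interchangeable model backends. The sketch below is purely illustrative: Coactive’s actual architecture and API are not public, and every class and method name here is invented for the example.

```python
from abc import ABC, abstractmethod


class VisionModel(ABC):
    """Interchangeable backend: any model that can describe an image."""

    @abstractmethod
    def tag(self, image_bytes: bytes) -> list[str]:
        ...


class StubModel(VisionModel):
    """Placeholder standing in for a real vision model."""

    def tag(self, image_bytes: bytes) -> list[str]:
        # A real backend would run inference and return content tags.
        return ["photo"]


class ContentPlatform:
    """Applications (search, metadata, analytics) depend only on the
    VisionModel interface, never on a specific backend."""

    def __init__(self, model: VisionModel) -> None:
        self.model = model

    def swap_model(self, model: VisionModel) -> None:
        # Upgrade the backend without touching the applications above it.
        self.model = model

    def generate_metadata(self, image_bytes: bytes) -> list[str]:
        return self.model.tag(image_bytes)


platform = ContentPlatform(StubModel())
print(platform.generate_metadata(b"\x89PNG..."))  # ['photo']
```

Because applications call only the interface, a better model can be dropped in via `swap_model` as models improve, which is the practical meaning of “model agnostic.”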

“Before AI, computers would see the world through bytes, whereas humans would see the world through vision,” Coleman says. “Now with AI, machines can finally see the world like we do, and that’s going to cause the digital and physical worlds to blur.”

Improving the human-computer interface

Reuters’ database of images supplies the world’s journalists with hundreds of thousands of photos. Before Coactive, the company relied on reporters manually entering tags with each photo so that the right images would show up when journalists searched for certain subjects.

“It was incredibly slow and expensive to go through all of those raw assets, so people just didn’t add tags,” Coleman says. “That meant when you searched for things, there were limited results even when relevant photos were in the database.”

Now, when journalists on Reuters’ website select ‘Enable AI Search,’ Coactive can pull up relevant content based on its AI system’s understanding of the details in each image and video.

“It’s vastly improving the quality of results for reporters, which enables them to tell better, more accurate stories than ever before,” Coleman says.

Reuters isn’t alone in struggling to manage all of its content. Digital asset management is a big component of many media and retail companies, which today often rely on manually entered metadata for sorting and searching through that content.

Another Coactive customer is Fandom, one of the world’s largest platforms for information about TV shows, video games, and movies, with more than 300 million monthly active users. Fandom is using Coactive to understand visual data in its online communities and to help remove excessive gore and sexualized content.

“It used to take 24 to 48 hours for Fandom to review each new piece of content,” Coleman says. “Now with Coactive, they’ve codified their community guidelines and can generate finer-grain information in an average of about 500 milliseconds.”

With every use case, the founders see Coactive as enabling a new paradigm in the way humans work with machines.

“Throughout the history of human-computer interaction, we’ve had to bend over a keyboard and mouse to input information in a way that machines could understand,” Coleman says. “Now, for the first time, we can just speak naturally, we can share images and video with AI, and it can understand that content. That’s a fundamental change in the way we think about human-computer interactions. The core vision of Coactive is that because of that change, we need a new operating system and a new way of working with content and AI.”
