How the Generative AI Hype is Influencing the “Traditional” MLOps Stack


Unless you’ve been living under a rock since November 2022, you know that the hype surrounding generative AI has reached an all-time high. Products like ChatGPT exposed an enormous number of people to the power of large language models (LLMs) and inspired a new cohort of builders, many of whom do not have a background in machine learning, to create ML-powered applications on top of hosted foundation models.

The way most companies interact with foundation models is slightly different from the traditional ML workflow. Rather than creating the model and all of the pipelines needed to train it in-house, you simply access the model via an API. Instead of iterating on the model to get your desired result, you’re doing things like prompting the model, tuning your prompts, and feeding the model use-case-specific data. This has spurred a new batch of developer tools, all with the goal of making it easier to work with foundation models. Plenty of people have mused about this emerging “foundation model ops” (FMOps) stack (the folks at Unusual Ventures put out one of my favorite primers on the space), and early-stage VCs are placing bets on interesting new tools like LangChain, Dust, and others.
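To make the contrast concrete, here is a minimal sketch of that API-driven workflow. The endpoint URL, model name, and payload shape below are purely illustrative, not any specific vendor’s API; the point is that iteration happens on the prompt and request parameters rather than on model weights or training pipelines.

```python
import json

# Hypothetical hosted-model endpoint and model name (illustrative only).
API_URL = "https://api.example.com/v1/completions"
MODEL = "foundation-model-xl"

def build_request(prompt: str, temperature: float = 0.2) -> dict:
    """Assemble the JSON payload you would POST to a hosted foundation model.

    Note that everything you tune lives in this payload: the prompt text
    and decoding parameters, not the model itself.
    """
    return {
        "model": MODEL,
        "prompt": prompt,
        "temperature": temperature,
        "max_tokens": 256,
    }

payload = build_request("Summarize this support ticket: ...")
print(json.dumps(payload, indent=2))
```

In practice you would send this payload with any HTTP client and iterate on the prompt string until the responses fit your use case.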

These new picks and shovels are exciting, but some of the challenges that seem uniquely front-and-center when working with foundation models are actually much broader problems in ML that need to be addressed. Sure, Twitter might think you’re boring for talking about “data management” instead of “prompt engineering”, but they’re very similar issues at the end of the day. I would argue that building through the lens of the former rather than the latter will have a bigger impact on the ML ecosystem as a whole.

So, let’s zoom out from the generative AI hype and consider a few opportunities that may have been brought into the foreground by FMOps, but can be applied to all of machine learning:

A renewed emphasis on data

Foundation models, LLMs in particular, have demonstrated that the data you feed your model is just as important as the model itself. Foundation models are inherently generalist due to the sheer size and variety of the data (read: the entire Web) they’ve been trained on. In order to get a foundation model to work for a specialized use case, you need to repeatedly prompt it or fine-tune it with domain-specific data.
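One common way to feed a model use-case-specific data without touching its weights is to retrieve relevant domain snippets and prepend them to the prompt. The sketch below uses naive keyword overlap as the retrieval step; real systems use embeddings, but the shape of the workflow is the same. All function names and the toy documents are illustrative.

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank domain documents by naive keyword overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend the most relevant domain snippets to the user's question."""
    context = "\n".join(retrieve(query, docs, k=2))
    return f"Use the context below to answer.\n\nContext:\n{context}\n\nQuestion: {query}"

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Shipping is free for orders over $50.",
]
print(build_prompt("How long do refunds take?", docs))
```

The domain-specific data never changes the model; it only changes what the model sees at inference time, which is exactly why data management matters so much in this workflow.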

Data-centric AI is a topic I’ve written about previously, and one that has been gaining momentum lately. I expect the attention on foundation models will reinforce this trend and bring new innovation to data management tools for MLOps more broadly.

So far, data labeling has been the center of the data-centric AI universe, but innovation has been limited to incremental time savings over hand-labeling data. However, the data side of AI is about so much more than just labeling, think data curation, cleaning, and quality, and there are already standalone companies like Cleanlab pursuing this. While data labeling companies are in a great position to become the hub of the entire data side of AI (Snorkel in particular has leaned heavily into data-centric AI messaging lately), there are many other data infrastructure players with ambitions of winning this platform opportunity. Labeling companies will need to expand their focus beyond being the fastest, cheapest annotator in order to keep their lead.
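To illustrate what data quality work beyond labeling looks like, here is a toy self-confidence heuristic for flagging likely label errors: examples where a trained model assigns low probability to the recorded label. This is only a sketch of the idea; tools like Cleanlab implement far more robust versions of it (confident learning). The function name and threshold are my own illustration, not any library’s API.

```python
def find_suspect_labels(labels: list[int],
                        pred_probs: list[list[float]],
                        threshold: float = 0.5) -> list[int]:
    """Return indices where the model's probability for the given label is low.

    A low score suggests the recorded label may be wrong and is worth
    a human review pass.
    """
    return [
        i for i, (y, probs) in enumerate(zip(labels, pred_probs))
        if probs[y] < threshold
    ]

labels = [0, 1, 1]
pred_probs = [
    [0.9, 0.1],  # confident, agrees with label 0
    [0.8, 0.2],  # model disagrees with label 1 -> likely issue
    [0.3, 0.7],  # confident, agrees with label 1
]
print(find_suspect_labels(labels, pred_probs))  # -> [1]
```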

I also think companies with platform ambitions will need to help engineering organizations connect their data work and their model work, which are largely separate functions today. There are some early signs that this is happening. For example, data labeling company Encord launched an active learning toolkit in early 2023 to help connect the dots between model performance and data, including features like data quality and validation. Model management company Comet released an open source project called Kangas that helps users explore and analyze large-scale datasets. I expect we’ll see more crossover between DataOps and ModelOps in the near future.
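The core loop behind active learning toolkits like the one mentioned above can be sketched in a few lines: use model uncertainty on unlabeled data to decide which examples are worth a labeling budget. This is a generic uncertainty-sampling sketch under my own naming, not Encord’s implementation.

```python
def select_for_labeling(pred_probs: list[list[float]], budget: int = 2) -> list[int]:
    """Pick the examples the model is least confident about (uncertainty sampling).

    Labeling these first tends to improve the model faster than
    labeling a random sample.
    """
    confidences = [max(p) for p in pred_probs]
    ranked = sorted(range(len(pred_probs)), key=lambda i: confidences[i])
    return ranked[:budget]

unlabeled_probs = [
    [0.98, 0.02],  # very confident, low labeling value
    [0.55, 0.45],  # uncertain -> good labeling candidate
    [0.51, 0.49],  # most uncertain
    [0.90, 0.10],
]
print(select_for_labeling(unlabeled_probs))  # -> [2, 1]
```

This is exactly the kind of bridge between model performance (the probabilities) and data work (what to label next) that the DataOps/ModelOps crossover is about.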

Monitoring and validation will become mainstream

Model monitoring is not a new idea. Companies like Fiddler and Arize have built entire businesses around model monitoring, while other MLOps platforms have incorporated it into their products. However, the foundation model buzz seems to have put the spotlight on model monitoring once again. Perhaps it’s because there’s been so much public rhetoric about the shortcomings and black-box nature of foundation models. Or perhaps it’s because foundation models have drastically expanded the reach of ML and enabled a much larger (and often less sophisticated) audience to put ML-powered products into production. Either way, it stands to reason that the presence of more ML models in the wild will put greater emphasis on monitoring and validation in the coming years.

Today, model monitoring (which happens after a model is put into production) and model testing and validation (which happens during the model-building process) operate as somewhat separate markets. Model testing and validation has a strong open source presence (think Python libraries like Deepchecks and Evidently), while monitoring appears to be dominated by commercial players like the ones mentioned above. It seems likely that both disciplines will grow independently in the coming years, though I expect they’ll eventually begin to converge, either with each other or with other parts of the model-building stack, like experiment management.
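At the heart of most monitoring products is some form of drift detection: comparing the distribution a model sees in production against the distribution it was trained on. Below is a minimal sketch of one common metric, the Population Stability Index (PSI), with rule-of-thumb thresholds noted in the docstring. Binning choices and the toy data are my own simplifications.

```python
import math

def psi(expected: list[float], observed: list[float], bins: int = 4) -> float:
    """Population Stability Index between a reference and a production sample.

    Common rule of thumb: PSI < 0.1 is read as stable, > 0.25 as
    significant drift worth investigating.
    """
    lo = min(expected + observed)
    hi = max(expected + observed)

    def fractions(sample):
        counts = [0] * bins
        for x in sample:
            idx = min(int((x - lo) / (hi - lo) * bins), bins - 1) if hi > lo else 0
            counts[idx] += 1
        # Smooth to avoid log(0) on empty bins.
        return [(c + 1e-6) / (len(sample) + bins * 1e-6) for c in counts]

    e, o = fractions(expected), fractions(observed)
    return sum((oi - ei) * math.log(oi / ei) for ei, oi in zip(e, o))

train = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
prod_same = [0.15, 0.25, 0.35, 0.45, 0.55]
prod_shifted = [0.7, 0.8, 0.9, 1.0, 1.1]
print(psi(train, prod_same) < psi(train, prod_shifted))  # -> True
```

Commercial monitoring tools and open source validation libraries wrap metrics like this in alerting, dashboards, and test suites, which is precisely where the two markets overlap.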

Relatedly, responsible AI, which today is largely a concern of the Fortune 500 and regulated companies, is a growing problem space that should slowly but surely begin to command more budget. The emergence of LLMs provided a major boost in public awareness of this issue, with all too many examples of LLM-based conversational products going off the rails. The tooling here is still relatively nascent (products like Credo AI are especially strong in their ability to help technical and non-technical stakeholders collaborate on implementing policy), and it will probably take more regulation for this market to truly become mainstream (the data privacy market, which got a huge boost from GDPR, is a good analogy here). Nevertheless, the importance of fair and ethical ML is undoubtedly growing, not shrinking, and is universally applicable regardless of whether you’re using a foundation model.

A resurgence of end-to-end platforms

One thing that is unique about the FMOps stack is that it tends to operate at a higher level of abstraction compared to the traditional MLOps stack. As such, FMOps has the potential to revive the all-in-one ML suite, a model that historically didn’t take hold in “traditional” machine learning.

In the early days of MLOps, there were a handful of companies that tried to offer an end-to-end suite of tools for building, managing, and deploying models. This “jack of all trades, master of none” approach struggled to gain mainstream adoption, largely due to the sheer complexity of machine learning initiatives, especially once those initiatives started moving into production for business-critical use cases. As a result, the MLOps stack of today is very modular, often with disparate vendors offering data management, model experiment management, training, deployment, monitoring, and optimization.

There are some benefits to this, particularly at the more sophisticated end of the market. It’s hard to envision a future where these kinds of companies move away from a best-of-breed stack, regardless of whether they’re using foundation models or their own. However, less AI-native companies, who, in part because of foundation models and the increased availability of open source models, are building ML-powered applications for the first time, may be much better served by a bundled solution. The intricacies of model building are increasingly being abstracted away, and the market for an all-in-one solution is growing fast. I expect we’ll see more new entrants starting as all-in-one solutions from day one, as well as point solutions expanding much faster into other parts of the ML workflow.

It’s more than just hype

The introduction of ChatGPT was a funny moment for long-time ML practitioners. Despite these models having been around for years, it felt like the rest of the world finally woke up and realized the power of machine learning. Machine learning is truly a generational technical paradigm shift, as powerful as the introduction of the public Web or cloud computing. That was true before November 2022, and it’s still true today.

As an investor, I’m incredibly excited about the innovation in MLOps and new applications of machine intelligence. If you’re building something in this space, I’d love to hear from you. You can find me on LinkedIn or at

