Saket Saurabh, CEO and Co-Founder of Nexla – Interview Series

Saket Saurabh, CEO and Co-Founder of Nexla, is an entrepreneur with a deep passion for data and infrastructure. He’s leading the development of a next-generation, automated data engineering platform designed to bring scale and velocity to those working with data.

Previously, Saurabh founded a successful mobile startup that achieved significant milestones, including acquisition, IPO, and growth into a multi-million-dollar business. He also contributed to multiple innovative products and technologies during his tenure at Nvidia.

Nexla enables the automation of data engineering so that data can be ready-to-use. They achieve this through a unique approach called Nexsets – data products that make it easy for anyone to integrate, transform, deliver, and monitor data.

What inspired you to co-found Nexla, and how did your experiences in data engineering shape your vision for the company?

Prior to founding Nexla, I began my data engineering journey at Nvidia, building highly scalable, high-end technology on the compute side. After that, I took my previous startup through an acquisition and IPO journey in the mobile advertising space, where large amounts of data and machine learning were a core part of our offering, processing about 300 billion records of data daily.

Looking at the landscape in 2015 after my previous company went public, I was searching for the next big challenge that excited me. Coming from those two backgrounds, it was very clear to me that the data and compute challenges were converging as the industry was moving toward more advanced applications powered by data and AI.

While we didn’t know at the time that Generative AI (GenAI) would progress as rapidly as it has, it was obvious that machine learning and AI would be the foundation for making the most of data. So I began to think about what kind of infrastructure is needed for people to be successful in working with data, and how we can make it possible for anybody, not just engineers, to leverage data in their day-to-day professional lives.

That led to the vision for Nexla – to simplify and automate the engineering behind data, as data engineering was a very bespoke solution inside most companies, especially when dealing with complex or large-scale data problems. The goal was to make data accessible and approachable for a wider range of users, not just data engineers. My experiences in building scalable data systems and applications fueled this vision to democratize access to data through automation and simplification.

How do Nexsets exemplify Nexla’s mission to make data ready-to-use for everyone, and why is this innovation crucial for modern enterprises?

Nexsets exemplify Nexla’s mission to make data ready-to-use for everyone by addressing the core challenge of data. The 3Vs of data – volume, velocity, and variety – have been a persistent issue. The industry has made some progress in tackling challenges with volume and velocity. However, the variety of data has remained a significant hurdle because the proliferation of new systems and applications has led to an ever-increasing diversity in data structures and formats.

Nexla’s approach is to automatically model and connect data from diverse sources into a consistent, packaged entity, a data product that we call a Nexset. This allows users to access and work with data without having to understand the underlying complexity of the various data sources and structures. A Nexset acts as a gateway, providing a simple, straightforward interface to the data.

This is crucial for modern enterprises because it enables more people, not just data engineers, to leverage data in their day-to-day work. By abstracting away the variety and complexity of data, Nexsets make it possible for business users, analysts, and others to directly interact with the data they need, without requiring extensive technical expertise.

We also worked on making integration easy to use for less technical data consumers – from the user interface and how people collaborate and govern data to how they build transforms and workflows. Abstracting away the complexity of data variety is key to democratizing access to data and empowering a wider range of users to derive value from their information assets. This is a critical capability for modern enterprises seeking to become more data-driven and leverage data-powered insights across the organization.

What makes data “GenAI-ready,” and how does Nexla address these requirements effectively?

The answer partly depends on how you’re using GenAI. The vast majority of companies are implementing GenAI with Retrieval Augmented Generation (RAG). That requires first preparing and encoding data to load into a vector database, and then retrieving data via search to add to a prompt as context, as input to a Large Language Model (LLM) that hasn’t been trained on this data. So the data must be prepared in a way that works well both for vector searches and for LLMs.
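To make that flow concrete, here is a minimal sketch of the generic RAG pattern described above: encode documents into vectors, load them into a store, retrieve the closest matches for a question, and add them to the prompt as context for an LLM. This is an illustrative toy, not Nexla’s implementation; the hash-based embed() function stands in for a real embedding model, the in-memory list stands in for a vector database, and no actual LLM is called.

```python
# A toy, end-to-end sketch of the RAG flow described above: encode documents into
# vectors, "load" them into a store, retrieve the closest matches for a question,
# and add them to the prompt as context for an LLM. The hash-based embed() is a
# stand-in for a real embedding model; the in-memory list stands in for a vector
# database; no actual LLM is called.
import hashlib
import numpy as np

def embed(text, dim=64):
    """Toy embedding: hash each token into a fixed-size vector, then normalize."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

documents = [
    "Refunds are allowed within 30 days of purchase.",
    "Merchant onboarding requires a signed agreement.",
    "Support hours are 9am to 5pm Pacific time.",
]

# Prepare and encode the data, then load it into the (in-memory) vector store.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(question, k=2):
    """Return the k documents most similar to the question by cosine similarity."""
    q = embed(question)
    ranked = sorted(index, key=lambda pair: float(np.dot(q, pair[1])), reverse=True)
    return [doc for doc, _ in ranked[:k]]

question = "Can a customer get a refund after two weeks?"
context = "\n".join(retrieve(question))

# The retrieved context is prepended to the question before it is sent to the LLM.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```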

Regardless of whether you’re using RAG, Retrieval Augmented Fine-Tuning (RAFT), or doing model training, there are a few key requirements:

  • Data format: GenAI LLMs often work best with data in a specific format. The data must be structured in a way that the models can easily ingest and process. It must also be “chunked” in a way that helps the LLM make better use of the data (a small sketch of chunking follows this list).
  • Connectivity: GenAI LLMs must be able to dynamically access the relevant data sources, rather than relying on static data sets. This requires continual connectivity to the various enterprise systems and data repositories.
  • Security and governance: When using sensitive enterprise data, it is important to have robust security and governance controls in place. Data access and usage must be secure and compliant with existing organizational policies. You also need to govern the data used by LLMs to help prevent data breaches.
  • Scalability: GenAI LLMs can be data- and compute-intensive, so the underlying data infrastructure must be able to scale to meet the demands of these models.
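As a small illustration of the chunking requirement in the first bullet, the sketch below splits a long document into overlapping fixed-size pieces. The sizes are arbitrary placeholders, not recommendations.

```python
# A minimal sketch of the "chunking" requirement from the first bullet: split a
# long document into overlapping, fixed-size pieces so each piece fits within an
# embedding model's and LLM's context limits while keeping some continuity
# between neighboring chunks. The sizes here are illustrative placeholders.
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into chunks of roughly chunk_size characters with some overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():
            chunks.append(piece)
    return chunks

document = "Customer refund policy and related terms. " * 150  # stand-in for a long document
for i, chunk in enumerate(chunk_text(document)):
    print(f"chunk {i}: {len(chunk)} characters")
```

In practice, chunk boundaries are often aligned to sentences or sections, and chunk size is tuned to the embedding model and LLM being used.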

Nexla addresses these requirements for making data GenAI-ready in a few key ways:

  • Dynamic data access: Nexla’s data integration platform provides a single way to connect to hundreds of sources and uses various integration styles and data speeds, together with orchestration, to give GenAI LLMs the most recent data they need, when they need it, rather than relying on static data sets.
  • Data preparation: Nexla has the capability to extract, transform, and prepare data in formats optimized for each GenAI use case, including built-in data chunking and support for multiple encoding models.
  • Self-service and collaboration: With Nexla, data consumers not only access data on their own and build Nexsets and flows, they can also collaborate and share their work via a marketplace that ensures data is in the right format and improves productivity through reuse.
  • Auto generation: Integration and GenAI are both hard. Nexla auto-generates a lot of the steps needed based on choices by the data consumer – using AI and other techniques – so that users can do the work on their own.
  • Governance and security: Nexla incorporates robust security and governance controls throughout, including collaboration, to make sure that sensitive enterprise data is accessed and used in a secure and compliant manner.
  • Scalability: The Nexla platform is designed to scale to handle the demands of GenAI workloads, providing the necessary compute power and elastic scale.

Converged integration, self-service and collaboration, auto generation, and data governance must be built together to make data democratization possible.

How do diverse data types and sources contribute to the success of GenAI models, and what role does Nexla play in simplifying the integration process?

GenAI models need access to every kind of data to deliver the best insights and generate relevant outputs. If you don’t provide this information, you shouldn’t expect good results. It’s the same with people.

GenAI models need to be trained on a broad range of data, from structured databases to unstructured documents, to build a comprehensive understanding of the world. Different data sources, such as news articles, financial reports, and customer interactions, provide valuable contextual information that these models can leverage. Exposure to diverse data also allows GenAI models to become more flexible and adaptable, enabling them to handle a wider range of queries and tasks.

Nexla abstracts away the variety of all this data with Nexsets, and makes it easy to access nearly any source, then extract, transform, orchestrate, and load data so data consumers can focus just on the data, and on making it GenAI-ready.

What trends are shaping the data ecosystem in 2025 and beyond, particularly with the rise of GenAI?

Companies have mostly been focused on using GenAI to build assistants, or copilots, to help people find answers and make better decisions. Agentic AI, agents that automate tasks without people being involved, is certainly a growing trend as we move into 2025. Agents, just like copilots, need integration to make sure that data flows seamlessly – not only in one direction but also in enabling the AI to act on that data.

Another major trend for 2025 is the increasing complexity of AI systems. These systems are becoming more sophisticated by combining components from different sources to create cohesive solutions. It’s similar to how humans rely on various tools throughout the day to perform tasks. Empowered AI systems will follow this approach, orchestrating multiple tools and components. This orchestration presents a significant challenge but also a key area of development.

From a trends perspective, we’re seeing a push toward generative AI advancing beyond simple pattern matching to actual reasoning. There’s a lot of technological progress happening in this space. While these advancements may not fully translate into business value in 2025, they represent the direction we’re heading.

Another key trend is the increased application of accelerated technologies for AI inferencing, particularly with companies like Nvidia. Traditionally, GPUs have been heavily used for training AI models, but runtime inferencing – the point where the model is actively used – is becoming equally important. We can expect advancements in optimizing inferencing, making it more efficient and impactful.

Moreover, there’s a realization that the available training data has largely been maxed out. This means further improvements in models won’t come from adding more data during training but from how models operate during inferencing. At runtime, leveraging new information to improve model outcomes is becoming a critical focus.

While some exciting technologies begin to reach their limits, new approaches will continue to arise, ultimately highlighting the importance of agility for organizations adopting AI. What works well today could become obsolete within six months to a year, so be prepared to add or replace data sources and any components of your AI pipelines. Staying adaptable and open to change is critical to keeping up with the rapidly evolving landscape.

What strategies can organizations adopt to break down data silos and improve data flow across their systems?

First, people need to accept that data silos will always exist. This has always been the case. Many organizations try to centralize all their data in one place, believing it will create an ideal setup and unlock significant value, but this proves nearly impossible. It often turns into a lengthy, costly, multi-year endeavor, particularly for large enterprises.

So, the reality is that data silos are here to stay. Once we accept that, the question becomes: How can we work with data silos more efficiently?

A helpful analogy is to think about large companies. No major corporation operates from a single office where everyone works together globally. Instead, they split into headquarters and multiple offices. The goal isn’t to resist this natural division but to ensure those offices can collaborate effectively. That’s why we invest in productivity tools like Zoom or Slack – to connect people and enable seamless workflows across locations.

Similarly, data silos are fragmented systems that will always exist across teams, divisions, or other boundaries. The key isn’t to eliminate them but to make them work together smoothly. Knowing this, we can focus on technologies that facilitate these connections.

For instance, technologies like Nexsets provide a common interface or abstraction layer that works across diverse data sources. By acting as a gateway to data silos, they simplify the process of interoperating with data spread across various silos. This creates efficiencies and minimizes the negative impacts of silos.

In essence, the strategy should be about enhancing collaboration between silos rather than attempting to fight them. Many enterprises make the mistake of attempting to consolidate everything into a massive data lake. But, to be honest, that’s a nearly impossible battle to win.

How do modern data platforms handle challenges like speed and scalability, and what sets Nexla apart in addressing these issues?

The way I see it, many tools within the modern data stack were initially designed with a focus on ease of use and development speed, which came from making the tools more accessible – enabling marketing analysts to move their data from a marketing platform directly to a visualization tool, for instance. The evolution of these tools often involved the development of point solutions, or tools designed to solve specific, narrowly defined problems.

When we talk about scalability, people often think of scaling in terms of handling larger volumes of data. But the true challenge of scalability comes from two primary factors: the increasing number of people who need to work with data, and the growing number of systems and types of data that organizations need to manage.

Modern tools, being highly specialized, tend to solve only a small subset of these challenges. Consequently, organizations end up using multiple tools, each addressing a single problem, which eventually creates its own challenges, like tool overload and inefficiency.

Nexla addresses this issue by striking a careful balance between ease of use and flexibility. On one hand, we offer simplicity through features like templates and user-friendly interfaces. On the other hand, we provide flexibility and developer-friendly capabilities that allow teams to continuously enhance the platform. Developers can add new capabilities to the system, but these enhancements remain accessible as simple buttons and clicks for non-technical users. This approach avoids the trap of overly specialized tools while delivering a broad range of enterprise-grade functionality.

What truly sets Nexla apart is its ability to combine ease of use with the scalability and breadth required by organizations. Our platform connects these two worlds seamlessly, enabling teams to work efficiently without compromising on power or flexibility.

One of Nexla’s primary strengths lies in its abstracted architecture. For instance, while users can visually design a data pipeline, the way that pipeline executes is highly adaptable. Depending on the user’s requirements – such as the source, destination, or whether the data must be real-time – the platform automatically maps the pipeline to one of six different engines. This ensures optimal performance without requiring users to manage these complexities manually.

The platform is also loosely coupled, meaning that source systems and destination systems are decoupled. This allows users to easily add more destinations to existing sources, add more sources to existing destinations, and enable bi-directional integrations between systems.

Importantly, Nexla abstracts the design of pipelines so users can handle batch data, streaming data, and real-time data without changing their workflows or designs. The platform automatically adapts to these needs, making it easier for users to work with data in any format or at any speed. This is more about thoughtful design than programming language specifics, ensuring a seamless experience.

All of this illustrates that we built Nexla with the end consumer of data in mind. Many traditional tools were designed for those producing data or managing systems, but we focus on the needs of data consumers who want consistent, straightforward interfaces to access data, regardless of its source. Prioritizing the consumer’s experience enabled us to design a platform that simplifies access to data while maintaining the flexibility needed to support diverse use cases.

Can you share examples of how no-code and low-code features have transformed data engineering for your customers?

No-code and low-code features have transformed the data engineering process into a truly collaborative experience for users. For example, previously, DoorDash’s account operations team, which manages data for merchants, had to provide requirements to the engineering team. The engineers would then build solutions, leading to an iterative back-and-forth process that consumed a lot of time.

Now, with no-code and low-code tools, this dynamic has changed. The day-to-day operations team can use a low-code interface to handle their tasks directly. Meanwhile, the engineering team can quickly add new features and capabilities through the same low-code platform, enabling immediate updates. The operations team can then seamlessly use these features without delays.

This shift has turned the process into a collaborative effort rather than a bottleneck, resulting in significant time savings. Customers have reported that tasks that previously took two to three months can now be completed in under two weeks – a 5x to 10x improvement in speed.

How is the role of data engineering evolving, particularly with the increasing adoption of AI?

Data engineering is evolving rapidly, driven by automation and advancements like GenAI. Many elements of the field, such as code generation and connector creation, are becoming faster and more efficient. For instance, with GenAI, the pace at which connectors can be generated, tested, and deployed has drastically improved. But this progress also introduces new challenges, including increased complexity, security concerns, and the need for robust governance.

One pressing concern is the potential misuse of enterprise data. Businesses worry about their proprietary data inadvertently being used to train AI models, losing their competitive edge, or experiencing a data breach as the data is leaked to others. The growing complexity of systems and the sheer volume of data require data engineering teams to adopt a broader perspective, focusing on overarching system issues like security, governance, and ensuring data integrity. These challenges cannot simply be solved by AI.

While generative AI can automate lower-level tasks, the role of data engineering is shifting toward orchestrating the broader ecosystem. Data engineers now act more like conductors, managing numerous interconnected components and processes, like setting up safeguards to prevent errors or unauthorized access, ensuring compliance with governance standards, and monitoring how AI-generated outputs are used in business decisions.

Errors and mistakes in these systems can be costly. For instance, AI systems might pull outdated policy information, leading to incorrect responses, such as promising a refund to a customer when it isn’t allowed. These kinds of issues require rigorous oversight and well-defined processes to catch and address errors before they impact the business.

Another key responsibility for data engineering teams is adapting to the shift in user demographics. AI tools are no longer limited to analysts or technical users who can question the validity of reports and data. These tools are now used by individuals at the edges of the organization, such as customer support agents, who may not have the expertise to challenge incorrect outputs. This wider democratization of technology increases the responsibility of data engineering teams to ensure data accuracy and reliability.

What new features or advancements can be expected from Nexla as the field of data engineering continues to grow?

We’re focusing on several advancements to address emerging challenges and opportunities as data engineering continues to evolve. One of these is AI-driven solutions to handle data variety. One of the major challenges in data engineering is managing the variety of data from diverse sources, so we’re leveraging AI to streamline this process. For instance, when receiving data from hundreds of different merchants, the system can automatically map it into a common structure. Today, this process often requires significant human input, but Nexla’s AI-driven capabilities aim to minimize manual effort and enhance efficiency.
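As a toy illustration of that mapping problem (not Nexla’s actual mechanism), the sketch below normalizes records from two hypothetical merchants, each using different field names, into one common structure via hand-written mapping tables. Maintaining such mappings by hand is exactly the manual effort an AI-driven approach aims to minimize.

```python
# A toy illustration of the merchant-mapping problem described above (not Nexla's
# actual mechanism): records from different merchants use different field names,
# and each record is normalized into one common structure. The merchants, field
# names, and mapping tables are hypothetical.
CANONICAL_FIELDS = ["order_id", "amount", "currency"]

# Hand-written per-merchant mappings from source field names to canonical ones.
FIELD_MAPPINGS = {
    "merchant_a": {"OrderNo": "order_id", "Total": "amount", "Curr": "currency"},
    "merchant_b": {"id": "order_id", "order_total": "amount", "ccy": "currency"},
}

def normalize(record, merchant):
    """Map one merchant-specific record into the common structure."""
    mapping = FIELD_MAPPINGS[merchant]
    normalized = {mapping[key]: value for key, value in record.items() if key in mapping}
    missing = [field for field in CANONICAL_FIELDS if field not in normalized]
    if missing:
        raise ValueError(f"record is missing canonical fields: {missing}")
    return normalized

raw_records = [
    ({"OrderNo": "A-100", "Total": 42.5, "Curr": "USD"}, "merchant_a"),
    ({"id": "B-7", "order_total": 19.0, "ccy": "EUR"}, "merchant_b"),
]

# Both records come out with the same canonical field names.
for record, merchant in raw_records:
    print(normalize(record, merchant))
```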

We’re also advancing our connector technology to support the next generation of data workflows, including the ability to easily generate new agents. These agents enable seamless connections to new systems and allow users to perform specific actions within those systems. This is particularly aimed at the growing needs of GenAI users and making it easier to integrate and interact with a variety of platforms.

Third, we continue to innovate on improved monitoring and quality assurance. As more users consume data across various systems, the importance of monitoring and ensuring data quality has grown significantly. Our aim is to provide robust tools for system monitoring and quality assurance so data stays reliable and actionable even as usage scales.

Finally, Nexla is also taking steps to open-source some of our core capabilities. The idea is that by sharing our technology with the broader community, we can empower more people to benefit from advanced data engineering tools and solutions, which ultimately reflects our commitment to fostering innovation and collaboration within the field.
