Eric Landau is the CEO & Co-Founder of Encord, an active learning platform for computer vision. Eric was the lead quantitative researcher on a global equity delta-one desk, putting hundreds of models into production. Before Encord, he spent nearly a decade in high-frequency trading at DRW. He holds an S.M. in Applied Physics from Harvard University, and an M.S. in Electrical Engineering and a B.S. in Physics from Stanford University.
In his spare time, Eric enjoys tinkering with ChatGPT and large language models and making craft cocktails.
What inspired you to co-found Encord, and how did your experience in particle physics and quantitative finance shape your approach to solving the “data problem” in AI?
I first became fascinated by machine learning while working in particle physics and dealing with very large datasets during my time at the Stanford Linear Accelerator Center (SLAC). I was using software designed for physicists by physicists, which is to say there was a lot to be desired in terms of a pleasant user experience. With easier tools, I would have been able to run analyses much faster.
Later, working in quantitative finance at DRW, I was responsible for creating hundreds of models that were deployed into production. As with my experience in physics, I found that high-quality data was critical to building accurate models and that managing complex, large-scale data is difficult. Ulrik had a similar experience visualizing large image datasets for computer vision.
When I heard about his initial idea for Encord, I was immediately on board and understood its importance. Together, Ulrik and I saw a huge opportunity to build a platform to automate and streamline the AI data development process, making it easier for teams to get the best data into models and build trustworthy AI systems.
Can you elaborate on the vision behind Encord and how it compares to the early days of computing or the internet in terms of potential and challenges?
Encord’s vision is to be the foundational platform that enterprises rely on to transform their data into functional AI models. We’re the layer between a company’s data and its AI.
In many ways, AI mirrors previous paradigm shifts like personal computing and the internet in that it will become integral to workflows for every individual, business, nation, and industry. Unlike previous technological revolutions, which have been largely bottlenecked by Moore’s law of compounded computational growth of 30x every 10 years, AI development has benefited from simultaneous innovations. It is thus moving at a much faster pace. In the words of NVIDIA’s Jensen Huang: “For the very first time, we’re seeing compounded exponentials…We’re compounding at one million times every ten years. Not 100 times, not a thousand times, one million times.” Without hyperbole, we’re witnessing the fastest-moving technology in human history.
The potential here is vast: by automating and scaling the management of high-quality data for AI, we’re addressing a bottleneck that prevents broader AI adoption. The challenges are reminiscent of the early hurdles in previous technological eras: silos, lack of best practices, limitations for non-technical users, and a lack of well-defined abstractions.
Encord Index is positioned as a key tool for managing and curating AI data. How does it differentiate itself from other data management platforms currently available?
There are a few ways in which Encord Index stands out:
- Index is scalable: Allows users to manage billions, not hundreds of thousands, of data points. Other tools face scalability issues for unstructured data and are limited in consolidating all relevant data in an organization.
- Index is flexible: Integrates directly with private data storage and cloud storage providers such as AWS, GCP, and Azure. Unlike other tools that are limited to a single cloud provider or internal storage system, Index is agnostic to where the data is located. It lets teams manage data from many sources with appropriate governance and access controls, so they can develop secure and compliant AI applications.
- Index is multimodal: Supports multimodal AI, managing data in the form of images, videos, audio, text, documents, and more. Index is not limited to a single form of data like many LLM tools today. Human cognition is multimodal, and we believe multimodal AI will be at the heart of the next wave of AI advancements, which will supplant chatbots and LLMs.
In what ways does Encord Index enhance the process of selecting the right data for AI models, and what impact does this have on model performance?
Encord Index enhances data selection by automating the curation of large datasets, helping teams identify and retain only the most relevant data while removing uninformative or biased data. This process not only reduces the size of datasets but also significantly improves the quality of the data used for training AI models. Our customers have seen up to a 20% improvement in their models while achieving a 35% reduction in dataset size and saving hundreds of thousands of dollars in compute and human annotation costs.
With the rapid integration of cutting-edge technologies like Meta’s Segment Anything Model, how does Encord stay ahead in the fast-evolving AI landscape?
We intentionally built the platform to be able to adapt to new technologies quickly. We focus on providing a scalable, software-first approach that easily incorporates advancements like SAM, ensuring that our users are always equipped with the latest tools to stay competitive.
We plan to stay ahead by focusing on multimodal AI. The Encord platform can already manage complex data types such as images, videos, and text, so as more advancements in multimodal AI come our way, we’re ready.
What are the most common challenges companies face when managing AI data, and how does Encord help address these?
There are three main challenges companies face:
- Poor data organization and controls: As enterprises prepare to implement AI solutions, they are often met with the reality of siloed and unorganized data that is not AI-ready. This data often lacks strong governance, which prevents much of it from being used in AI systems.
- Lack of human experts: As AI models tackle increasingly complex problems, there will soon be a shortage of human domain experts to prepare and validate data. As a company’s AI demands increase, scaling that human workforce is difficult and expensive.
- Unscalable tooling: Performant AI models are very data-hungry in terms of the data needed for fine-tuning, validation, RAG, and other workflows. The previous generation of tools is not equipped to manage the volume and types of data required for today’s production-grade models.
Encord fixes these problems by automating the process of curating data at scale, making it easy to distinguish impactful data from problematic data and ensuring the creation of effective training and validation datasets. It uses a software-first approach that is easy to scale up or down as data management needs change. Our AI-assisted annotation tools empower human-in-the-loop domain experts to maximize workflow efficiency. This is especially crucial in industries such as financial services and healthcare, where AI trainers are costly. We make it easy to manage and understand all of an organization’s unstructured data, reducing the need for manual labor.
How does Encord tackle the problem of data bias and under-represented areas within datasets to ensure fair and balanced AI models?
Tackling data bias is a critical focus for us at Encord. Our platform automatically identifies and surfaces areas where data might be biased, allowing AI teams to address these issues before they impact model performance. We also ensure that under-represented areas within datasets are properly included, which helps in developing fairer and more balanced AI models. By using our curation tools, teams can be confident that their models are trained on diverse and representative data.
Encord recently secured $30 million in Series B funding. How will this funding accelerate your product roadmap and expansion plans?
The $30 million in Series B funding will be used to significantly increase the size of our product, engineering, and AI research teams over the next six months and accelerate the development of Encord Index and other new features. We’re also expanding our presence in San Francisco with a new office, and this funding will help us scale our operations to support our growing customer base.
As the youngest AI company from Y Combinator to raise a Series B, to what do you attribute Encord’s rapid growth and success?
One of the reasons we have been able to grow quickly is that we have adopted an extremely customer-centric focus in all areas of the company. We’re constantly communicating with customers, listening closely to their problems, and “bear hugging” them to get to solutions. By hyper-focusing on customer needs rather than hype, we’ve created a platform that resonates with top AI teams across various industries. Our customers have been instrumental in getting us to where we are today. Our ability to scale quickly and effectively manage the complexity of AI data has made us an attractive solution for enterprises.
We also owe much of our success to our teammates, partners, and investors, who’ve all worked tirelessly to champion Encord. Working with world-class product, engineering, and go-to-market teams has been enormously impactful in our growth.
Given the increasing importance of data in AI, how do you see the role of AI data platforms like Encord evolving in the next five years?
As AI applications grow in complexity, the need for efficient and scalable data management solutions will only increase. I believe that every enterprise will eventually have an AI department, much like how IT departments exist today. Encord will be the only platform they need to manage the vast amounts of data required for AI and get models to production quickly.