Our Investment in Chroma — The Developer-Centric Embedding Database


VC Astasia Myers’ perspectives on machine learning, cloud infrastructure, developer tools, open source, and security.

While AI has a long history, we’re currently in an ML boom that began with the transition to deep learning around 2010. Since then we’ve experienced three phases of ML infrastructure. The first wave, from 2010 to 2015, was reserved for a select few in academia and at large publicly traded technology companies. The second wave, from roughly 2016 to 2020, led to the rise of ML infrastructure vendors that democratized access to tooling with products derived from the hyperscalers.

We’re currently in the third wave of ML infrastructure. Software engineers have become ML creators, massively increasing the number of practitioners, and foundation models have significantly lowered the barrier to adopting ML. Generative AI that can produce content, like ChatGPT, has taken the world by storm!

A critical layer in the AI stack is the embedding database, a database built from the ground up to support the ML workflow with embeddings. Embeddings are dense numerical representations of objects and relationships, expressed as vectors. The vector space quantifies the semantic similarity between the things it represents: embedding vectors that are close to each other are considered similar. Embeddings can be used to accurately represent unstructured data (such as images, video, and natural language) or structured data (such as clickstreams and e-commerce purchases).
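To make "close to each other" concrete, here is a minimal sketch that compares a few vectors with cosine similarity, one common closeness measure. The three-dimensional vectors are hand-picked for illustration and are not the output of any real embedding model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, chosen by hand so that related words sit close together.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "truck": [0.1, 0.2, 0.95],
}

sim_cat_kitten = cosine_similarity(embeddings["cat"], embeddings["kitten"])
sim_cat_truck = cosine_similarity(embeddings["cat"], embeddings["truck"])

# The related pair scores higher than the unrelated pair.
print(sim_cat_kitten > sim_cat_truck)
```

Real embeddings typically have hundreds or thousands of dimensions, but the comparison works the same way.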

While embedding databases originally arose to support use cases like personalization, recommendations, semantic search, and computer vision, generative AI is a big tailwind. Generative models are incredible, but each one is a centralized, generic knowledge base, which makes it hard to perform domain-specific or task-specific inference well. Embedding databases let teams use generative models with their own data through retrieval-augmented generation to build tailored, enhanced AI applications.
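The retrieval step of retrieval-augmented generation can be sketched roughly as follows: embed the question, find the most similar stored document, and place it in the prompt sent to the model. Everything here is an assumption for illustration. The `embed` function is a word-count stand-in for a learned embedding model, the documents are made up, and the call to a generative model is left out.

```python
import math
from collections import Counter


def embed(text):
    # Stand-in for a real embedding model: a sparse word-count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(
        documents, key=lambda doc: cosine(query_vector, embed(doc)), reverse=True
    )
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Put the retrieved context in front of the question."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical private data that a generic model would not know about.
documents = [
    "our refund policy allows returns within 30 days",
    "the office is closed on public holidays",
]

prompt = build_prompt("refund policy for returns", documents)
print(prompt)
```

In a real application the embedding database would take over the role of `retrieve`, and the resulting prompt would be passed to a generative model.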

During the third ML wave, we believe a new developer-centric AI stack will emerge to enable the creation of awe-inspiring applications. Winning ML solutions will focus on optimizing the end-user experience with ease of use, ergonomics, and performance in mind.

Jeff Huber and Anton Troynikov, who have direct AI experience from Facebook, Nuro, and Standard Cyborg, founded Chroma with the developer in mind. Using embeddings, Chroma lets developers add state and memory to their AI-enabled applications. Chroma comes “batteries included” with everything a developer needs to store, embed, and query data, with powerful features like filtering built in and more features like automatic clustering and query relevance coming soon. It has both Python and TypeScript APIs and native support for OpenAI and LangChain. Importantly, it runs both in an in-memory, embedded configuration (like DuckDB) and as a client-server version. A fully managed hosted version is coming soon.
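For a feel of what "store and query with filtering built in" means in an embedded, in-memory setting, here is a toy sketch. It is not Chroma's API: the `TinyEmbeddingStore` class, its method names, and the two-dimensional vectors are all hypothetical and exist only to illustrate the idea.

```python
import math


class TinyEmbeddingStore:
    """Hypothetical in-memory store: keeps vectors and metadata in a list."""

    def __init__(self):
        self._items = []

    def add(self, item_id, embedding, metadata):
        self._items.append((item_id, embedding, metadata))

    def query(self, embedding, n_results=1, where=None):
        """Return the ids of the nearest items whose metadata matches `where`."""
        where = where or {}
        candidates = [
            item
            for item in self._items
            if all(item[2].get(key) == value for key, value in where.items())
        ]
        # Rank the remaining items by Euclidean distance to the query vector.
        candidates.sort(key=lambda item: math.dist(embedding, item[1]))
        return [item[0] for item in candidates[:n_results]]


store = TinyEmbeddingStore()
store.add("doc1", [0.1, 0.9], {"source": "wiki"})
store.add("doc2", [0.2, 0.8], {"source": "blog"})
store.add("doc3", [0.9, 0.1], {"source": "wiki"})

nearest_overall = store.query([0.2, 0.8])
nearest_wiki = store.query([0.2, 0.8], where={"source": "wiki"})
print(nearest_overall, nearest_wiki)
```

A production embedding database would replace the linear scan with an index and persist data beyond the process, but the store-then-query-with-a-filter shape is the same.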

Developers really love it. Launched in February 2023, Chroma has been downloaded more than 35K times over the past month. It’s super easy to install, so try it out and get started!

Today, we’re honored to announce that Quiet Capital led Chroma’s $18M seed round. We’re joined by AIX Ventures, Bloomberg Beta, Nat Friedman and Daniel Gross (AI Grant), Naval Ravikant, Max and Jack Altman, Jordan Tigani (MotherDuck), Guillermo Rauch (Vercel), Akshay Kothari (Notion), Amjad Masad (Replit), and Spencer Kimball (CockroachDB), among others. We’re incredibly excited to support Jeff, Anton, and the entire team as they provide open AI infrastructure.

Join the Chroma community! You can follow Chroma on Twitter and join their Discord. They’re actively hiring, so check out opportunities here!

