Gentrace, a cutting-edge platform for testing and monitoring generative AI applications, has announced the successful completion of an $8 million Series A funding round led by Matrix Partners, with contributions from Headline and K9 Ventures. This funding milestone, which brings the corporate’s total funding to $14 million, coincides with the launch of its flagship tool, Experiments—an industry-first solution designed to make large language model (LLM) testing more accessible, collaborative, and efficient across organizations.
The worldwide push to integrate generative AI into diverse industries—from education to e-commerce—has created a critical need for tools that ensure AI systems are reliable, protected, and aligned with user needs. Nonetheless, most existing solutions are fragmented, heavily technical, and limited to engineering teams. Gentrace goals to dismantle these barriers with a platform that fosters cross-functional collaboration, enabling stakeholders from product managers to quality assurance (QA) specialists to play an energetic role in refining AI applications.
said Doug Safreno, CEO and co-founder of Gentrace.
Addressing the Challenges of Generative AI Development
Generative AI’s rise has been meteoric, but so have the challenges surrounding its deployment. Models like GPT (Generative Pre-trained Transformer) require extensive testing to validate their responses, discover errors, and ensure safety in real-world applications. In response to market analysts, the generative AI engineering sector is projected to grow to $38.7 billion by 2030, expanding at a compound annual growth rate (CAGR) of 34.2%. This growth underscores the urgent need for higher testing and monitoring tools.
Historically, AI testing has relied on manual workflows, spreadsheets, or engineering-centric platforms that fail to scale effectively for enterprise-level demands. These methods also create silos, stopping teams outside of engineering—similar to product managers or compliance officers—from actively contributing to evaluation processes. Gentrace’s platform addresses these issues through a three-pillar approach:
- Purpose-Built Testing Environments
Gentrace allows organizations to simulate real-world scenarios, enabling AI models to be evaluated under conditions that mirror actual usage. This ensures that developers can discover edge cases, safety concerns, and other risks before deployment. - Comprehensive Performance Analytics
Detailed insights into LLM performance, similar to success rates, error rates, and time-to-response metrics, allow teams to discover trends and constantly improve model quality. - Cross-Functional Collaboration Through Experiments
The newly launched Experiments tool enables product teams, material experts, and QA specialists to directly test and evaluate AI outputs while not having coding expertise. By supporting workflows that integrate with tools like OpenAI, Pinecone, and Rivet, Experiments ensures seamless adoption across organizations.
What Sets Gentrace Apart?
Gentrace’s Experiments tool is designed to democratize AI testing. Traditional tools often require technical expertise, leaving non-engineering teams out of critical evaluation processes. In contrast, Gentrace’s no-code interface allows users to check AI systems intuitively. Key features of Experiments include:
- Direct Testing of AI Outputs: Users can interact with LLM outputs directly inside the platform, making it easier to guage real-world performance.
- “What-If” Scenarios: Teams can anticipate potential failure modes by running hypothetical tests that simulate different input conditions or edge cases.
- Preview Deployment Results: Before deploying changes, teams can assess how updates will impact performance and stability.
- Support for Multimodal Outputs: Gentrace evaluates not only text-based outputs but in addition multimodal results, similar to image-to-text or video processing pipelines, making it a flexible tool for advanced AI applications.
These capabilities allow organizations to shift from reactive debugging to proactive development, ultimately reducing deployment risks and improving user satisfaction.
Impactful Results from Industry Leaders
Gentrace’s modern approach has already gained traction amongst early adopters, including Webflow, Quizlet, and a Fortune 100 retailer. These firms have reported transformative results:
- Quizlet: Increased testing throughput by 40x, reducing evaluation cycles from hours to lower than a minute.
- Webflow: Improved collaboration between engineering and product teams, enabling faster last-mile tuning of AI features.
said Bryant Chou, co-founder and chief architect at Webflow.
Madeline Gilbert, Staff Machine Learning Engineer at Quizlet, emphasized the platform’s flexibility:
A Visionary Founding Team
Gentrace’s leadership team combines expertise in AI, DevOps, and software infrastructure:
- Doug Safreno (CEO): Formerly co-founder of StacksWare, an enterprise observability platform acquired by VMware.
- Vivek Nair (CTO): Built scalable testing infrastructures at Uber and Dropbox.
- Daniel Liem (COO): Experienced in driving operational excellence at high-growth tech firms.
The team has also attracted advisors and angel investors from leading firms, including Figma, Linear, and Asana, further validating their mission and market position.
Scaling for the Future
With the newly raised funds, Gentrace plans to expand its engineering, product, and go-to-market teams to support growing enterprise demand. The event roadmap includes advanced features similar to threshold-based experimentation (automating the identification of performance thresholds) and auto-optimization (dynamically improving models based on evaluation data).
Moreover, Gentrace is committed to enhancing its compliance and security capabilities. The corporate recently achieved ISO 27001 certification, reflecting its dedication to safeguarding customer data.
Gentrace within the Broader AI Ecosystem
The platform’s recent updates highlight its commitment to continuous innovation:
- Local Evaluations and Datasets: Enables teams to make use of proprietary or sensitive data securely inside their very own infrastructure.
- Comparative Evaluators: Supports head-to-head testing to discover the best-performing model or pipeline.
- Production Monitoring: Provides real-time insights into how models perform post-deployment, helping teams spot issues before they escalate.
Partner Support and Market Validation
Matrix Partners’ Kojo Osei underscored the platform’s value:
Jett Fein, Partner at Headline, added:
Shaping the Way forward for Generative AI
As generative AI continues to redefine industries, tools like Gentrace will probably be essential in ensuring its protected and effective implementation. By enabling diverse teams to contribute to testing and development, Gentrace is fostering a culture of collaboration and accountability in AI.