
Salesforce launched a collection of monitoring tools on Thursday designed to solve what has become one of the thorniest problems in enterprise artificial intelligence: Once companies deploy AI agents to handle real customer interactions, they often do not know how those agents are making decisions.
The new capabilities, built into Salesforce's Agentforce 360 Platform, give organizations granular visibility into every action their AI agents take, every reasoning step they follow, and every guardrail they trigger. The move comes as businesses grapple with a fundamental tension in AI adoption — the technology promises massive efficiency gains, but executives remain wary of autonomous systems they can't fully understand or control.
"You can't scale what you can't see," said Adam Evans, executive vice president and general manager of Salesforce AI, in a statement announcing the release. The company says businesses have increased AI implementation by 282% recently, creating an urgent need for monitoring systems that can track fleets of AI agents making real-world business decisions.
The challenge Salesforce aims to address is deceptively simple: AI agents work, but nobody knows why. A customer service bot might successfully resolve a tax query or schedule an appointment, but the business deploying it can't trace the reasoning path that led to that outcome. When something goes wrong — or when the agent encounters an edge case — companies lack the diagnostic tools to understand what happened.
"Agentforce Observability acts as a mission control system to not only monitor, but also analyze and optimize agent performance," said Gary Lerhaupt, vice president of Salesforce AI who leads the company's observability work, in an exclusive interview with VentureBeat. He emphasized that the system delivers business-specific metrics that traditional monitoring tools miss. "In service, this could be engagement or deflection rate. In sales, it could be leads assigned, converted, or reply rates."
How AI monitoring tools helped 1-800Accountant and Reddit track autonomous agent decision-making
The stakes become clear in early customer deployments. Ryan Teeples, chief technology officer at 1-800Accountant, said his company deployed Agentforce agents to serve as a 24/7 digital workforce handling complex tax inquiries and appointment scheduling. The AI draws on integrated data from audit logs, customer service history, and sources like IRS publications to provide fast responses — without human intervention.
For a financial services firm handling sensitive tax information during peak season, the inability to see how the AI was making decisions would be a dealbreaker. "With this level of sensitive information and the fast pace at which we move during tax season specifically, Observability allows us to have full trust and transparency with every agent interaction in a single unified view," Teeples said.
The observability tools revealed insights Teeples didn't expect. "The optimization feature has been the most eye-opening for us — giving full observability into agent reasoning, identifying performance gaps and revealing how our agents are making decisions," he said. "This has helped us quickly diagnose issues that would've otherwise gone undetected and configure guardrails in response."
The business impact proved substantial. Agentforce resolved over 1,000 client engagements in the first 24 hours at 1-800Accountant. The company now projects it can support 40% client growth this year without recruiting and training seasonal staff, while freeing up 50% more time for CPAs to focus on complex advisory work rather than administrative tasks.
Reddit has seen similar results since deploying the technology. John Thompson, vice president of sales strategy and operations at the social media platform, said the company has deflected 46% of support cases since launching Agentforce for advertiser support. "By observing every Agentforce interaction, we can understand exactly how our AI navigates advertisers through even the most complex tools," Thompson said. "This insight helps us understand not only whether issues are resolved, but how decisions are made along the way."
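A deflection rate like Reddit's 46% figure is simple to compute once every interaction is logged. A minimal sketch of the idea, assuming case records carry an escalation flag (the field names here are hypothetical and not Salesforce's actual data model):

```python
# Hypothetical support-case records. Field names are illustrative
# and not drawn from Salesforce's actual schema.
cases = [
    {"id": "c1", "escalated_to_human": False},
    {"id": "c2", "escalated_to_human": True},
    {"id": "c3", "escalated_to_human": False},
    {"id": "c4", "escalated_to_human": False},
]

def deflection_rate(records):
    """Share of cases the agent resolved without escalating to a human."""
    if not records:
        return 0.0
    deflected = sum(1 for r in records if not r["escalated_to_human"])
    return deflected / len(records)

print(f"{deflection_rate(cases):.0%}")  # 3 of 4 cases deflected -> 75%
```

The same pattern extends to the sales-side metrics Lerhaupt mentions, such as leads converted or reply rates: each is a ratio over logged interaction records.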
Inside Salesforce's session tracing technology: Logging every AI agent interaction and reasoning step
Salesforce built the observability system on two foundational components. The Session Tracing Data Model logs every interaction — user inputs, agent responses, reasoning steps, language model calls, and guardrail checks — and stores them securely in Data 360, Salesforce's data platform. This creates what the company calls "unified visibility" into agent behavior at the session level.
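In generic terms, a session trace is an append-only log of typed events keyed by session ID, from which a reasoning path can be reconstructed after the fact. A minimal illustration of that shape (class and field names are assumptions for the sketch, not Salesforce's schema):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Event types mirroring what the article says gets logged: user inputs,
# agent responses, reasoning steps, LLM calls, and guardrail checks.
# These names are hypothetical, not Salesforce's Session Tracing Data Model.
EVENT_TYPES = {"user_input", "agent_response", "reasoning_step",
               "llm_call", "guardrail_check"}

@dataclass
class TraceEvent:
    event_type: str
    payload: dict
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class SessionTrace:
    session_id: str
    events: list = field(default_factory=list)

    def log(self, event_type, payload):
        # Append-only: events are never rewritten, only added.
        if event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event_type}")
        self.events.append(TraceEvent(event_type, payload))

    def reasoning_path(self):
        """Reconstruct the reasoning steps that led to an outcome."""
        return [e.payload for e in self.events
                if e.event_type == "reasoning_step"]
```

The point of the structure is the diagnostic question from earlier in the article: when an outcome looks wrong, `reasoning_path()` answers "why did the agent do that?" from the log alone.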
The second component, MuleSoft Agent Fabric, addresses a problem that will become more acute as companies build more AI systems: agent sprawl. The tool provides what Lerhaupt describes as "a single pane of glass across every agent," including those built outside the Salesforce ecosystem. Agent Fabric's Agent Visualizer creates a visual map of a company's entire agent network, giving visibility across all agent interactions from a single dashboard.
The observability tools break down into three functional areas. Agent Analytics tracks performance metrics, surfaces KPI trends over time, and highlights ineffective topics or actions. Agent Optimization provides end-to-end visibility of each interaction, groups similar requests to uncover patterns, and identifies configuration issues. Agent Health Monitoring, which will become generally available in Spring 2026, tracks key health metrics in near real-time and sends alerts on critical errors and latency spikes.
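Latency-spike alerting of the kind Agent Health Monitoring promises is commonly implemented as a rolling window over recent requests. A sketch under that assumption (window size, threshold, and class name are all illustrative; Salesforce's actual alerting logic is not public):

```python
from collections import deque

class LatencyMonitor:
    """Alert when average latency over the last `window` requests
    exceeds a threshold. Illustrative only; not Salesforce's
    Agent Health Monitoring implementation."""

    def __init__(self, window=50, threshold_ms=2000):
        self.samples = deque(maxlen=window)  # old samples drop off
        self.threshold_ms = threshold_ms

    def record(self, latency_ms):
        """Record one request's latency; return True if an alert fires."""
        self.samples.append(latency_ms)
        return self.rolling_avg() > self.threshold_ms

    def rolling_avg(self):
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)
```

A windowed average rather than a per-request check keeps one slow outlier from paging anyone, while a sustained slowdown still trips the alert within a window's worth of traffic.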
Pierre Matuchet, senior vice president of IT and digital transformation at Adecco, said the visibility helped his team build confidence even before full deployment. "Even during early notebook testing, we saw the agent handle unexpected scenarios, like when candidates didn't want to answer questions already covered in their CVs, appropriately and as designed," Matuchet said. "Agentforce Observability helped us identify unanticipated user behavior and gave us confidence, even before the agent went live, that it would act responsibly and reliably."
Why Salesforce says its AI observability tools beat Microsoft, Google, and AWS monitoring
The announcement puts Salesforce in direct competition with Microsoft, Google, and Amazon Web Services, all of which offer monitoring capabilities built into their AI agent platforms. Lerhaupt argued that enterprises need more than the basic monitoring those providers offer.
"Observability comes out-of-the-box standard with Agentforce at no extra cost," Lerhaupt said, positioning the offering as comprehensive rather than supplementary. He emphasized that the tools provide "deeper insight than ever before" by capturing "the full telemetry and reasoning behind every agentic interaction" through the Session Tracing Data Model, then using that data to "provide key analysis and session quality scoring to help customers optimize and improve their agents."
The competitive positioning matters because enterprises face a choice: build their AI infrastructure on a cloud provider's platform and use its native monitoring tools, or adopt a specialized observability layer like Salesforce's. Lerhaupt framed the decision as one of depth versus breadth. "Enterprises need more than basic monitoring to measure the success of their AI deployments," he said. "They need full visibility into every agent interaction and decision."
The 1.2 billion workflow question: Are AI agent deployments moving from pilot projects to production?
The broader question is whether Salesforce is solving a problem most enterprises will face imminently or building for a future that remains years away. The company's 282% surge in AI implementation sounds dramatic, but that figure doesn't distinguish between production deployments and pilot projects.
When asked about this directly, Lerhaupt pointed to customer examples rather than offering a breakdown. He described a three-phase journey from experimentation to scale. "On Day 0, trust is the foundation," he said, citing 1-800Accountant's 70% autonomous resolution of chat engagements. "Day 1 is where design ideas become real, usable AI," with Williams Sonoma delivering more than 150,000 AI experiences monthly. "On Day 2, once trust and design are built, it becomes about scaling early wins into enterprise-wide outcomes," pointing to Falabella's 600,000 AI workflows monthly, which have grown fourfold in three months.
Lerhaupt said Salesforce has 12,000-plus customers across 39 countries running Agentforce, powering 1.2 billion agentic workflows. Those numbers suggest the shift from pilot to production is already underway at scale, though the company didn't provide a breakdown of how many customers are running production workloads versus experimental deployments.
The economics of AI deployment may accelerate adoption regardless of readiness. Companies face mounting pressure to reduce headcount costs while maintaining or improving service levels. AI agents promise to resolve that tension, but only if businesses can trust them to work reliably. Observability tools like Salesforce's represent the trust layer that makes scaled deployment possible.
What happens after AI agent deployment: Why continuous monitoring matters more than initial testing
The deeper story is about a shift in how enterprises think about AI deployment. The official announcement framed this clearly: "The agent development lifecycle begins with three foundational steps: build, test, and deploy. While many organizations have already moved past the initial hurdle of creating their first agents, the true enterprise challenge starts immediately after deployment."
That framing reflects a maturing understanding of AI in production environments. Early AI deployments often treated the technology as a one-time implementation — build it, test it, ship it. But AI agents behave differently than traditional software. They learn, adapt, and make decisions based on probabilistic models rather than deterministic code. That means their behavior can drift over time, or they can develop unexpected failure modes that only emerge under real-world conditions.
"Building an agent is only the beginning," Lerhaupt said. "Once the trust is built for agents to start handling real work, companies may start by seeing the results, but may not understand the 'why' behind them or see areas to optimize. Customers interact with products—including agents—in unexpected ways, and to optimize the customer experience, transparency around agent behavior and outcomes is critical."
Teeples made the same point more bluntly when asked what would be different without observability tools. "This level of visibility has given full trust in continuing to expand our agent deployment," he said. The implication is clear: without visibility, deployment would slow or stop. 1-800Accountant plans to expand Slack integrations for internal workflows, deploy Service Cloud Voice for case deflection, and leverage Tableau for conversational analytics—all depending on the confidence that observability provides.
How enterprise AI trust issues became the biggest barrier to scaling autonomous agents
The recurring theme in customer interviews is trust, or rather, the lack of it. AI agents work, sometimes spectacularly well, but executives don't trust them enough to deploy them widely. Observability tools aim to convert black-box systems into transparent ones, replacing faith with evidence.
This matters because trust is the bottleneck constraining AI adoption, not technological capability. The models are powerful enough, the infrastructure is mature enough, and the business case is compelling enough. What's missing is executive confidence that AI agents will behave predictably and that problems can be diagnosed and fixed quickly when they arise.
Salesforce is betting that observability tools can remove that bottleneck. The company positions Agentforce Observability not as a monitoring tool but as a management layer—"just as managers work with their human employees to ensure they're working toward the right objectives and optimizing performance," Lerhaupt said.
The analogy is telling. If AI agents are becoming digital employees, they need the same kind of ongoing supervision, feedback, and optimization that human employees receive. The difference is that AI agents can be monitored with far more granularity than any human employee. Every decision, every reasoning step, every data point consulted can be logged, analyzed, and scored.
That creates both opportunity and obligation. The opportunity is continuous improvement at a pace impossible with human staff. The obligation is to actually use that data to optimize agent performance, not just collect it. Whether enterprises can build the organizational processes to turn observability data into systematic improvement remains an open question.
But one thing has become increasingly clear in the race to deploy AI at scale: Companies that can see what their agents are doing will move faster than those flying blind. In the emerging era of autonomous AI, observability isn't just a nice-to-have feature. It's the difference between cautious experimentation and confident deployment—between treating AI as a risky bet and managing it as a trusted workforce. The question is no longer whether AI agents can work. It's whether businesses can see well enough to let them.
