I’ve kept returning to the same question: if cutting-edge foundation models are widely accessible, where can durable competitive advantage with AI actually come from?
Today, I’d like to zoom in on context engineering — the discipline of dynamically filling the context window of an AI model with the information that maximizes its chances of success. Context engineering lets you encode your existing expertise and domain knowledge and pass it into an AI system, and I believe it’s a crucial component of strategic differentiation. If you have both unique domain expertise and know how to make it usable to your AI systems, you’ll be hard to beat.
In this article, I’ll summarize the components of context engineering as well as the best practices that have established themselves over the past year. One of the critical factors for success is close collaboration between domain experts and engineers. Domain experts are needed to encode domain knowledge and workflows, while engineers are responsible for knowledge representation, orchestration, and dynamic context construction. In what follows, I try to explain context engineering in a way that is useful to both domain experts and engineers. Thus, we won’t dive into technical topics like context compacting and compression.
For now, let’s assume our AI system has an abstract component — the context builder — which assembles the most effective context for each user interaction. The context builder sits between the user request and the language model executing the request. You can think of it as an intelligent function that takes the current user query, retrieves the most relevant information from external resources, and assembles the optimal context for it. After the model produces an output, the context builder may store new information, like user edits and feedback. In this way, the system accumulates continuity and experience over time.
Conceptually, the context builder must manage three distinct resources:
- Knowledge about the domain and specific tasks turns a generic AI system into a domain expert.
- Tools allow the agent to act in the real world.
- Memory allows the agent to personalize its actions and learn from user feedback.
As the system matures, you will also find more and more interesting interdependencies between these three components, which can be addressed with proper orchestration.
Let’s dive in and examine these components one by one. We’ll illustrate them with the example of an AI system that supports RevOps tasks such as weekly forecasts.
Knowledge
As you start designing your system, you talk to the Head of RevOps to understand how forecasting is currently done.
LLMs bring extensive general knowledge from pre-training. They understand what a sales pipeline is and know common forecasting methods. However, they are not aware of your organization’s specifics, such as:
- Historical close rates by stage and segment
- Average time-in-stage benchmarks
- Seasonality patterns from comparable quarters
- Pricing and discount policies
- Current revenue targets
- Definitions of pipeline stages and probability logic
Without this information, users will have to manually adjust the system’s outputs. They’ll explain that enterprise deals slip more often in Q4, correct expansion assumptions, and remind the model that discount approvals are currently delayed. Soon, they may conclude that the AI system is interesting in itself, but not viable for their day-to-day work.
Let’s look at patterns that allow you to integrate an AI model with company-specific knowledge. We’ll start with RAG (Retrieval-Augmented Generation) as the baseline and progress towards more structured representations of knowledge.
RAG
In Retrieval-Augmented Generation (RAG), company- and domain-specific knowledge is broken into manageable chunks (refer to this article for an overview of chunking methods). Each chunk is converted into a text embedding and stored in a database. Text embeddings represent the meaning of a text as a numerical vector. Semantically similar texts are neighbours in the embedding space, so the system can retrieve “relevant” information through similarity search.
Now, when a forecasting request arrives, the system retrieves the most similar text chunks and includes them in the prompt:
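Conceptually, the retrieval step can be sketched in a few lines. The following is a toy example: the hand-written three-dimensional vectors and chunk texts stand in for real embeddings, which would come from an embedding model and live in a vector database.

```python
from math import sqrt

# Toy knowledge base: chunk text -> placeholder embedding.
# Real embeddings have hundreds of dimensions and come from an embedding model.
CHUNKS = {
    "Enterprise deals in Q4 slip more often than in other quarters.": [0.9, 0.1, 0.2],
    "Discount approvals above 15% require VP sign-off.": [0.1, 0.8, 0.3],
    "Stage 4 opportunities close at a 55% historical rate.": [0.7, 0.2, 0.6],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def retrieve(query_vec, k=2):
    """Return the k chunks most similar to the query embedding."""
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, CHUNKS[c]), reverse=True)
    return ranked[:k]

# A query embedding that happens to sit near the Q4 slippage chunk:
context = retrieve([0.85, 0.15, 0.3])
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: What is our Q4 forecast risk?"
```

The mechanics — embed, rank by similarity, inject the top hits into the prompt — are the same regardless of which embedding model and vector store you choose.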

Conceptually, this is elegant, and every self-respecting B2B AI team has a RAG initiative underway. However, most prototypes and MVPs struggle with adoption. The naive version of RAG makes several oversimplifying assumptions about the nature of enterprise knowledge. It uses isolated text fragments as a source of truth. It assumes that documents are internally consistent. It also strips the complex empirical concept of relevance down to similarity, which is much handier from a computational standpoint.
In reality, text data in its raw form provides a confusing context to AI models. Documents get outdated, policies evolve, metrics are tweaked, and business logic may be documented differently across teams. If you want forecasting outputs that leadership can trust, you need a more intentional knowledge representation.
Articulating knowledge through graphs
Many teams dump their available data into an embedding database without knowing what’s inside. This is a sure recipe for failure. You need to know the semantics of your data. Your knowledge representation should reflect the core objects, processes, and KPIs of the business in a way that is interpretable both by humans and by machines. For humans, this ensures maintainability and governance. For AI systems, it ensures retrievability and proper usage. The model must not only access information, but also understand which source is appropriate for which task.
Graphs are a promising approach because they allow you to structure knowledge while preserving flexibility. Instead of treating knowledge as an archive of loosely connected documents, you model the core objects of your business and the relationships between them.
Depending on what you need to encode, here are some graph types to consider:
- Taxonomies or ontologies that define core business objects — deals, segments, accounts, reps — along with their properties and relationships
- Canonical knowledge graphs that capture more complex, non-hierarchical dependencies
- Context graphs that record past decision traces and allow retrieval of precedents
Graphs are powerful as a representation layer, and RAG variants such as GraphRAG provide a blueprint for their integration. However, graphs don’t grow on trees. They require an intentional design effort — you need to decide what the graph encodes, how it is maintained, and which parts are exposed to the model in a given reasoning cycle. Ideally, you don’t view this as a one-off investment, but turn it into a continuous effort where human users collaborate with the AI system in parallel to their daily work. This allows you to build up its knowledge while engaging users and supporting adoption.
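To make this concrete, here is a minimal sketch of a graph-shaped knowledge representation built from plain dictionaries. The node types, properties, and edge names are invented for illustration — a real system would use a graph database or an ontology tool.

```python
# Hypothetical mini-graph of core RevOps objects.
# Each node carries typed properties plus named edges to other nodes.
GRAPH = {
    "deal:acme-renewal": {
        "type": "deal", "stage": "negotiation", "amount": 120_000,
        "edges": {"belongs_to": "segment:enterprise", "owned_by": "rep:jane"},
    },
    "segment:enterprise": {
        "type": "segment", "historical_close_rate": 0.32,
        "edges": {},
    },
    "rep:jane": {"type": "rep", "quota": 900_000, "edges": {}},
}

def neighborhood(node_id):
    """Collect a node plus its direct neighbors -- the subgraph the context
    builder would serialize into the prompt for a question about this deal."""
    node = GRAPH[node_id]
    related = {rel: GRAPH[target] for rel, target in node["edges"].items()}
    return {"node": node, "related": related}

ctx = neighborhood("deal:acme-renewal")
```

The point of the structure: a question about the Acme renewal automatically pulls in the enterprise segment’s historical close rate via the `belongs_to` edge — something a pure similarity search over documents may easily miss.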
Tools
Forecasting is not just analytical, but also operational and interactive.
To support this workflow, the AI system needs to move beyond reading and generating text. It must be able to interact with the digital systems where the business actually runs. Tools provide this capability.
Tools make your system agentic — i.e., able to act in the real world. In the RevOps setting, tools might include:
- CRM pipeline retrieval (pull open opportunities with stage, amount, close date, owner, and forecast category)
- Forecast rollup calculation (apply company-specific probability and override logic to compute commit, best case, and total pipeline)
- Variance and risk analysis (compare the current forecast to prior periods and identify slippage, concentration risk, or deal dependencies)
- Executive summary generation (translate structured outputs into a leadership-ready forecast narrative)
- Operational follow-up trigger (create tasks or notifications for high-risk or stale deals)
By hard-coding these actions into tools, you encapsulate business logic that should not be left to probabilistic guessing. For example, the model no longer needs to approximate how “commit” is calculated or how variance is decomposed — it just calls the function that already reflects your internal rules. This increases the confidence and reliability of your system.
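As a sketch, a forecast rollup tool might look like this. The stage probabilities and category rules are invented placeholders — in practice they encode your company’s actual forecast methodology.

```python
# Hypothetical company-specific probability table, keyed by pipeline stage.
STAGE_PROBABILITY = {"discovery": 0.10, "proposal": 0.35, "negotiation": 0.60, "verbal": 0.90}

def forecast_rollup(opportunities):
    """Apply deterministic rollup logic instead of letting the model guess.

    commit    = deals reps have committed, taken at face value
    best_case = commit plus probability-weighted open pipeline
    """
    commit = sum(o["amount"] for o in opportunities if o["category"] == "commit")
    weighted = sum(
        o["amount"] * STAGE_PROBABILITY[o["stage"]]
        for o in opportunities if o["category"] == "open"
    )
    return {
        "commit": commit,
        "best_case": commit + weighted,
        "total_pipeline": sum(o["amount"] for o in opportunities),
    }

pipeline = [
    {"amount": 100_000, "stage": "verbal", "category": "commit"},
    {"amount": 200_000, "stage": "negotiation", "category": "open"},
    {"amount": 50_000, "stage": "discovery", "category": "open"},
]
rollup = forecast_rollup(pipeline)
```

Whatever the model does with the result, the numbers themselves are produced by code that your RevOps team can read, test, and govern.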
How tools are called
The following figure shows the basic loop once you integrate tools into your system:

Let’s walk through the process:
- A user sends a request to the LLM. The context builder injects relevant knowledge (a recent pipeline snapshot, forecast definitions, prior totals) and a subset of available tools.
- The LLM decides whether a tool is needed. If the query requires structured computation — such as variance decomposition — it selects the appropriate function.
- The selected tool is executed externally. For example, the variance analysis function queries the CRM, calculates deltas (new deals, slipped deals, closed-won, amount changes), and returns structured output.
- The tool output is added back into the context.
- The LLM generates the final answer. Grounded in an established computation, it produces a structured explanation of the forecast change.
Thus, the responsibility for encoding the business logic is offloaded to the experts who write the tools. The AI agent orchestrates predefined logic and reasons over the results.
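The loop above can be sketched in code. Here, `fake_llm` is a stub standing in for a real model call — the mechanics of the loop (decide, execute externally, feed the result back, answer), not the model, are the point; the tool name and payload are invented.

```python
import json

# Registry of callable tools; in reality these would hit the CRM, etc.
TOOLS = {
    "variance_analysis": lambda args: {"delta": -40_000, "drivers": ["2 deals slipped"]},
}

def fake_llm(messages):
    """Stub: a real LLM would decide here whether to call a tool."""
    if not any(m["role"] == "tool" for m in messages):
        # No tool result yet -> ask for the variance tool.
        return {"tool_call": {"name": "variance_analysis", "arguments": {}}}
    # Tool result present -> ground the final answer in it.
    tool_result = next(m for m in messages if m["role"] == "tool")["content"]
    return {"content": f"Forecast changed by {json.loads(tool_result)['delta']}."}

def run(user_request):
    messages = [{"role": "user", "content": user_request}]
    while True:
        reply = fake_llm(messages)
        if "tool_call" not in reply:          # final answer reached
            return reply["content"]
        call = reply["tool_call"]
        result = TOOLS[call["name"]](call["arguments"])  # executed externally
        messages.append({"role": "tool", "content": json.dumps(result)})

answer = run("Why did the forecast drop this week?")
```

Swapping `fake_llm` for a real model API call (and `TOOLS` for real integrations) turns this skeleton into the actual agent loop.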
Choosing the right tools
Over time, your inventory of tools will grow. Beyond CRM retrieval and forecast rollups, you may introduce renewal risk scoring, expansion modelling, territory mapping, quota tracking, and more. Injecting all of these into every prompt increases complexity and reduces the likelihood that the right tool is chosen.
The context builder is responsible for managing this complexity. Instead of exposing the entire tool ecosystem, it selects a subset based on the task at hand. A rollup request may require CRM retrieval and rollup logic, while a variance question may require variance decomposition and stage movement analysis.
Thus, tools become part of the dynamic context. To make this work reliably, each tool needs clear, AI-friendly documentation:
- What it does
- When it should be used
- What its inputs represent
- How its outputs should be interpreted
This documentation forms the contract between the model and your operational logic.
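Such a contract is commonly expressed as a JSON-Schema-style tool description. The exact format depends on your model provider, so the fields below are illustrative rather than normative:

```python
# Illustrative tool description covering the four documentation points:
# what it does, when to use it, what inputs mean, how to read outputs.
VARIANCE_TOOL_SPEC = {
    "name": "variance_analysis",
    "description": (
        "Compare the current forecast to a prior period and identify slippage. "
        "Use when the user asks why a forecast changed."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "period": {"type": "string", "description": "Current period, e.g. an ISO week"},
            "baseline": {"type": "string", "description": "Prior period to compare against"},
        },
        "required": ["period"],
    },
    "returns": "Structured deltas: new, slipped, closed-won, and amount changes.",
}
```

The description doubles as the model’s usage instructions, so it deserves the same editorial care as user-facing documentation.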
Standardizing the interface between LLMs and tools
When you connect an AI model to predefined tools, you are bringing together two very different worlds: a probabilistic language model and deterministic business logic. One operates on likelihoods and patterns; the other executes precise, rule-based operations. If the interface between them is not clearly specified, the interaction becomes fragile.
Standards such as the Model Context Protocol (MCP) aim to formalize this interface. MCP provides a structured way to describe and invoke external capabilities, making tool integration more consistent across systems. WebMCP extends this idea by proposing ways for web applications to become callable tools within AI-driven workflows.
These standards matter not only for interoperability, but also for governance. They define which parts of your operational logic the model is allowed to execute and under which conditions.
Memory — the key to personalized, self-improving AI
Your Head of RevOps takes an individual approach to each forecasting cycle.
So far, our prompts have been stateless. However, many generative AI applications need state and memory. There are many different approaches to formalizing agent memory. In the end, how you build up and reuse memories is a very individual design decision.
First, determine what kind of information from user interactions can be useful:

As shown in this table, the kind of information also informs your choice of storage format. To specify it further, consider the following two questions:
- Persistence: For how long should the information be stored? Think of the current session as short-term memory, and of information that persists from one session to another as long-term memory.
- Scope: Who needs access to the memory? Typically, we think of memories at the user level. However, especially in B2B settings, it can make sense to store certain interactions, inputs, and sequences in the system’s knowledge base, allowing other users to benefit from them as well.

As your memory store grows, you can increasingly align outputs with how the team actually operates. If you also store procedural memories about execution and outputs (including those that required adjustments), your context builder can gradually improve how it uses memory over time.
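A minimal sketch of such a memory store, with persistence and scope as explicit fields — the schema, field values, and user names are hypothetical:

```python
import time

class MemoryStore:
    """Toy memory store distinguishing persistence (session vs. long-term)
    and scope (user vs. shared across the team)."""

    def __init__(self):
        self.records = []

    def write(self, content, persistence="session", scope="user", user="jane"):
        self.records.append({
            "content": content, "persistence": persistence,
            "scope": scope, "user": user, "ts": time.time(),
        })

    def recall(self, user):
        """Memories visible to this user: their own, plus shared ones."""
        return [r["content"] for r in self.records
                if r["user"] == user or r["scope"] == "shared"]

store = MemoryStore()
store.write("Jane prefers forecasts grouped by segment.", persistence="long_term")
store.write("Q4 enterprise deals need an extra slippage buffer.",
            persistence="long_term", scope="shared", user="head_of_revops")
visible = store.recall("jane")
```

A production version would persist records to a database, expire session-scoped entries, and let the context builder rank memories by relevance before injecting them — but the persistence/scope split stays the same.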
Interactions between the three context components
To reduce complexity, we have so far made a clear split between the three components of an effective context — knowledge, tools, and memory. In practice, they will interact with one another, especially as your system matures:
- Tools can be defined to retrieve knowledge from different sources and write different kinds of memories.
- Long-term memories can be written back to knowledge sources and made persistent for future retrieval.
- If a user often repeats a certain task or workflow, the agent can help them package it as a tool.
The task of designing and managing these interactions is called orchestration. Agent frameworks like LangChain and DSPy support this task, but they don’t replace architectural thinking. For more complex agent systems, you might decide to opt for your own implementation. Finally, as already stated at the beginning, interaction with humans — especially domain experts — is crucial for making the agent smarter. This requires educated, engaged users, proper evaluation, and a UX that encourages feedback.
Summing up
If you’re starting a RevOps forecasting agent tomorrow, begin by mapping:
- What information sources exist and are used for this task (knowledge)
- Which operations and computations are repetitive and authoritative (tools)
- Which workflow decisions require continuity (memory)
In the end, context engineering determines whether your AI system reflects how your business actually works or merely produces guesses that “sound good” to non-experts. The model is interchangeable, but your unique context is not. If you learn to represent and orchestrate it deliberately, you can turn generic AI capabilities into a durable competitive edge.
