It usually starts the same way. In a leadership meeting, someone suggests adding AI to the product. Heads nod, enthusiasm builds, and before you realize it, the room lands on the default conclusion: a chatbot. That instinct is understandable. Large language models are powerful, ubiquitous, and engaging. They promise intuitive access to universal knowledge and functionality.
The team walks away and starts building. Soon, demo time comes around. A polished chat interface appears, accompanied by confident arguments about why this time, it’ll be different. At this point, however, the product usually hasn’t reached real users in real situations, and evaluation is biased and optimistic. Someone in the audience inevitably comes up with an unusual query that trips up the bot. The developers promise to fix “it”, but usually, the underlying issue is systemic.
Once the chatbot hits the ground, initial optimism is often matched by user frustration. Here, things get a bit personal because over the past weeks, I was forced to spend some time talking to different chatbots. I tend to delay interactions with service providers until the situation becomes unsustainable, and a few of these cases had piled up. Smiling chatbot widgets became my last hope before an endless hotline call, but:
- After logging in to my car insurer’s site, I asked it to explain an unannounced price increase, only to realize the chatbot had no access to my pricing data. All it could offer was the hotline number. Ouch.
- After a flight was canceled at the last minute, I asked the airline’s chatbot for the reason. It politely apologized that, since the departure time was already in the past, it couldn’t help me. It was happy to discuss all other topics, though.
- On a telco site, I asked why my mobile plan had suddenly expired. The chatbot confidently replied that it couldn’t comment on contractual matters and referred me to the FAQs. As expected, these were long but irrelevant.
These interactions didn’t bring me any closer to a solution and left me anything but delighted. The chatbots felt like foreign bodies: sitting there, they consumed screen real estate, latency, and attention, but didn’t add value.
Let’s skip the debate on whether these are intentional dark patterns. The fact is, legacy systems like the ones above carry a heavy burden of entropy. They come with tons of unique data, knowledge, and context. The moment you try to integrate them with a general-purpose LLM, you make two worlds clash. The model must ingest the context of your product so it can reason meaningfully about your domain. Proper context engineering requires skill and time for relentless evaluation and iteration. And before you even get to that point, your data needs to be ready, but in most organizations, data is noisy, fragmented, or simply missing.
In this post, I’ll recap insights from my book The Art of AI Product Development and my recent talk at the Google Web AI Summit, and share a more organic, incremental approach to integrating AI into existing products.
Using smaller models for low-risk, incremental AI integration
AI integration takes time:
- Your technical team needs to prepare the data and learn the available techniques and tools.
- You need to prototype and iterate to find the sweet spots of AI value for your product and market.
- Users need to calibrate their trust when moving to new probabilistic experiences.
To accommodate these learning curves, you shouldn’t rush to expose AI — especially open-ended chat functionality — to your users. AI introduces uncertainty and mistakes into the experience, which most people don’t like.
One effective way to pace your AI journey in a brownfield context is to use small language models (SLMs), which typically range from a few hundred million to a few billion parameters. They can integrate flexibly with your product’s existing data and infrastructure, rather than adding more technological overhead.
How SLMs are trained
Most SLMs are derived from larger models through knowledge distillation. In this setup, a large model acts as the teacher and a smaller one as the student. For example, Google’s Gemini served as the teacher for Gemma 2 and Gemma 3, while Meta’s Llama Behemoth trained its herd of smaller Llama 4 models. Just as a human teacher condenses years of study into clear explanations and structured lessons, the large model distills its vast parameter space into a smaller, denser representation that the student can absorb. The result is a compact model that retains much of the teacher’s competence but operates with far fewer parameters and dramatically lower computational cost.
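To make this concrete, the core of the distillation objective can be sketched in a few lines. This is a minimal illustration in plain JavaScript (no training framework), following the classic soft-label formulation; the temperature value is just an example:

```javascript
// Softmax with temperature T: a higher T produces a softer distribution,
// exposing more of the teacher's knowledge about near-miss classes.
function softmax(logits, T = 1.0) {
  const scaled = logits.map((z) => z / T);
  const max = Math.max(...scaled);
  const exps = scaled.map((z) => Math.exp(z - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// KL(teacher || student) on the softened distributions, scaled by T^2
// so gradient magnitudes stay comparable across temperatures.
function distillationLoss(studentLogits, teacherLogits, T = 2.0) {
  const p = softmax(teacherLogits, T); // teacher probabilities
  const q = softmax(studentLogits, T); // student probabilities
  let kl = 0;
  for (let i = 0; i < p.length; i++) {
    kl += p[i] * (Math.log(p[i]) - Math.log(q[i]));
  }
  return kl * T * T;
}
```

In actual training, this term is typically mixed with an ordinary cross-entropy loss on the ground-truth labels, so the student learns from both the teacher’s soft targets and the hard data.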
Using SLMs
One of the key benefits of SLMs is their deployment flexibility. Unlike LLMs, which are mostly used through external APIs, smaller models can run locally, either on your organization’s infrastructure or directly on the user’s device:
- Local deployment: You can host SLMs on your own servers or within your cloud environment, keeping full control over data, latency, and compliance. This setup is ideal for enterprise applications where sensitive information or regulatory constraints make third-party APIs impractical.
📈 Local deployment also gives you flexible fine-tuning opportunities as you collect more data and want to respond to growing user expectations.
- On-device deployment via the browser: Modern browsers have built-in AI capabilities that you can rely on. For instance, Chrome integrates Gemini Nano via the built-in AI APIs, while Microsoft Edge includes Phi-4 (see the Prompt API documentation). Running models directly in the browser enables low-latency, privacy-preserving use cases such as smart text suggestions, form autofill, or contextual help.
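As a sketch of what this can look like in practice, here is a small wrapper that prefers the browser’s built-in model when present and otherwise falls back to a server call. The `LanguageModel` global with `create()` and `prompt()` follows the shape of Chrome’s experimental Prompt API, which may still change; the fallback function is a hypothetical stand-in for your own backend:

```javascript
// Prefer the browser's built-in model (e.g. Gemini Nano in Chrome) when
// available; otherwise fall back to a server-side endpoint (stubbed here).
async function contextualHelp(question, serverFallback) {
  if (typeof LanguageModel !== "undefined") {
    // Experimental Prompt API: inference runs fully on-device,
    // so the question never leaves the user's machine.
    const session = await LanguageModel.create();
    return session.prompt(question);
  }
  return serverFallback(question);
}
```

The feature check keeps the product functional in browsers without built-in AI, which is essential while these APIs remain behind flags or limited rollouts.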
If you would like to learn more about the technicalities of SLMs, here are a couple of useful resources:
Let’s now move on and see what you can build with SLMs to deliver user value and make steady progress on your AI integration.
Product opportunities for SLMs
SLMs shine in focused, well-defined tasks where the context and data are already known: the kinds of use cases that live deep inside existing products. You can think of them as specialized, embedded intelligence rather than general-purpose assistants. Let’s walk through the main buckets of opportunity they unlock in the brownfield, as illustrated in the following opportunity tree.

1. Better product analytics
Before exposing AI features to users, look for ways to improve your product from the inside. Most products already generate a continuous stream of unstructured text: support chats, help requests, in-app feedback. SLMs can analyze this data in real time and surface insights that inform both product decisions and the immediate user experience. Here are some examples:
- Tag and route support chats as they come in, directing technical issues to the right teams.
- Flag churn signals during a session, prompting timely interventions.
- Suggest relevant content or actions based on the user’s current context.
- Detect repeated friction points while the user is still in the flow, not weeks later in a retrospective.
These internal enablers keep risk low while adding value and giving your team time to learn. They strengthen your data foundation and prepare you for more visible, user-facing AI features down the road.
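To illustrate the first example above (tagging and routing support chats), here is a minimal sketch. The classifier is injected as a function so the routing logic stays testable; in production it would be an SLM call. The tag names and team mapping are made up for illustration:

```javascript
// Hypothetical mapping from model-assigned tags to internal teams.
const TEAM_FOR_TAG = {
  billing: "finance-support",
  bug: "engineering-triage",
  cancellation: "retention",
  other: "general-queue",
};

// Route an incoming support message based on the classifier's tag.
// `classify` is any async function returning one of the known tags;
// unknown tags fall back to the general queue.
async function routeSupportMessage(message, classify) {
  const tag = await classify(message);
  const team = TEAM_FOR_TAG[tag] ?? TEAM_FOR_TAG.other;
  return { message, tag, team };
}
```

Keeping the model behind a plain function boundary like this also lets you start with a cheap heuristic and swap in an SLM later without touching the routing logic.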
2. Remove friction
Next, take a step back and audit the UX debt that’s already there. In the brownfield, most products aren’t exactly a designer’s dream. They were designed under the technical and architectural constraints of their time. With AI, we have an opportunity to lift some of those constraints, reducing friction and creating faster, more intuitive experiences.
A good example is the smart filters on search-based websites like Booking.com. Traditionally, these pages use long lists of checkboxes and categories that try to cover every possible user preference. They’re cumbersome to design and use, and in the end, many users can’t find the setting that matters to them.
Language-based filtering changes this. Instead of navigating a complex taxonomy, users simply type what they want (for example, “pet-friendly hotels near the beach”), and the model translates it into a structured query behind the scenes.
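Behind the scenes, this typically means asking the model for a structured filter object that the existing search backend already understands, with a safe fallback to plain keyword search when the model output can’t be parsed. A sketch, with the model injected as a function and a made-up filter schema:

```javascript
// Turn a free-text query into a structured filter for an existing search API.
// `model` is any async function expected to return JSON matching the
// (assumed) schema {"amenities": string[], "location": string|null}.
// Invalid output degrades gracefully to plain keyword search.
async function parseSearchQuery(text, model) {
  const prompt =
    `Extract filters from: "${text}". ` +
    `Reply only with JSON: {"amenities": [...], "location": ...}`;
  try {
    const filters = JSON.parse(await model(prompt));
    return { mode: "structured", filters };
  } catch {
    return { mode: "keyword", query: text };
  }
}
```

The fallback branch matters: small models occasionally produce malformed output, and the user should never see a broken search page because of it.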

More broadly, look for areas in your product where users have to apply your internal logic — your categories, structures, or terminology — and replace that with natural language interaction. Whenever users can express intent directly, you remove a layer of cognitive friction and make the product smarter and friendlier.
3. Augment
With your user experience decluttered, it’s time to think about augmentation: adding small, useful AI capabilities to your product. Instead of reinventing the core experience, look at what users are already doing around your product, such as the side tasks, workarounds, or external tools they rely on to reach their goal. Can focused AI models help them do it faster or smarter?
For example, a travel app could integrate a contextual trip note generator that summarizes itinerary details or drafts messages for co-travelers. A productivity tool could include a meeting recap generator that summarizes discussions or action items from text notes, without sending data to the cloud.
These features grow organically from real user behavior and extend your product’s context instead of redefining it.
4. Personalize
Successful personalization is the holy grail of AI. It flips the traditional dynamic: instead of asking users to learn and adapt to your product, your product adapts to them like a well-fitting glove.
When you start, keep ambition at bay — you don’t need a fully adaptive assistant. Rather, introduce small, low-risk adjustments in what users see, how information is phrased, or which options appear first. On the content level, AI can adapt tone and style, like using concise wording for experts and more explanatory phrasing for newcomers. On the experience level, it can create adaptive interfaces. For instance, a project-management tool could surface the most relevant actions (“create task,” “share update,” “generate summary”) based on the user’s past workflows.
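The experience-level adaptation can start without any model at all: simply rank the available actions by how often this user has actually used them. A minimal sketch (the action names are illustrative):

```javascript
// Rank available actions by this user's past usage frequency.
// Actions never used yet count as zero and keep their relative order,
// since Array.prototype.sort is stable.
function rankActions(usageCounts, actions) {
  return [...actions].sort(
    (a, b) => (usageCounts[b] ?? 0) - (usageCounts[a] ?? 0)
  );
}
```

Once a baseline like this works, an SLM can take over the harder part: predicting the relevant action from the current context rather than from raw frequency alone.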
Why “small” wins over time
Every successful AI feature — be it an analytics improvement, a frictionless UX touchpoint, or a personalized step in a larger flow — strengthens your data foundation and builds your team’s iteration muscle and AI literacy. It also lays the groundwork for larger, more complex applications later. When your “small” features work reliably, they become reusable components in larger workflows or modular agent systems (cf. Nvidia’s paper Small Language Models are the Future of Agentic AI).
To summarize:
✅ Start small — favor gradual improvement over disruption.
✅ Experiment fast — smaller models mean lower cost and faster feedback loops.
✅ Be cautious — start internally; introduce user-facing AI once you’ve validated it.
✅ Build your iteration muscle — steady, compounding progress beats headline projects.
