Natural Language Visualization and the Future of Data Analysis and Presentation


For years, data analysis has been like classical art. We used to commission a report from our data analyst—our Michelangelo—and wait patiently. Weeks later, we received an email with an impressive hand-carved masterpiece: a link to a 50-KPI dashboard or a 20-page report attached. We could admire the meticulous craftsmanship, but we couldn’t change it. What’s more, we couldn’t even ask follow-up questions: not of the report, and not of our analyst, since she was already busy with another task.

That’s why the future of data analysis doesn’t belong to an ‘analytical equivalent’ of Michelangelo. It will be closer to the art of Fujiko Nakaya.

Source: YouTube.

Fujiko Nakaya is famous for her fog ‘sculptures’: breathtaking, living clouds of fog. But she doesn’t ‘sculpt’ the fog herself. She has the idea. She designs the concept. The actual, complex work of constructing the pipe systems and programming the water pressure to produce the fog is done by engineers and plumbers.

The paradigm shift of Natural Language Visualization is similar.

Imagine that you want to understand a phenomenon: customer churn increasing, sales declining, or delivery times not improving. At that moment, you become the conceptual artist. You provide the concept:

The system becomes your master technician. It does all the complex painting, sculpting, or, as in Nakaya’s case, plumbing in the background. It builds the query, chooses the visualizations, and writes the interpretation. Finally, the answer, like the fog in Nakaya’s sculptures, appears right in front of you.

Do you remember the bridge of the starship Enterprise? When Captain Kirk needed to research a historical figure or Commander Spock needed to cross-reference a new energy signature, they never needed to open a complex dashboard. They spoke to the computer (or at least used the interface and buttons on the captain’s chair) [*].

There was no need to use a BI app or write a single line of SQL. Kirk or Spock needed only to state their need: ask a question, sometimes adding a simple hand gesture. In return, they received an instant, visual or vocal response. For decades, that fluid, conversational power was pure science fiction.

Today, I ask myself a question:

Data analysis is undergoing a major transformation. We’re moving away from traditional software that requires endless clicking on icons, menus, and windows; learning query and programming languages; or mastering complex interfaces. Instead, we’re beginning to have simple conversations with our data.

The goal is to replace the steep learning curve of complex tools with the natural simplicity of human language. This opens up data analysis to everyone, not only experts, allowing them to ‘talk with their data.’

At this point, you’re probably skeptical about what I have written.

And you have every right to be.

Many of us have tried using ‘modern era’ AI tools for visualizations or presentations, only to find that the results were inferior to what even a junior analyst could produce. These outputs were often inaccurate. Even worse: they were hallucinations, far from the answers we need, or simply incorrect.

This isn’t just a glitch; there are clear reasons for the gap between promise and reality, which we’ll address today.

In this article, I delve into a new approach called Natural Language Visualization (NLV). Specifically, I’ll describe how the technology actually works, how we can use it, and which key challenges still need to be solved before we enter our own Star Trek era.

I recommend treating this article as a structured journey through our existing knowledge on this topic.

What I discovered in the process of writing this piece—and what I hope you’ll discover while reading, too—is that this subject seemed perfectly obvious at first glance. However, it quickly revealed a surprising, hidden depth of nuance. Eventually, after reviewing all the cited and non-cited sources and my own reflections, and carefully balancing the facts, I arrived at a rather unexpected conclusion. Taking this systematic, academic-like approach was a real eye-opener in many ways, and I hope it will be for you as well.

What is Natural Language Visualization?

A critical barrier to understanding this field is the ambiguity of its core terminology. The acronym NLV (Natural Language Visualization) carries two distinct, historical meanings.

  • Historical NLV (Text-to-Scene): The older field of generating 2D or 3D graphics from descriptive text [1],[2].
  • Modern NLV (Text-to-Viz): The contemporary field of generating data visualizations (like charts) from descriptive text [3].

To maintain precision and help you cross-reference the ideas and analysis presented in this article, I’ll use the specific academic terminology of the HCI and visualization communities:

  • Natural Language Interface (NLI): Broad, overarching term for any human-computer interface that accepts natural language as an input.
  • Visualization-oriented Natural Language Interface (V-NLI): A system that enables users to interact with and analyze visual data (like charts and graphs) using everyday speech or text. Its main purpose is to democratize data by serving as a simple, complementary input method for visual analytics tools, ultimately letting users focus entirely on their data tasks rather than grappling with the technical operation of complex visualization software [4],[5].

V-NLIs are interactive systems that facilitate visual analytics tasks through two primary user interfaces: form-based or chatbot-based. A form-based V-NLI typically uses a text box for natural language queries, sometimes with refinement widgets, but is usually not designed for conversational follow-up questions. In contrast, a chatbot-based V-NLI features a named agent with anthropomorphic traits—such as personality, appearance, and emotional expression—that interacts with the user in a separate chat window, displaying the conversation alongside complementary outputs. While both are interactive, the chatbot-based V-NLI is also anthropomorphic, possessing all the defined chatbot characteristics, whereas the form-based V-NLI lacks the human-like traits [6].

The value proposition of V-NLIs is best understood by contrasting the conversational paradigm with traditional data analysis workflows. These are presented in the infographic below.

Source: Image by the author based on [5], [7]–[10]. Images in the upper section were generated in ChatGPT.

This shift represents a move from a static, high-friction, human-gated process to a dynamic, low-friction, automated one. I further illustrate how this new approach could impact how we work with data in Table 1.

Table 1: Comparative Analysis: Traditional BI vs. Conversational Analytics

| Feature | Conversational Analytics | Traditional Analytics |
| --- | --- | --- |
| Focus | All customer-agent interactions and CRM data | Phone conversations and customer profiles |
| Data Sources | Recent conversations across calls, chat, text, and emails | Historical records (sales, customer profiles) |
| Timing | Real-time / recent | Retrospective / historical |
| Immediacy | High (analyzes very recent data) | Low (insights developed over longer periods) |
| Insights | Deep understanding of specific pain points and emerging issues | High-level contact center insights over time |
| Use Case | Improving immediate customer satisfaction and agent behavior | Understanding long-term trends and business dynamics |
Source: Table by the author, based on and inspired by [8].

How does a V-NLI work?

To analyze the V-NLI mechanics, I adopted the theoretical framework from the academic survey ‘The Why and The How: A Survey on Natural Language Interaction in Visualization’ [11]. This framework offers a powerful lens for classifying and critiquing V-NLI systems by distinguishing between user intent and dialogue implementation. It dissects two major axes of a V-NLI system: the ‘Why’ and the ‘How’. The ‘Why’ axis represents user intent: it examines why users interact with visualizations. The ‘How’ axis represents dialogue structure: it answers the question of how the human-machine dialogue is technically implemented. Each of these axes can be further divided into specific tasks in the case of ‘Why’ and attributes in the case of ‘How’. I list them below.

The four key high-level ‘Why’ tasks are:

  1. Present: Using visualization to communicate a story, for instance, for visual storytelling or explanation generation.
  2. Discover: Using visualization to find new information, for instance, by writing natural language queries, performing keyword search, visual question answering (VQA), or analytical conversation.
  3. Enjoy: Using visualization for casual, non-professional goals, such as image augmentation or description generation.
  4. Produce: Using visualization to create or record new artifacts, for instance, by making annotations or creating additional visualizations.

The ‘How’, on the other hand, has three major attributes (the sketch after the list below encodes the full framework):

  1. Initiative: Who drives the conversation? It can be user-initiated, system-initiated, or mixed-initiative.
  2. Duration: How long is the interaction? It might be a single turn for a simple query, or a multi-turn conversation for a complex analytical discussion.
  3. Communicative Functions: What is the form of the language? The language model supports several interaction forms: users may issue direct commands, pose questions, or engage in a responsive dialogue in which they modify their input based on suggestions from the NLI.
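To make the framework more tangible, here is a minimal sketch (my own illustration, not code from the survey) that encodes the taxonomy as Python data structures and classifies two example interactions:

```python
from dataclasses import dataclass
from enum import Enum, auto

class WhyTask(Enum):
    PRESENT = auto()   # communicate a story
    DISCOVER = auto()  # find new information
    ENJOY = auto()     # casual, non-professional goals
    PRODUCE = auto()   # create or record new artifacts

class Initiative(Enum):
    USER = auto()
    SYSTEM = auto()
    MIXED = auto()

@dataclass
class Interaction:
    why: WhyTask
    initiative: Initiative
    multi_turn: bool              # Duration: single turn vs. conversation
    communicative_function: str   # "command", "question", or "responsive dialogue"

# A classic 'old-era' search-box query: user-initiated, single-turn, Discover only.
old_era = Interaction(WhyTask.DISCOVER, Initiative.USER, False, "question")

# The aspirational future: the system proactively presents an insight mid-conversation.
future = Interaction(WhyTask.PRESENT, Initiative.SYSTEM, True, "responsive dialogue")
```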

This framework can also help illustrate the most fundamental issue causing our disbelief in NLIs. Historically, both commercial and non-commercial visualization-oriented natural language interfaces (V-NLIs) operated within a very narrow functional scope. The ‘Why’ was often reduced to the Discover task, while the ‘How’ was limited to simple, single-turn queries initiated by the user.

As a result, most ‘talk-to-your-data’ tools functioned as little more than basic ‘ask me a question’ search boxes. This model has proven consistently frustrating for users because it is overly rigid and brittle, often failing unless a query is phrased with perfect precision.

The entire history of this technology is the story of growth in two key ways.

  • First, our interactions have been improving, moving from asking only one question at a time to having a full, back-and-forth conversation.
  • Second, the reasons for using V-NLIs have been expanding. We have progressed from simply finding information to having the tool automatically create new charts for us, and even explain the data in a written story.

Working with all four ‘Why’ tasks and all three ‘How’ attributes will be the biggest leap of all. The system will stop waiting for us to ask a question and will start the conversation itself, proactively pointing out insights you may have missed. This journey, from a simple search box to a smart, proactive partner, is the main story connecting this technology’s past, present, and future.

Before going further, I would like to make a small detour and show you an example of how our interactions with AI could improve. For that purpose, I’ll use a recent post published by my friend Kasia Drogowska, PhD, on LinkedIn.

AI models often become stereotyped, suffering from ‘mode collapse’ because they learn our own biases from their training data. A technique called ‘Verbalized Sampling’ (VS) offers a powerful solution by changing the prompt. Instead of asking for one answer (like ‘Tell me a joke’), you ask for a probability distribution of answers (like ‘Generate five different jokes and their probabilities’). This simple shift not only yields 1.6-2.1x more diverse and creative results but, more importantly, teaches us to think probabilistically. It shatters the illusion of a single ‘correct answer’ in complex business decisions and puts the power of choice back in our hands, not the model’s.

Source: image by the author based on [12]. Answers generated in Gemini 2.5.

The image above displays a direct comparison between two AI prompting methods:

  • The left side exemplifies direct prompting. On this side, I show what happens when you ask the AI the same simple question five times: ‘Tell me a joke about data visualization.’ The result is five very similar jokes, all following the same format.
  • The right side exemplifies verbalized sampling. Here, I show a different prompting method. The question is modified to ask for a range of answers: ‘Generate five responses with their corresponding probabilities…’ The result is five completely different jokes, each unique in its setup and punchline, and each assigned a probability by the AI (in fact, it is not a true probability, but it gives you the idea).

The key benefit of a technique like VS is diversity. Instead of just getting the AI’s single ‘default’ answer, it forces the AI to explore a wider spectrum of creative possibilities, letting you pick from the most common to the most unique. It is a perfect example of my point: changing how we interact with these tools can yield very different results.
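To make the contrast concrete, here is a minimal sketch of the two prompting styles. The call_llm helper is a hypothetical stand-in for whichever chat-completion API you use; only the prompt text changes between the two methods:

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace this with your chat-completion client.
    return "<model answer placeholder>"

# Direct prompting: repeating this call tends to return near-identical
# 'default' answers, a symptom of mode collapse.
direct_prompt = "Tell me a joke about data visualization."
default_answers = [call_llm(direct_prompt) for _ in range(5)]

# Verbalized Sampling [12]: ask for a distribution of answers instead.
vs_prompt = (
    "Generate five different jokes about data visualization. "
    "For each joke, also state the probability you would assign to it, "
    "and order the jokes from most to least likely."
)
diverse_answers = call_llm(vs_prompt)
```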

The V-NLI pipeline

To understand how a V-NLI translates a natural language query, such as ‘show me last quarter’s sales trend,’ into a precise and accurate data visualization, it is necessary to deconstruct its underlying technical architecture. Academics in the V-NLI community have proposed the classic information visualization pipeline as a structured model for these systems [5]. To illustrate the general mechanism of the process, I prepared the following infographic.

Source: Image by the author based on [5]. Concept for the infographic created in Gemini. Icons and graphics generated by the author in Gemini.

For a single ‘text-to-viz’ query, the two most important and difficult stages are (1) Query Interpretation and (3/4) Visual Mapping/Encoding. In other words, understanding exactly what the user means. The other stages, particularly (6) Dialogue Management, become paramount in more advanced conversational systems.

The older systems consistently failed at this understanding. The reason is that this task is essentially solving two problems at once:

  • First, the system must guess the user’s intent (e.g., is the request to compare sales or to see a trend?).
  • Second, it must translate casual words (like ‘best sellers’) into a precise database query.

If the system misunderstood the user’s intent, it would display a table when the user wanted a chart. If it couldn’t parse the user’s words, it would just return an error, or worse, make something up out of the blue.

Once the system understands your query, it must create the visual answer. It must automatically select the best chart for the given intent (e.g., a line chart for a trend) and then map the appropriate attributes to it (e.g., placing ‘Sales’ on the Y-axis and ‘Region’ on the X-axis). Interestingly, this chart-building part evolved in a similar way to the language-understanding part. Both transitioned from old, clunky, hard-coded rules to flexible, modern AI models. This parallel evolution set the stage for today’s Large Language Models (LLMs), which can now perform both tasks simultaneously.
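To illustrate what those ‘old, clunky, hard-coded rules’ looked like, below is a deliberately simplistic sketch of a rule-based interpreter (my own caricature, not any specific system): keyword matching guesses the intent, and a lookup table maps the intent onto a chart type and encoding.

```python
# A caricature of an old-era, rule-based V-NLI core: brittle keyword
# matching for intent, and a fixed lookup table for the visual mapping.
INTENT_KEYWORDS = {
    "trend": ["trend", "over time", "progression"],
    "comparison": ["compare", "versus", "by region"],
    "distribution": ["distribution", "spread", "histogram"],
}

CHART_RULES = {
    "trend": {"chart": "line", "x": "Date", "y": "TotalSales"},
    "comparison": {"chart": "bar", "x": "Region", "y": "TotalSales"},
    "distribution": {"chart": "histogram", "x": "TotalSales"},
}

def interpret(query: str) -> dict:
    q = query.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(kw in q for kw in keywords):
            return CHART_RULES[intent]
    raise ValueError("Sorry, I didn't understand the question.")

print(interpret("show me last quarter's sales trend"))
# -> {'chart': 'line', 'x': 'Date', 'y': 'TotalSales'}

try:
    interpret("who are our best sellers?")  # no keyword matches...
except ValueError as err:
    print(err)  # ...so the system simply gives up: the notorious failure mode
```

Anything phrased outside the keyword lists fails outright, which is exactly the rigidity that made these systems so frustrating.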

In fact, the complex, multi-stage V-NLI pipeline described above, with its distinct modules for intent recognition, semantic parsing, and visual encoding, has been significantly disrupted by the advent of LLMs. These models haven’t just improved one stage of the pipeline; they’ve collapsed the entire pipeline into a single, generative step.

Why is that, you may ask? Well, the parsers of the previous era were algorithm-centric. They required years of effort by computational linguists and developers to build, and they would break upon encountering a new domain or an unexpected query.

LLMs, in contrast, are data-centric. They provide a pre-trained, simplified solution to the most difficult problem in understanding natural language [13],[14]. This is the great unification: a single, pre-trained LLM can now execute all the core tasks of the V-NLI pipeline simultaneously. This architectural revolution has triggered an equivalent revolution in the V-NLI developer’s workflow. The core engineering challenge has undergone a fundamental shift. Previously, the challenge was to build a perfect, domain-specific semantic parser [11]. Now, the challenge is to craft the right prompt and curate the right data to guide a pre-trained LLM.

Three key techniques power this new, LLM-centric workflow. The first is Prompt Engineering, a new discipline focused on carefully structuring the text prompt—sometimes using advanced strategies like ‘Tree-of-Thoughts’—to help the LLM reason through a complex data query instead of just making a quick guess. A related method is In-Context Learning (ICL), which primes the LLM by placing a few examples of the desired task (like sample text-to-chart pairs) directly into the prompt itself. Finally, for highly specialized fields, Fine-Tuning is used. This involves re-training the base LLM on a large, domain-specific dataset. These pillars, when in place, enable the creation of a powerful V-NLI that can handle complex tasks and specialized charts that would be impossible for any generic model.
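As a sketch of what In-Context Learning looks like in practice, the prompt below primes a generic LLM with two worked text-to-chart examples before the real question. The JSON chart-spec output format is my assumption for illustration; any specification your downstream renderer understands would do:

```python
# A minimal In-Context Learning prompt: two worked text-to-chart examples
# prime the model before the actual user question is appended.
ICL_PROMPT = """You translate questions about a sales table
(Date, Region, Product, TotalSales) into chart specifications.

Q: Compare total sales across regions.
A: {"mark": "bar", "x": "Region", "y": "sum(TotalSales)"}

Q: How did sales develop over time?
A: {"mark": "line", "x": "Date", "y": "sum(TotalSales)"}

Q: %s
A:"""

def build_prompt(user_question: str) -> str:
    # %-formatting avoids clashing with the literal braces in the examples.
    return ICL_PROMPT % user_question

print(build_prompt("Which product sells best in the North region?"))
```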

Image generated by the author in Gemini, further edited and corrected in Microsoft PowerPoint.

This shift has profound implications for the scalability of V-NLI systems. The old approach (symbolic parsing) required building new, complex algorithms for each new domain. The latest LLM-based approach requires only a new dataset for fine-tuning. While creating high-quality datasets remains a significant challenge, it is a data-scaling problem that is far more solvable and economical than the previous algorithmic-scaling problem. This change in fundamental scaling economics is the true and most lasting impact of the LLM revolution.

What is the real meaning of this?

The single biggest promise of ‘talk-to-your-data’ tools is data democratization. They’re designed to eliminate the steep learning curve of traditional, complex BI software, which often requires extensive training. ‘Talk-to-your-data’ tools provide a zero-learning-curve entry point for non-technical professionals (like managers, marketers, or sales teams), who can finally get their own insights without having to file a ticket with an IT or data team. This fosters a data-driven culture by enabling self-service for common, high-value questions.

For the business, value is measured in terms of speed and efficiency. The decision lag of waiting for an analyst, lasting days or sometimes weeks, is eliminated. This shift from a multi-day, human-gated process to a real-time, automated one reportedly saves an average of 2-3 hours per user per week, allowing the organization to react to market changes immediately.

However, this democratization creates a new and profound socio-technical tension within organizations. The anecdote below illustrates this perfectly:

This reveals the critical conflict: the tool’s primary value is in direct tension with the organization’s fundamental need for governance and trust. When a non-technical user is suddenly empowered to produce complex analytics, it challenges the authority of the traditional data gatekeepers, creating a conflict that is a direct consequence of the technology’s success.

Classic and modern art together… Photo by Serena Repice Lentini on Unsplash.

Which current LLM-based AI assistant is the best ‘talk-to-your-data’ tool?

You might expect to see a ranking of the best LLM-based assistants for V-NLI here, but I chose not to include one. With the number of tools available, it is impossible to review them all and rank them objectively and reliably.

My own experience is mainly with Gemini, ChatGPT, and built-in assistants like Microsoft Copilot or Google Workspace. Still, using a few online sources, I’ve put together a brief overview to highlight the key aspects you should evaluate when choosing the option best suited for you. Ultimately, you’ll have to explore the possibilities yourself and consider features such as performance, cost, payment model, and—above all—security.

The table below outlines several tools with short descriptions. Later, I focus especially on Gemini and ChatGPT, which I know best.

Table 2. Examples of LLM-based tools that could serve as V-NLIs

| Tool | Description |
| --- | --- |
| BlazeSQL | An AI data analyst and chatbot that connects to SQL databases, letting non-technical users ask questions in natural language, visualize results, and build interactive dashboards. No coding required. |
| DataGPT | A conversational analytics tool that answers natural language queries with visualizations, detects anomalies, and offers features like an AI onboarding agent and a Lightning Cache for rapid query processing. |
| Gemini (Google) | Google Cloud’s conversational AI interface for BigQuery; enables easy data analysis, real-time insights, and customizable dashboards through everyday language. |
| ChatGPT (OpenAI) | A versatile conversational tool capable of exploring datasets, running basic statistical analysis, generating charts, and producing custom reports, all via natural language interaction. |
| Lumenore | A platform focused on personalized insights and faster decision-making, with scenario analysis, an organizational data dictionary, predictive analytics, and centralized data management. |
| Dashbot | A tool designed to address the ‘dark data’ challenge by analyzing both unstructured data (e.g., emails, transcripts, logs) and structured data to turn previously unused information into actionable insights. |
Source: table by the author based on [15].

Both Gemini and ChatGPT exemplify the new wave of powerful, visualization-oriented NLIs, each with a distinct strategic advantage. Gemini’s primary advantage is its deep integration within the Google ecosystem; it works directly with BigQuery and the Google Suite. For example, you can open a PDF attachment directly from Gmail and perform a deep analysis using the Gemini assistant interface, using either a pre-built agent or ad-hoc prompts. Its core strength lies in translating simple, everyday language not only into data points, but directly into interactive visualizations and dashboards.

ChatGPT, in contrast, serves as a more general-purpose yet equally powerful V-NLI for analytics, capable of handling various data formats, such as CSVs and Excel files. This makes it a great tool for users who want to make informed decisions without diving into complex software or coding. Its Natural Language Visualization (NLV) function is explicit, allowing users to ask it to summarize data, identify patterns, and even generate visualizations.

The real, shared strength of both platforms is their ability to handle interactive conversations. They allow users to ask follow-up questions and refine their queries. This iterative, conversational approach makes them highly effective V-NLIs that don’t just answer a single query, but enable a full, exploratory data analysis workflow.

Application example: Gemini as a V-NLI

Let’s do a small experiment and see, step by step, how Gemini (version 2.5 Pro) works as a V-NLI. For the purpose of this experiment, I used Gemini to generate a set of artificial daily sales data, split by product, region, and sales representative. Then I asked it to simulate an interaction between a non-technical user (e.g., a sales manager) and a V-NLI. Let’s see what the outcome was.

Generated data sample:

Date,Region,Salesperson,Product,Category,Quantity,UnitPrice,TotalSales
2022-01-01,North,Alice Smith,Alpha-100,Electronics,5,1500,7500
2022-01-01,South,Bob Johnson,Beta-200,Electronics,3,250,750
2022-01-01,East,Carla Gomez,Gamma-300,Apparel,10,50,500
2022-01-01,West,David Lee,Delta-400,Software,1,1000,1000
2022-01-02,North,Alice Smith,Beta-200,Electronics,2,250,500
2022-01-02,West,David Lee,Gamma-300,Apparel,7,50,350
2022-01-03,East,Carla Gomez,Alpha-100,Electronics,3,1500,4500
2022-01-03,South,Bob Johnson,Delta-400,Software,2,1000,2000
2023-05-15,North,Eva Green,Alpha-100,Electronics,4,1600,6400
2023-05-15,East,Frank White,Epsilon-500,Services,1,5000,5000
2023-05-16,South,Bob Johnson,Beta-200,Electronics,5,260,1300
2023-05-16,West,David Lee,Gamma-300,Apparel,12,55,660
2023-05-17,North,Alice Smith,Delta-400,Software,1,1100,1100
2023-05-17,East,Carla Gomez,Epsilon-500,Services,1,5000,5000
2024-11-20,South,Grace Hopper,Alpha-100,Electronics,6,1700,10200
2024-11-20,West,David Lee,Beta-200,Electronics,10,270,2700
2024-11-21,North,Eva Green,Gamma-300,Apparel,15,60,900
2024-11-21,East,Frank White,Delta-400,Software,3,1200,3600
2024-11-22,South,Grace Hopper,Epsilon-500,Services,2,5500,11000
2024-11-22,West,Alice Smith,Alpha-100,Electronics,4,1700,6800

Experiment:

My typical workflow starts with a high-level query for a broad overview. If that initial view looks normal, I might stop. However, if I suspect an underlying issue, I’ll ask the tool to dig deeper for anomalies that aren’t visible on the surface.
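Behind the scenes, that first high-level chart boils down to a simple aggregation. The pandas sketch below is my own reconstruction of the step, not the code Gemini actually generated; it assumes the sample above is saved as sales.csv:

```python
import pandas as pd
import matplotlib.pyplot as plt

# My reconstruction of the 'broad overview' query, assuming the
# generated sample above has been saved as sales.csv.
df = pd.read_csv("sales.csv", parse_dates=["Date"])

# Total sales by region: the kind of chart a first high-level question yields.
by_region = df.groupby("Region")["TotalSales"].sum().sort_values(ascending=False)
by_region.plot(kind="bar", title="Total sales by region")
plt.ylabel("TotalSales")
plt.tight_layout()
plt.show()
```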

Source: screenshot by the author.
Source: image generated by Gemini.

Next, I focused on the North region to see if I could spot any anomalies.

Source: screenshot by the author.
Source: image generated by Gemini.

For the last query, I shifted my perspective to analyze the daily sales progression. This new view serves as a launchpad for subsequent, more detailed follow-up questions.

Source: screenshot by the author.
Source: image generated by Gemini.

In fact, the above examples were fairly simple and not far from the ‘old-era’ NLIs. But let’s see what happens if the chatbot is empowered to take initiative during the discussion.

Source: screenshot by the author.
Source: screenshot by the author.

This demonstrates a more advanced V-NLI capability: not only answering the question, but also providing context and identifying underlying patterns or outliers that the user might have missed.

Source: image generated by Gemini.

This small experiment hopefully demonstrates that AI assistants such as Gemini can effectively function as V-NLIs. The simulation began with the model successfully interpreting a high-level natural-language question about sales data and translating it into an appropriate visualization. The process showcased the model’s ability to handle iterative, conversational follow-ups, such as drilling down into a specific data segment or shifting the analytical perspective to a time series. Most importantly, the final experiment demonstrated a proactive capability, in which the model not only answered the user’s question but also independently identified and visualized a critical data anomaly. This suggests that such AI tools can go beyond the role of simple executors, acting instead as interactive partners in the data exploration process. But they will not do that on their own: they must first be empowered through an appropriate prompt.

So is this world really so ideal?

Despite the promise of democratization, V-NLI tools are plagued by fundamental challenges that have led to their past failures. The first and most significant is the Ambiguity Problem, the ‘Achilles’ heel’ of all natural language systems. Human language is inherently imprecise, which manifests in several ways (the sketch after this list shows one way systems can cope):

  • Linguistic ambiguity: Words have multiple meanings. A query for ‘top customers’ could mean top by revenue, volume, or growth, and a wrong guess immediately destroys user trust.
  • Under-specification: Users are often vague, asking ‘show me sales’ without specifying the timeframe, granularity, or analytical intent (such as a trend versus a total).
  • Domain-specific context: A generic LLM might be useless for a specific business because it doesn’t understand internal jargon or company-specific business logic [16], [17].
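One mitigation, sketched below under my own assumptions, is to refuse to guess: the system enumerates the plausible readings of an ambiguous phrase and asks the user to pick one before executing anything.

```python
from typing import Optional

# A sketch of ambiguity handling: instead of silently guessing, the system
# enumerates plausible interpretations and asks the user to choose.
AMBIGUOUS_TERMS = {
    "top customers": [
        "customers ranked by total revenue",
        "customers ranked by order volume",
        "customers ranked by year-over-year growth",
    ],
}

def clarify(query: str) -> Optional[str]:
    for term, readings in AMBIGUOUS_TERMS.items():
        if term in query.lower():
            options = "\n".join(f"  {i + 1}. {r}" for i, r in enumerate(readings))
            return f"'{term}' is ambiguous. Did you mean:\n{options}"
    return None  # unambiguous enough to execute directly

print(clarify("show me our top customers this quarter"))
```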

Second, even when a tool provides an accurate answer, it is socially useless if the user cannot trust it. This is the ‘Black Box’ problem, as illustrated above by the story of the HR business partner. Because the HR user couldn’t explain the ‘why’ behind the ‘what,’ the insight was rejected. This ‘chain of trust’ is critical. When the V-NLI is an opaque black box, the user becomes a ‘data parrot,’ unable to defend the numbers, which renders the tool unusable in any high-stakes business context.

Finally, there is the ‘Last Mile’ problem of technical and economic feasibility. A user’s simple-sounding question (e.g., ‘show me the lifetime value of customers from our last campaign’) may require a hyper-complex, 200-line SQL query that no current AI can reliably generate. LLMs are not a magic fix for this. Even to be remotely useful, they must be trained on a company-specific, prepared, cleaned, and properly described dataset. Unfortunately, this remains a vast and recurring expense. This leads to a crucial conclusion:

The only viable path forward is a hybrid future.

An ungoverned ‘ask anything’ box is a no-go.

The future of V-NLI is not a generic, all-powerful LLM; it is a flexible LLM (for language) operating on top of a rigid, curated semantic model (for governance, accuracy, and domain-specific knowledge) [18], [19]. Instead of ‘killing’ BI and dashboards, LLMs and V-NLIs will be the opposite: a powerful catalyst. They won’t replace the dashboard or the static report. They will enhance them. We should expect them to be integrated as the next generation of user interface, dramatically improving the quality and utility of data interaction.
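A minimal sketch of this hybrid pattern, under my own assumptions: the LLM is only allowed to fill in a constrained query template whose measures and dimensions come from a governed semantic layer, so it can never invent a column or a metric definition.

```python
# Hybrid pattern sketch: the LLM maps language onto a governed semantic
# model; validation rejects anything outside the approved vocabulary.
SEMANTIC_MODEL = {
    "measures": {"total_sales": "SUM(TotalSales)", "units_sold": "SUM(Quantity)"},
    "dimensions": {"region": "Region", "product": "Product", "date": "Date"},
}

def validate_and_compile(llm_output: dict) -> str:
    """Accept only measures and dimensions defined in the semantic model."""
    measure = SEMANTIC_MODEL["measures"].get(llm_output.get("measure"))
    dim = SEMANTIC_MODEL["dimensions"].get(llm_output.get("group_by"))
    if measure is None or dim is None:
        raise ValueError("Query uses a term outside the governed semantic model.")
    return f"SELECT {dim}, {measure} FROM sales GROUP BY {dim}"

# Suppose the LLM translated 'how do regions compare on revenue?' into:
llm_output = {"measure": "total_sales", "group_by": "region"}
print(validate_and_compile(llm_output))
# -> SELECT Region, SUM(TotalSales) FROM sales GROUP BY Region
```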

Image generated by the author in Gemini.

What will the future bring?

The future of data interaction points toward a hypothetical paradigm shift, moving well beyond a simple search box to a Multi-Modal Agentic System. Imagine a system that operates more like a collaborator and less like a tool. A user, perhaps wearing an AR/VR headset, might ask, ‘Why did our last campaign fail?’ The AI agent would then reason over all available data: not only the sales database, but also unstructured customer feedback emails, the ad creative images themselves, and website logs. Instead of a simple chart, it would proactively present an augmented reality dashboard and offer a predictive conclusion, such as, ‘The creative performed poorly with your target demographic, and the landing page had a 70% bounce rate.’ The crucial evolution is the final ‘agentic’ step: the system wouldn’t stop at the insight but would bridge the gap to action, perhaps concluding:

Y/N_

As scary as it may sound, this vision completes the evolution from simply ‘talking to data’ to actively ‘collaborating with an agent about data’ to achieve an automated, real-world outcome [20].
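That final human ‘yes or no’ can be sketched as a simple gate in the agent loop (again, my illustration, not any shipping product): the agent may propose an action, but changing the real world requires explicit consent.

```python
# A sketch of the final 'agentic' step with a human-in-the-loop gate:
# the agent proposes an action, but execution requires an explicit 'Y'.
def run_with_confirmation(insight: str, proposed_action: str, execute) -> None:
    prompt = f"Insight: {insight}\nProposed action: {proposed_action} Proceed? [Y/N] "
    if input(prompt).strip().upper() == "Y":
        execute()  # e.g., a call into the ad platform's API
    else:
        print("Action declined; insight logged for review.")

run_with_confirmation(
    "Landing page bounce rate was 70% for the target demographic.",
    "pause the campaign.",
    execute=lambda: print("Campaign paused."),
)
```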

I realize this last statement opens up even more questions, but this feels like the right place to pause and turn the conversation over to you. I’m eager to hear your opinions on this. Is a future like this realistic? Is it exciting, or, frankly, a little scary? And in this advanced agentic system, is that final human ‘yes or no’ truly necessary? Or is it the safety mechanism we’ll always want, and need, to keep? I look forward to the discussion.

Concluding remarks

So, will conversational interaction make the data analyst—the one who painstakingly writes queries and manually builds charts—jobless? My conclusion is that the question isn’t about replacement but redefinition.

The pure ‘Star Trek’ vision of an ‘ask anything’ box won’t happen. It is plagued by the ‘Achilles’ heel’ of human language ambiguity and the ‘Black Box’ problem that destroys the trust it needs to operate. The future, therefore, is not a generic, all-powerful LLM.

Instead, the only viable path forward is a hybrid system that combines the flexibility of an LLM with the rigidity of a curated semantic model. This new paradigm doesn’t replace analysts; it elevates them. It frees them from being ‘data plumbers’. It empowers them as strategic partners, working with a new, multi-modal agentic system that can finally bridge the chasm between data, insight, and automated action.

References

[1] Priyanka Jain, Hemant Darbari, Virendrakumar C. Bhavsar, Vishit: A Visualizer for Hindi Text

[2] Christian Spika, Katharina Schwarz, Holger Dammertz, Hendrik Lensch, AVDT – Automatic Visualization of Descriptive Texts

[3] Skylar Walters, Arthea Valderrama, Thomas Smits, David Kouřil, Huyen Nguyen, Sehi L’Yi, Devin Lange, Nils Gehlenborg, GQVis: A Dataset of Genomics Data Questions and Visualizations for Generative AI

[4] Rishab Mitra, Arpit Narechania, Alex Endert, John Stasko, Facilitating Conversational Interaction in Natural Language Interfaces for Visualization

[5] Leixian Shen, Enya Shen, Yuyu Luo, Xiaocong Yang, Xuming Hu, Xiongshuai Zhang, Zhiwei Tai, Jianmin Wang, Towards Natural Language Interfaces for Data Visualization: A Survey

[6] Ecem Kavaz, Anna Puig, Inmaculada Rodríguez, Chatbot-Based Natural Language Interfaces for Data Visualisation: A Scoping Review

[7] Vaishnavi Shah, What Is Conversational Analytics and How Does It Work? – ThoughtSpot

[8] Tyler Dye, How Conversational Analytics Works & How to Implement It – Thematic

[9] Apoorva Verma, Conversational BI for Non-Technical Users: Making Data Accessible and Actionable

[10] Ust Oldfield, Beyond Dashboards: How Conversational AI is Transforming Analytics

[11] Henrik Voigt, Özge Alacam, Monique Meuschke, Kai Lawonn and Sina Zarrieß, The Why and The How: A Survey on Natural Language Interaction in Visualization

[12] Jiayi Zhang, Simon Yu, Derek Chong, Anthony Sicilia, Michael R. Tomz, Christopher D. Manning, Weiyan Shi, Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity

[13] Saadiq Rauf Khan, Vinit Chandak, Sougata Mukherjea, Evaluating LLMs for Visualization Generation and Understanding

[14] Paula Maddigan, Teo Susnjak, Chat2VIS: Generating Data Visualizations via Natural Language Using ChatGPT, Codex and GPT-3 Large Language Models

[15] Best 6 Tools for Conversational AI Analytics

[16] What are the challenges and limitations of natural language processing? – Tencent Cloud

[17] Arjun Srinivasan, John Stasko, Natural Language Interfaces for Data Analysis with Visualization: Considering What Has and Could Be Asked

[18]

[19] Fabi.ai, Addressing the limitations of traditional BI tools for complex analyses

[20] Sarfraz Nawaz, Why Conversational AI Agents Will Replace BI Dashboards in 2025

[*] The Star Trek analogy was generated in ChatGPT and may not accurately reflect the characters’ actions in the series. I haven’t watched it for roughly 30 years 😉.

