Dr. Mike Flaxman, VP of Product Management at HEAVY.AI – Interview Series


Dr. Mike Flaxman is currently the VP of Product at HEAVY.AI, having previously served as Product Manager and led the Spatial Data Science practice in Professional Services. He has spent the last 20 years working in spatial environmental planning. Prior to HEAVY.AI, he founded Geodesign Technologies, Inc and cofounded GeoAdaptive LLC, two startups applying spatial analysis technologies to planning. Before startup life, he was a professor of planning at MIT and Industry Manager at ESRI.

HEAVY.AI is a hardware-accelerated platform for real-time, high-impact data analytics. It leverages both GPU and CPU processing to query massive datasets quickly, with support for SQL and geospatial data. The platform includes visual analytics tools for interactive dashboards, cross-filtering, and scalable data visualizations, enabling efficient big data analysis across various industries.

Can you tell us about your professional background and what led you to join HEAVY.AI?

Before joining HEAVY.AI, I spent years in academia, ultimately teaching spatial analytics at MIT. I also ran a small consulting firm with a number of public-sector clients. I’ve been involved in GIS projects across 17 countries. My work has taken me from advising organizations like the Inter-American Development Bank to managing GIS technology for architecture, engineering, and construction at ESRI, the world’s largest GIS developer.

I remember vividly my first encounter with what’s now HEAVY.AI, when as a consultant I was responsible for scenario planning for the Florida Beaches Habitat Conservation Program. My colleagues and I were struggling to model sea turtle habitat using 30m Landsat data, and a friend pointed me to some brand new and very relevant data – 5cm LiDAR. It was exactly what we needed scientifically, but something like 3,600 times larger than what we’d planned to use. Needless to say, nobody was going to increase my budget by even a fraction of that amount. So that day I put down the tools I’d been using and teaching for several decades and went looking for something new. HEAVY.AI sliced through and rendered that data so easily and effortlessly that I was immediately hooked.

Fast forward a few years, and I still think what HEAVY.AI does is pretty unique, and its early bet on GPU analytics is exactly where the industry still needs to go. HEAVY.AI is firmly focused on democratizing access to big data. Part of that is of course the data volume and processing speed component, essentially giving everyone their own supercomputer. But an increasingly important aspect, with the advent of large language models, is making spatial modeling accessible to many more people. These days, rather than spending years learning a complex interface with thousands of tools, you can just start a conversation with HEAVY.AI in the human language of your choice. The system not only generates the required commands but also presents relevant visualizations.

Behind the scenes, delivering ease of use is of course very difficult. Currently, as the VP of Product Management at HEAVY.AI, I’m heavily involved in determining which features and capabilities we prioritize for our products. My extensive background in GIS allows me to truly understand the needs of our customers and guide our development roadmap accordingly.

How has your previous experience in spatial environmental planning and startups influenced your work at HEAVY.AI?

Environmental planning is a particularly difficult domain in that you need to account for both diverse sets of human needs and the natural world. The general solution I learned early was to pair a method called participatory planning with the technologies of remote sensing and GIS. Before choosing a course of action, we’d build multiple scenarios and simulate their positive and negative impacts in the computer using visualizations. Using participatory processes lets us combine various kinds of expertise and solve very complex problems.

While we don’t typically do environmental planning at HEAVY.AI, this pattern still works very well in business settings. So we help customers build digital twins of key parts of their business, and we let them create and evaluate business scenarios quickly.

I suppose my teaching experience has given me deep empathy for software users, particularly users of complex software systems. Where one student stumbles in one spot is random, but where dozens or hundreds of people make similar errors, you’ve got a design issue. Perhaps my favorite part of software design is taking these learnings and applying them in designing new generations of systems.

Can you explain how HeavyIQ leverages natural language processing to facilitate data exploration and visualization?

These days it seems everyone and their brother is touting a brand-new genAI model, most of them forgettable clones of one another. We’ve taken a very different path. We believe that accuracy, reproducibility, and privacy are essential characteristics for any business analytics tools, including those generated with large language models (LLMs), so we have built those into our offering at a fundamental level. For example, we constrain model inputs strictly to enterprise databases and to source documents within an enterprise security perimeter. We also constrain outputs to the latest HeavySQL and Charts. That means that whatever question you ask, we’ll try to answer it with your data, and we’ll show you exactly how we derived that answer.
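To make the output constraint concrete, here is a minimal sketch of the idea, with hypothetical table names and a deliberately naive parser; it is not HEAVY.AI’s actual implementation. Generated SQL is checked against an allowlist of enterprise tables before anything is executed:

```python
# Minimal sketch of an output guardrail: generated SQL may only reference
# tables inside the enterprise security perimeter. Table names are assumed;
# a production system would use a real SQL parser, not a regex.
import re

ALLOWED_TABLES = {"call_records", "cell_towers", "weather_obs"}

def extract_tables(sql: str) -> set:
    # Naively capture identifiers following FROM/JOIN (illustration only).
    return set(re.findall(r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)",
                          sql, re.IGNORECASE))

def validate_generated_sql(sql: str) -> str:
    unknown = extract_tables(sql) - ALLOWED_TABLES
    if unknown:
        raise ValueError(f"Generated SQL references unapproved tables: {unknown}")
    return sql  # safe to hand to the database

print(validate_generated_sql("SELECT COUNT(*) FROM call_records WHERE dropped = true"))
```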

With those guarantees in place, it matters less to our customers exactly how we process the queries. But behind the scenes, another important difference relative to consumer genAI is that we fine-tune models extensively against the specific kinds of questions business users ask of business data, including spatial data. So, for example, our model is very good at performing spatial and time-series joins, which aren’t in classical SQL benchmarks but which our users run every day.
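As an illustration of the kind of query this covers, here is a hypothetical spatial and time-series join that counts dropped calls per hour per service region. The schema is invented, and the functions follow common spatial SQL conventions rather than verified HeavySQL syntax:

```python
# Hypothetical example of a combined spatial + time-series join: hourly
# dropped-call counts, attributed to the service polygon containing each call.
# ST_Contains and DATE_TRUNC follow PostGIS/standard SQL conventions; the
# exact HeavySQL spelling may differ.
SPATIAL_TEMPORAL_JOIN = """
SELECT
  DATE_TRUNC('hour', c.call_time) AS hour_bucket,
  s.region_name,
  COUNT(*) AS dropped_calls
FROM call_records AS c
JOIN service_areas AS s
  ON ST_Contains(s.geom, c.location)
WHERE c.dropped = true
GROUP BY hour_bucket, s.region_name
ORDER BY dropped_calls DESC;
"""
```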

We package these core capabilities into a Notebook interface we call HeavyIQ. HeavyIQ is about making data exploration and visualization as intuitive as possible by using natural language processing (NLP). You ask a question in English – like, “What were the weather patterns in California last week?” – and HeavyIQ translates that into SQL queries that our GPU-accelerated database processes quickly. The results are presented not just as data but as visualizations – maps, charts, whatever’s most relevant. It’s about enabling fast, interactive querying, especially when dealing with large or fast-moving datasets. What’s key here is that it’s often not the first question you ask, but perhaps the third, that really gets to the core insight, and HeavyIQ is designed to facilitate that deeper exploration.
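A schematic version of that round trip might look like the following. Here `translate_to_sql` is a stand-in for the fine-tuned model and `cursor` is any DB-API cursor, so this illustrates the flow rather than HeavyIQ’s real API:

```python
# Illustrative question -> SQL -> results round trip (not HeavyIQ's actual API).

def translate_to_sql(question: str) -> str:
    # Placeholder for the LLM call; in HeavyIQ a fine-tuned model does this step.
    return ("SELECT obs_date, AVG(temp_c) AS avg_temp "
            "FROM weather_obs WHERE state = 'CA' "
            "AND obs_date >= CURRENT_DATE - 7 "
            "GROUP BY obs_date ORDER BY obs_date")

def answer(question: str, cursor):
    sql = translate_to_sql(question)   # natural language -> SQL
    cursor.execute(sql)                # GPU-accelerated execution
    rows = cursor.fetchall()           # feed a table, map, or chart
    return sql, rows                   # return the derivation alongside the data

# sql, rows = answer("What were the weather patterns in California last week?", cur)
```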

What are the primary advantages of using HeavyIQ over traditional BI tools for telcos, utilities, and government agencies?

HeavyIQ excels in environments where you’re dealing with large-scale, high-velocity data – exactly the kind of data telcos, utilities, and government agencies handle. Traditional business intelligence tools often struggle with the volume and speed of this data. For instance, in telecommunications, you might have billions of call records, but it’s the tiny fraction of dropped calls that you need to focus on. HeavyIQ lets you sift through that data 10 to 100 times faster thanks to our GPU infrastructure. This speed, combined with the ability to interactively query and visualize data, makes it invaluable for risk analytics in utilities or real-time scenario planning for government agencies.

The other advantage, already alluded to above, is that spatial and temporal SQL queries are extremely powerful analytically – but can be slow or difficult to write by hand. When a system operates at what we call “the speed of curiosity,” users can ask both more questions and more nuanced questions. So, for example, a telco engineer might notice a temporal spike in equipment failures from a monitoring system, have the intuition that something is going wrong at a particular facility, and check this with a spatial query returning a map.
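That two-step drill-down could look something like the following in SQL. The schema, timestamps, and coordinates are invented for illustration, and the spatial functions again use PostGIS-style naming:

```python
# Step 1 (hypothetical): find the temporal spike in equipment failures.
SPIKE_BY_HOUR = """
SELECT DATE_TRUNC('hour', failed_at) AS hr, COUNT(*) AS failures
FROM equipment_events
WHERE status = 'FAIL'
GROUP BY hr
ORDER BY hr;
"""

# Step 2 (hypothetical): map failures within 5 km of the suspect facility
# during the spike window, confirming or refuting the intuition spatially.
NEAR_FACILITY = """
SELECT device_id, failed_at, location
FROM equipment_events
WHERE status = 'FAIL'
  AND failed_at BETWEEN '2024-06-01 14:00' AND '2024-06-01 16:00'
  AND ST_DWithin(location, ST_Point(-80.19, 25.76), 5000.0);
"""
```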

What measures are in place to prevent metadata leakage when using HeavyIQ?

As described above, we’ve built HeavyIQ with privacy and security at its core. This includes not only data but also several kinds of metadata. We use column- and table-level metadata extensively in determining which tables and columns contain the information needed to answer a question. We also use internal company documents, where provided, to assist in what’s called retrieval-augmented generation (RAG). Finally, the language models themselves generate further metadata. All of these, but especially the latter two, can be highly business-sensitive.
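A toy sketch of that metadata-driven retrieval step, with invented tables and a deliberately simple scoring rule, shows the general shape: only tables whose descriptions overlap the question are placed in the model’s context:

```python
# Toy illustration of metadata-driven table selection for RAG (invented schema):
# rank tables by word overlap between the question and their descriptions.
# Real systems would use embeddings, but the role of the metadata is the same.
import re

TABLE_METADATA = {
    "call_records": "records of telecom calls: start time, duration, dropped flag, caller location",
    "weather_obs": "hourly weather observations: temperature, wind, station location",
}

def rank_tables(question: str) -> list:
    q_terms = set(re.findall(r"\w+", question.lower()))
    scores = {table: len(q_terms & set(re.findall(r"\w+", desc.lower())))
              for table, desc in TABLE_METADATA.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(rank_tables("How many dropped calls happened during high wind?"))
# ['call_records', 'weather_obs'] -- call_records matches "dropped" and "calls"
```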

Unlike third-party models, where your data is often sent off to external servers, HeavyIQ runs locally on the same GPU infrastructure as the rest of our platform. This ensures that your data and metadata remain under your control, with no risk of leakage. For organizations that require the highest levels of security, HeavyIQ can even be deployed in a completely air-gapped environment, ensuring that sensitive information never leaves specific equipment.

How does HEAVY.AI achieve high performance and scalability with massive datasets using GPU infrastructure?

The secret sauce is really in avoiding the data movement prevalent in other systems. At its core, this starts with a purpose-built database that’s designed from the ground up to run on NVIDIA GPUs. We’ve been working on this for over 10 years now, and we truly believe we have the best-in-class solution when it comes to GPU-accelerated analytics.

Even the best CPU-based systems run out of steam well before a middling GPU. The strategy once this happens on CPU requires distributing data across multiple cores and then multiple systems (so-called ‘horizontal scaling’). This works well in some contexts where things are less time-critical, but it generally starts getting bottlenecked on network performance.

In addition to avoiding all of this data movement on queries, we also avoid it on many other common tasks. First, we can render graphics without moving the data. Then, if you want ML inference modeling, we again do that without data movement. And if you interrogate the data with a large language model, we once again do that without data movement. Even if you’re a data scientist and want to interrogate the data from Python, we provide methods to do that on GPU without data movement.
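The zero-copy principle behind all of this can be illustrated with DLPack, which lets GPU libraries hand the same device buffer back and forth without copying. This is a generic CuPy/PyTorch example of the idea, not HEAVY.AI’s internal mechanism, and it requires a CUDA GPU with both libraries installed:

```python
# Zero-copy GPU interchange via DLPack: two libraries, one device buffer.
import cupy as cp
import torch

gpu_array = cp.arange(1_000_000, dtype=cp.float32)  # data resident on the GPU
tensor = torch.from_dlpack(gpu_array)               # zero-copy handoff, no transfer
tensor *= 2                                         # "ML-side" work on the shared buffer
print(cp.asnumpy(gpu_array[:3]))                    # [0. 2. 4.] -- same memory, no copy
```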

What that means in practice is that we can perform not only queries but also rendering 10 to 100 times faster than traditional CPU-based databases and map servers. When you’re dealing with the massive, high-velocity datasets that our customers work with – things like weather models, telecom call records, or satellite imagery – that kind of performance boost is absolutely essential.

How does HEAVY.AI maintain its competitive edge in the fast-evolving landscape of big data analytics and AI?

That’s a great question, and it’s something we think about constantly. The landscape of big data analytics and AI is evolving at an incredibly rapid pace, with new breakthroughs and innovations happening all the time. It certainly doesn’t hurt that we have a ten-year head start on GPU database technology.

I think the key for us is to stay laser-focused on our core mission – democratizing access to big, geospatial data. That means continually pushing the boundaries of what’s possible with GPU-accelerated analytics and ensuring our products deliver unparalleled performance and capabilities in this domain. A big part of that is our ongoing investment in developing custom, fine-tuned language models that truly understand the nuances of spatial SQL and geospatial analysis.

We’ve built up an extensive library of training data, going well beyond generic benchmarks, to ensure our conversational analytics tools can engage with users in a natural, intuitive way. But we also know that technology alone isn’t enough. We have to stay deeply connected to our customers and their evolving needs. At the end of the day, our competitive edge comes down to our relentless focus on delivering transformative value to our users. We’re not just keeping pace with the market – we’re pushing the boundaries of what’s possible with big data and AI. And we’ll continue to do so, no matter how quickly the landscape evolves.

How does HEAVY.AI support emergency response efforts through HeavyEco?

We built HeavyEco when we saw some of our largest utility customers having significant challenges simply ingesting today’s weather model outputs, as well as visualizing them for joint comparisons. It was taking one customer up to four hours just to load data, and when you’re up against fast-moving extreme weather conditions like fires…that’s just not good enough.

HeavyEco is designed to provide real-time insights in high-consequence situations, like during a wildfire or flood. In such scenarios, you need to make decisions quickly and based on the best possible data. So HeavyEco serves first and foremost as a professionally managed data pipeline for authoritative models such as those from NOAA and USGS. On top of those, HeavyEco lets you run scenarios, model building-level impacts, and visualize data in real time. This gives first responders the critical information they need when it matters most. It’s about turning complex, large-scale datasets into actionable intelligence that can guide immediate decision-making.
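Schematically, such a pipeline reduces to a fetch-and-bulk-load loop; the URL and loader function below are placeholders rather than HeavyEco internals:

```python
# Schematic ingest loop (hypothetical endpoint and loader, not HeavyEco's code):
# fetch the latest published forecast file and bulk-load it for querying.
import urllib.request

FORECAST_URL = "https://example.com/noaa/latest_forecast.parquet"  # placeholder URL

def ingest_once(load_table):
    # `load_table(name, path)` stands in for the database's bulk loader.
    local_path, _ = urllib.request.urlretrieve(FORECAST_URL, "latest_forecast.parquet")
    load_table("forecast_grid", local_path)

# Run on a schedule (e.g., each model cycle) so data is queryable in minutes, not hours.
```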

Ultimately, our goal is to give our users the ability to explore their data at the speed of thought. Whether they’re running complex spatial models, comparing weather forecasts, or trying to identify patterns in geospatial time series, we want them to be able to do it seamlessly, without any technical barriers getting in their way.

What distinguishes HEAVY.AI’s proprietary LLM from other third-party LLMs when it comes to accuracy and performance?

Our proprietary LLM is specifically tuned for the types of analytics we focus on – like text-to-SQL and text-to-visualization. We initially tried traditional third-party models but found they didn’t meet the high accuracy requirements of our users, who are often making critical decisions. So we fine-tuned a range of open-source models and tested them against industry benchmarks.

Our LLM is much more accurate for the advanced SQL concepts our users need, particularly in geospatial and temporal data. Additionally, because it runs on our own GPU infrastructure, it’s also more secure.

In addition to the built-in model capabilities, we also provide a full interactive user interface for administrators and users to add domain- or business-relevant metadata. For example, if the base model doesn’t perform as expected, you can import or tweak column-level metadata, or add guidance information, and immediately get feedback.
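The shape of that guidance can be as simple as short, human-written column notes that are injected into the model’s context; the structures below are hypothetical:

```python
# Hypothetical column-level guidance: brief notes that correct how the model
# interprets specific columns, assembled into prompt context.
COLUMN_GUIDANCE = {
    "call_records.dur_ms": "Call duration in milliseconds (not seconds).",
    "call_records.cell_id": "Joins to cell_towers.id, not to site_code.",
}

def guidance_context(table: str) -> str:
    notes = [f"{col}: {note}" for col, note in COLUMN_GUIDANCE.items()
             if col.startswith(table + ".")]
    return "Column notes:\n" + "\n".join(notes)

print(guidance_context("call_records"))
```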

How does HEAVY.AI envision the role of geospatial and temporal data analytics in shaping the future of various industries?

We believe geospatial and temporal data analytics are going to be critical for the future of many industries. What we’re really focused on is helping our customers make better decisions, faster. Whether you’re in telecom, utilities, government, or elsewhere, having the ability to analyze and visualize data in real time can be a game-changer.

Our mission is to make this kind of powerful analytics accessible to everyone, not just the big players with massive resources. We want to ensure that our customers can take advantage of the data they have, to stay ahead and solve problems as they arise. As data continues to grow and become more complex, we see our role as making sure our tools evolve right alongside it, so our customers are always prepared for what’s next.
