For a long time, handling streaming data was considered a niche approach. Since the introduction of relational database management systems in the 1970s and traditional data warehousing systems in the late 1980s, all data workloads began and ended with so-called batch processing. Batch processing relies on the concept of collecting numerous tasks in a group (or batch) and processing these tasks in a single operation.
On the flip side, there is the concept of streaming data. Although streaming data is still sometimes considered a cutting-edge technology, it already has a solid history. Everything began in 2002, when Stanford University researchers published the paper called “Models and Issues in Data Stream Systems”. However, it wasn’t until almost a decade later (2011), when the Apache Kafka platform for storing and processing streaming data was open-sourced, that streaming data systems began to reach a wider audience. The rest is history, as people say. Nowadays, processing streaming data is not considered a luxury, but a necessity.
Microsoft recognized the growing need to process data “as it happens”. Hence, Microsoft Fabric doesn’t disappoint in that regard: Real-Time Intelligence is at the core of the entire platform and offers a whole range of capabilities to handle streaming data efficiently.
Before we dive deep into explaining each component of Real-Time Intelligence, let’s take one step back and look at stream processing in general from a more tool-agnostic perspective.
If you enter the phrase from the section title in Google Search, you’ll get more than 100,000 results! Therefore, I’m sharing an illustration that represents my understanding of stream processing.
Let’s now examine typical use cases for stream processing:
- Fraud detection
- Real-time stock trades
- Customer activity
- Log monitoring — troubleshooting systems, devices, etc.
- Security information and event management — analyzing logs and real-time event data for monitoring and threat detection
- Warehouse inventory
- Ride share matching
- Machine learning and predictive analytics
As you may have noticed, streaming data has become an integral part of numerous real-life scenarios and is considered vastly superior to traditional batch processing for the use cases listed above.
Let’s now explore how streaming data processing is performed in Microsoft Fabric and which tools of the trade we have at our disposal.
The following illustration shows a high-level overview of all Real-Time Intelligence components in Microsoft Fabric:

Real-Time hub
Let’s kick things off by introducing the Real-Time hub. Every Microsoft Fabric tenant automatically provisions a Real-Time hub. This is the focal point for all data-in-motion across the entire organization. Similar to OneLake, there can be one, and only one, Real-Time hub per tenant; you can’t provision or create multiple Real-Time hubs.
The primary purpose of the Real-Time hub is to enable quick and easy discovery, ingestion, management, and consumption of streaming data from a wide range of sources. The following illustration provides an overview of all the data streams in the Real-Time hub in Microsoft Fabric:

Let’s now explore all of the available options in the Real-Time hub.
- The All data streams tab displays all the streams and tables you can access. Streams represent the output from Fabric eventstreams, whereas tables come from KQL databases. We’ll explore both eventstreams and KQL databases in more detail in the following sections
- The My data streams tab shows all the streams you brought into Microsoft Fabric, into My workspace
- The Data sources tab is at the core of bringing data into Fabric, both from inside and outside the platform. Once you find yourself in the Data sources tab, you can choose from numerous out-of-the-box connectors, such as Kafka, CDC streams for various database systems, external cloud solutions like AWS and GCP, and many more
- The Microsoft sources tab filters the previous set of sources to include Microsoft data sources only
- The Fabric events tab displays the list of system events generated in Microsoft Fabric that you can access. Here, you can choose from Job events, OneLake events, and Workspace item events. Let’s dive into each of these three options:
- Job events are produced by status changes of Fabric monitor activities, such as job created, succeeded, or failed
- OneLake events are produced by actions on files and folders in OneLake, such as file created, deleted, or renamed
- Workspace item events are produced by actions on workspace items, such as item created, deleted, or renamed
- The Azure events tab shows the list of system events generated in Azure Blob Storage
The Real-Time hub provides various connectors for ingesting data into Microsoft Fabric. It also enables creating streams for all of the supported sources. After a stream is created, you can process, analyze, and act on it.
- Processing a stream allows you to apply numerous transformations, such as aggregate, filter, and union. The goal is to transform the data before you send the output to supported destinations
- Analyzing a stream allows you to add a KQL database as a destination of the stream, and then open the KQL database and execute queries against it
- Acting on streams means setting alerts based on conditions and specifying actions to be taken when those conditions are met
Eventstreams
If you’re a low-code or no-code data professional and you need to handle streaming data, you’ll love eventstreams. In a nutshell, an eventstream allows you to connect to numerous data sources (which we examined in the previous section), optionally apply various data transformation steps, and finally output the results into one or more destinations. The following figure illustrates a typical workflow for ingesting streaming data into three different destinations: Eventhouse, Lakehouse, and Activator:

Within the Eventstream settings, you can adjust the retention period for the incoming data. By default, data is retained for one day, and events are automatically removed when the retention period expires.
Apart from that, you may also want to fine-tune the event throughput for incoming and outgoing events. There are three options to choose from:
- Low: < 10 MB/s
- Medium: 10-100 MB/s
- High: > 100 MB/s
Eventhouse and KQL database
In the previous section, you learned how to connect to various streaming data sources, optionally transform the data, and finally load it into its final destination. As you may have noticed, one of the available destinations is the Eventhouse. In this section, we’ll explore the Microsoft Fabric items used to store data within the Real-Time Intelligence workload.
Eventhouse
We’ll first introduce the Eventhouse item. The Eventhouse is nothing more than a container for KQL databases. The Eventhouse itself doesn’t store any data; it simply provides the infrastructure within the Fabric workspace for dealing with streaming data. The following figure displays the System overview page of the Eventhouse:

The beauty of the System overview page is that it provides all the key information at a glance. Hence, you can immediately see the running state of the eventhouse, OneLake storage usage (further broken down to the individual KQL database level), compute usage, the most active databases and users, and recent events.
If we switch to the Databases page, we can see a high-level overview of the KQL databases that are part of the existing Eventhouse, as shown below:

You can create multiple eventhouses in a single Fabric workspace. Also, a single eventhouse may contain one or more KQL databases:

Let’s wrap up the story about the Eventhouse by explaining the concept of minimum consumption. By design, the Eventhouse is optimized to auto-suspend services when not in use. Consequently, when these services are reactivated, it may take some time before the Eventhouse is fully available again. However, there are certain business scenarios in which this latency is not acceptable. In those scenarios, make sure to configure the Minimum consumption feature. With Minimum consumption configured, the service is always available, but you are responsible for determining the minimum level, which is then available for the KQL databases inside the Eventhouse.
KQL database
Now that you’ve learned about the Eventhouse container, let’s focus on the core item for storing real-time analytics data: the KQL database.
Let’s take one step back and explain the name of the item first. While most data professionals have at least heard of SQL (which stands for Structured Query Language), I’m quite confident that KQL is much more cryptic than its “structured” relative.
You may have rightly assumed that the QL in the abbreviation stands for Query Language. But what does the letter K represent? It’s an abbreviation for Kusto. I hear you, I hear you: what on earth is Kusto?! Although urban legend says that the language was named after the famous polymath and oceanographer Jacques Cousteau (his last name is pronounced “Kusto”), I couldn’t find any official confirmation from Microsoft to verify this story. What is definitely known is that Kusto was the internal project name for the Log Analytics Query Language.
While we’re on the subject of history, let’s share one more history lesson. If you’ve ever worked with Azure Data Explorer (ADX) in the past, you’re in luck: the KQL database in Microsoft Fabric is the official successor of ADX. Similar to many other Azure data services that were rebuilt and integrated into the SaaS-fied nature of Fabric, ADX provided the platform for storing and querying real-time analytics data, and the engine and core capabilities of the KQL database are the same as in Azure Data Explorer. The key difference is the management behavior: Azure Data Explorer is a PaaS (Platform-as-a-Service) solution, whereas the KQL database is a SaaS (Software-as-a-Service) solution.
Although you can store any kind of data in the KQL database (unstructured, semi-structured, and structured), its primary purpose is handling telemetry, logs, events, traces, and time-series data. Under the hood, the engine leverages optimized storage formats, automatic indexing and partitioning, and advanced data statistics for efficient query planning.
Let’s now examine how to leverage the KQL database in Microsoft Fabric to store and query real-time analytics data. Creating a database is as straightforward as it could possibly be. The following figure illustrates the two-step process of creating a KQL database in Fabric:

- Click on the “+” sign next to KQL databases
- Provide the database name and choose its type. The type can be the default new database or a shortcut database. A shortcut database is a reference to a different database, which can be either another KQL database in Real-Time Intelligence in Microsoft Fabric or an Azure Data Explorer database
Let’s now take a quick tour of the key features of the KQL database from the user-interface perspective. The figure below illustrates the main points of interest:

- Tables – displays all the tables in the database
- Shortcuts – shows tables created as OneLake shortcuts
- Materialized views – a materialized view represents an aggregation query over a source table or over another materialized view. It consists of a single summarize statement (see the sketch after this list)
- Functions – user-defined functions stored and managed at the database level, similar to tables. These functions are created by using the .create function command (also shown in the sketch after this list)
- Data streams – all the streams that are relevant for the selected KQL database
- Data Activity Tracker – shows the activity in the database for the selected time period
- Tables/Data preview – enables switching between two different views. Tables displays a high-level overview of the database tables, whereas Data preview shows the top 100 records of the selected table
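To make the Materialized views and Functions entries more tangible, here is a minimal KQL sketch. It assumes the sample Weather table used later in this article; the names EventCountsByState and GetEventsByState are illustrative, not built-in:
//A materialized view is defined by a single summarize statement
.create materialized-view EventCountsByState on table Weather
{
    Weather
    | summarize TotalRecords = count() by State
}

//A user-defined function is stored at the database level via the .create function command
.create function with (docstring = "Storm events for a given state")
GetEventsByState(state: string)
{
    Weather
    | where State == state
}
Once created, the function can be invoked like any table expression, for example GetEventsByState('TEXAS').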
Query and visualize data in Real-Time Intelligence
Now that you’ve learned how to store real-time analytics data in Microsoft Fabric, it’s time to get our hands dirty and squeeze some business insight out of this data. In this section, I’ll introduce common KQL operators and functions for data retrieval, and explore Real-Time Dashboards for visualizing the data stored in the KQL database.
KQL queryset
The KQL queryset is the Fabric item used to run queries, and to view and customize results from various data sources. As soon as you create a new KQL database, a KQL queryset item is provisioned out of the box. This is the default KQL queryset, which is automatically connected to the KQL database under which it exists. The default KQL queryset doesn’t allow multiple connections.
On the flip side, when you create a custom KQL queryset item, you can connect it to multiple data sources, as shown in the following illustration:

Let’s now introduce the building blocks of KQL and examine some of the most commonly used operators and functions. KQL is a fairly simple yet powerful language. To some extent, it’s very similar to SQL, especially in terms of using schema entities that are organized in hierarchies, such as databases, tables, and columns.
The most common kind of KQL query statement is a tabular expression statement. This means that both the query input and the query output consist of tables or tabular datasets. Operators in a tabular statement are sequenced by the “|” (pipe) symbol, and data flows (is piped) from one operator to the next, as displayed in the following code snippet:
MyTable
| where StartTime between (datetime(2024-11-01) .. datetime(2024-12-01))
| where State == "Texas"
| count
In the above code example, the data in MyTable is first filtered on the StartTime column, then filtered on the State column. Finally, the query returns a table with a single column and a single row, displaying the count of the filtered rows.
A fair question at this point would be: what if I already know SQL? Do I need to learn yet another language just for the sake of querying real-time analytics data? The answer is, as usual: it depends.
Luckily, I have both good and great news to share here!
The good news is: you CAN write SQL statements to query the data stored in the KQL database. But the fact that you can do something doesn’t mean you should… By sticking to SQL-only queries, you are missing the point and limiting yourself, because many KQL-specific functions are built to handle real-time analytics queries in the most efficient way.
The great news is: by leveraging the explain operator, you can “ask” Kusto to translate your SQL statement into an equivalent KQL statement, as displayed in the following figure:

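To give you a feeling for how this works, here is a minimal sketch: prefix a SQL query with the explain keyword, and Kusto returns the equivalent KQL statement. The query assumes the sample Weather table introduced below:
//Ask Kusto to translate a SQL statement into KQL
explain
select State, count(*) as TotalRecords
from Weather
group by State
The output is a KQL statement along the lines of Weather | summarize TotalRecords = count() by State | project State, TotalRecords.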
In the following examples, we’ll query the sample Weather dataset, which contains data about weather storms and damage in the USA. Let’s start simple and then introduce some more complex queries. In the first example, we’ll count the number of records in the Weather table:
//Count records
Weather
| count
Wondering how to retrieve only a subset of records? You can use either the take or the sample operator:
//Sample data
Weather
| take 10
Please keep in mind that the take operator won’t return the TOP n records unless your data is sorted in a specific order. In general, the take operator returns any n records from the table. If you need a predictable result, sort the data first, as shown below.
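A minimal sketch of that sort-then-take pattern:
//Deterministic subset: sort first, then take
Weather
| sort by StartTime desc
| take 10
This sort-then-take combination is essentially what the top operator, used in a later example, performs in a single step.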
In the next step, we want to extend this query and return not only a subset of rows, but also a subset of columns:
//Sample data from a subset of columns
Weather
| take 10
| project State, EventType, DamageProperty
The project operator is the equivalent of the SELECT statement in SQL. It specifies which columns should be included in the result set.
In the following example, we create a calculated column, Duration, that represents the duration between the EndTime and StartTime values. In addition, we want to display only the top 10 records, sorted by the DamageProperty value in descending order:
//Create calculated columns
Weather
| where State == 'NEW YORK' and EventType == 'Winter Weather'
| top 10 by DamageProperty desc
| project StartTime, EndTime, Duration = EndTime - StartTime, DamageProperty
This is the right moment to introduce the summarize operator. This operator produces a table that aggregates the content of the input table. Hence, the following statement will display the total number of records per state, including only the top 5 states:
//Use summarize operator
Weather
| summarize TotalRecords = count() by State
| top 5 by TotalRecords
Let’s expand on the previous code and visualize the data directly in the result set. I’ll add one more line of KQL code to render the results as a bar chart:
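For reference, here is that query in full; render is the single extra line (barchart is one of several supported visualization types):
//Render the result as a bar chart
Weather
| summarize TotalRecords = count() by State
| top 5 by TotalRecords
| render barchart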

As you may notice, the chart can be further customized from the Visual formatting pane on the right-hand side, which provides even more flexibility when visualizing the data stored in the KQL database.
These were just basic examples of using the KQL language to retrieve the data stored in Eventhouses and KQL databases. I can assure you that KQL won’t let you down in more advanced use cases, whenever you need to manipulate and retrieve real-time analytics data.
I understand that SQL is the “mother tongue” of many data professionals. And although you can write SQL to retrieve the data from the KQL database, I strongly encourage you to refrain from doing so. As a quick reference, I’m providing a “SQL to KQL cheat sheet” to give you a head start when transitioning from SQL to KQL.
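To illustrate the kind of mappings such a cheat sheet contains, here are a few common SQL-to-KQL equivalents (an illustrative, non-exhaustive sketch):
//SELECT * FROM Weather                  -->  Weather
//SELECT State, EventType FROM Weather   -->  Weather | project State, EventType
//SELECT COUNT(*) FROM Weather           -->  Weather | count
//WHERE State = 'TEXAS'                  -->  | where State == 'TEXAS'
//GROUP BY State                         -->  | summarize count() by State
//ORDER BY State DESC                    -->  | sort by State desc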
Also, my friend and fellow MVP Brian Bønk published and maintains a fantastic reference guide for the KQL language here. Make sure to give it a try if you are working with KQL.
Real-Time Dashboards
While KQL querysets represent a powerful way of exploring and querying data stored in Eventhouses and KQL databases, their visualization capabilities are fairly limited. Yes, you can visualize results in the query view, as you’ve seen in one of the previous examples, but that is more of a “first aid” visualization that won’t make your managers and business decision-makers happy.
Fortunately, there is an out-of-the-box solution in Real-Time Intelligence that supports advanced data visualization concepts and features. The Real-Time Dashboard is a Fabric item that enables the creation of interactive and visually appealing business-reporting solutions.
Let’s first identify the core elements of the Real-Time Dashboard. A dashboard consists of one or more tiles, optionally structured and arranged in pages, where each tile is populated by an underlying KQL query.
As a first step in the process of creating Real-Time Dashboards, the corresponding setting must be enabled in the Admin portal of your Fabric tenant:

Next, you should create a new Real-Time Dashboard item in the Fabric workspace. From there, let’s connect to our Weather dataset and configure our first dashboard tile. We’ll execute one of the queries from the previous section to retrieve the top 10 states using a conditional count. The figure below shows the tile settings panel with the numerous options you can configure:

- KQL query to populate the tile
- Visual representation of the information
- Visual formatting pane with options to set the tile name and description
- Visual type drop-down menu to select the desired visual type (in our case, a table visual)
Let’s now add two more tiles to our dashboard. I’ll copy and paste two queries that we used previously: the first retrieves the top 5 states by total number of records, whereas the other displays the change of the property damage value over time for the state of New York and the Winter Weather event type (see the sketch below):
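For reference, here is a sketch of those two queries. The first is taken verbatim from the earlier section; the second is my reconstruction of the time-based damage query, where the daily bin size is an assumption:
//Tile 2: top 5 states by total number of records
Weather
| summarize TotalRecords = count() by State
| top 5 by TotalRecords

//Tile 3: property damage over time for winter weather in New York
//(the daily bin size is an illustrative choice)
Weather
| where State == 'NEW YORK' and EventType == 'Winter Weather'
| summarize TotalDamage = sum(DamageProperty) by bin(StartTime, 1d)
| render timechart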

You can also add a tile directly from the KQL queryset to an existing dashboard, as illustrated below:

Let’s now focus on the various capabilities you have when working with Real-Time Dashboards. In the top ribbon, you’ll find options to add a new data source, set a new parameter, and add base queries. However, what really makes Real-Time Dashboards powerful is the possibility to set alerts on them. Depending on whether the conditions defined in the alert are met, you can trigger a specific action, such as sending an email or a Microsoft Teams message. An alert is created using the Activator item.

Visualize data with Power BI
Power BI is a mature and widely adopted tool for building robust, scalable, and interactive business-reporting solutions. In this section, we’ll specifically focus on examining how Power BI works in synergy with the Real-Time Intelligence workload in Microsoft Fabric.
Creating a Power BI report based on the data stored in the KQL database couldn’t be easier. You can choose to create a Power BI report directly from the KQL queryset, as displayed below:

Each query in the KQL queryset represents a table in the Power BI semantic model. From here, you can build visualizations and leverage all the existing Power BI features to design an effective, visually appealing report.
Obviously, you can still leverage the “regular” Power BI workflow, which assumes connecting from Power BI Desktop to a KQL database as a data source. In this case, you need to open the OneLake data hub and select KQL Databases as the data source:

The same as with SQL-based data sources, you can choose between the Import and DirectQuery storage modes for your real-time analytics data. Import mode creates a local copy of the data in Power BI, whereas DirectQuery enables querying the KQL database in near-real time.
Activator
Activator is one of the most innovative features in the entire Microsoft Fabric realm. I’ll cover Activator in detail in a separate article; here, I just want to introduce this service and briefly emphasize its main characteristics.
Activator is a no-code solution for automatically taking actions when conditions in the underlying data are met. Activator can be used in conjunction with eventstreams, Real-Time Dashboards, and Power BI reports. Once the data hits a certain threshold, Activator automatically triggers the specified action, for example, sending an email or a Microsoft Teams message, or even firing Power Automate flows. I’ll cover all these scenarios in more depth in a separate article, where I’ll also provide some practical scenarios for implementing the Activator item.
Conclusion
Real-Time Intelligence, something that began as a part of the “Synapse experience” in Microsoft Fabric, is now a separate, dedicated workload. That tells us a lot about Microsoft’s vision and roadmap for Real-Time Intelligence!
Don’t forget: initially, Real-Time Analytics was included under the Synapse umbrella, together with the Data Engineering, Data Warehousing, and Data Science experiences. However, Microsoft concluded that handling streaming data deserves a dedicated workload in Microsoft Fabric, which absolutely makes sense considering the growing need to deal with this data and provide insight from it as soon as it is captured. In that sense, Microsoft Fabric provides a complete suite of powerful services as the next generation of tools for processing, analyzing, and acting on data as it’s generated.
I’m quite confident that the Real-Time Intelligence workload will become increasingly significant in the future, considering the evolution of data sources and the ever-increasing pace of data generation.
Thanks for reading!