We use filters all the time when developing DAX expressions, such as DAX measures, or when writing DAX queries.
But what exactly happens when we apply filters?
This piece is about exactly this question.
I'll start with simple queries and add variants to explore what happens under the hood.
I use DAX Studio and the option to show server timings for each query.
If you want to learn more about this feature and how to interpret the results, read the first article in the References section at the end of this piece.
Let's start with the base query:
EVALUATE
CALCULATETABLE(
SUMMARIZECOLUMNS('Product'[BrandName]
,"Online Sales", [Sum Online Sales]
)
)
When we activate server timings and execute the query, we get the execution statistics and the Storage Engine (SE) query or queries needed to get the data:

As you can see, we need only one Storage Engine (SE) query to retrieve the results.
The query completes in just 47 ms and is served almost entirely by the SE (95.7%).
The more time the SE can spend on a query, the better, since it is the component that retrieves data from the data stores and tables.
Furthermore, the SE can use multiple CPU cores, whereas the Formula Engine (FE) can use only one. We cannot examine exactly what happens in the FE as easily as we can with SE queries.
You can learn more about the difference between these two engines in the article mentioned above.
A brief note:
A few months ago, I wrote an article here with a very similar title. But while that one was only about date filters with Time Intelligence functions, this one goes one step deeper into the rabbit hole.
This one is much more generic than that one.
If you missed it, I added the article link and additional resources on the current topic to the References section below.
Add simple filters
Next, we add a simple filter for the product color red to the query:
EVALUATE
CALCULATETABLE(
SUMMARIZECOLUMNS('Product'[BrandName]
,"Online Sales", [Sum Online Sales]
)
,'Product'[ColorName] = "Red"
)
Here is the query and the results restricted to the product color red:

When we look at the query statistics, we see this:

As you can see, the whole query is executed in a single SE query.
The filter is in the query's WHERE clause. Therefore, only the restricted data is retrieved.
This is visible in the "Rows" column, as only 14 rows are returned from this query.
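To make this concrete, here is a simplified, hypothetical sketch of what the xmSQL of this SE query roughly looks like (the join column and the sales expression are my assumptions, based on the measure definition shown later in this piece; the real xmSQL shown in DAX Studio is more verbose):

```sql
-- Illustrative xmSQL sketch, not copied from DAX Studio:
WITH
    $Expr0 := ( 'Online Sales'[UnitPrice] * 'Online Sales'[SalesQuantity] )
        - 'Online Sales'[DiscountAmount]
SELECT
    'Product'[BrandName],
    SUM ( @$Expr0 )
FROM 'Online Sales'
    LEFT OUTER JOIN 'Product'
        ON 'Online Sales'[ProductKey] = 'Product'[ProductKey]
WHERE
    'Product'[ColorName] = 'Red';
```

The point is that the color filter lands directly in the WHERE clause, so the SE scans only the matching rows.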
But what happens when we use the FILTER() function to filter the products:
EVALUATE
CALCULATETABLE(
SUMMARIZECOLUMNS('Product'[BrandName]
,"Online Sales", [Sum Online Sales]
)
,FILTER('Product'
,'Product'[ColorName] = "Red")
)
As you might know, using the FILTER() function is not advisable due to how it works.
You can learn more about this topic in the second article linked in the References section below.
The result doesn't change:

But how does it affect the execution plan and the SE queries?

As you can see, in this case, the engine optimizes the query, yielding the same execution plan as before.
But as we change our code, we will see that using FILTER() isn't always a good idea.
Add multiple filters
Now, what happens when we add multiple filters to a query?
EVALUATE
CALCULATETABLE(
SUMMARIZECOLUMNS('Product'[BrandName]
,"Online Sales", [Sum Online Sales]
)
,'Product'[ColorName] = "Red"
,'Geography'[ContinentName] = "Europe"
)
While the result just isn’t that interesting to us, let’s take a look at the query statistics:

Again, the query can be served by a single SE query that contains both filters.
The query executes so quickly that the FE time percentage is comparatively high, yet it still only takes 6 ms.
When changing the query to use the FILTER() function, the SE query doesn't change either:

This shows that, with this kind of query, the engine can optimize execution to find the most efficient way to satisfy the DAX query.
Anyway, the result doesn't change. It is identical in both cases, correctly so, because we don't change the filter per se. But please be patient with me; I'll get back to the FILTER() function and why it's important to understand its effects in a moment.
Moving filters into measures
Next, let’s see what happens when the filter is moved into the measure.
Until now, the query was built so that the measure [Sum Online Sales] received its filter from outside.
Let's try this:
DEFINE
MEASURE 'All Measures'[Online Sales A. Datum] =
CALCULATE(
SUMX('Online Sales', ( 'Online Sales'[UnitPrice] * 'Online Sales'[SalesQuantity]) - 'Online Sales'[DiscountAmount] )
,'Product'[BrandName] = "A. Datum"
)
EVALUATE
CALCULATETABLE(
SUMMARIZECOLUMNS('Product'[BrandName]
,"Online Sales A. Datum", [Online Sales A. Datum]
)
)
As you can see, the filter is applied inside the measure [Online Sales A. Datum].
Of course, the resulting number is the same in each row of the result, because the Brand is set to "A. Datum":

But the execution is slightly different:

This time, we’ve got two SE queries.
- The query to get the sales for the Brand “A. Datum”. This question accommodates the filter for that brand.
- The second query is used to get the list for all brands within the result set.
The primary query is most vital to us, since it still shows the filter for the brand set inside the measure.
This question will be fully served by the SE with a straightforward filter in a really efficient way.
But, typically, we wish so as to add multiple measures to a question (or a visible in a report).
What happens after we add the [Sum Online Sales] measure to the query?
The result just isn’t particularly essential, because it shows one column with sales for every brand and one other with sales for the filtered brand.
However the query statistics are interesting:

As you can see in the red-marked line in the SE query, the Brand filter is no longer present.
Because the engine recognizes that the filter in the measure is applied to the same column as the one in the query, it moves the filter to the FE and returns the result.
Now, what happens when we filter another column in the measure, for example, the color:
DEFINE
MEASURE 'All Measures'[Online Sales Red] =
CALCULATE(
SUMX('Online Sales', ( 'Online Sales'[UnitPrice] * 'Online Sales'[SalesQuantity]) - 'Online Sales'[DiscountAmount] )
,'Product'[ColorName] = "Red"
)
EVALUATE
CALCULATETABLE(
SUMMARIZECOLUMNS('Product'[BrandName]
,"Online Sales", [Sum Online Sales]
,"Online Sales Red", [Online Sales Red]
)
)
Again, the result just isn’t particularly interesting. We’re excited about the query statistics:

As you can see, this time we have two queries grouped by BrandName: one without and one with the filter for the color.
Both queries return the same number of rows (14), one for each Brand.
The FE handles combining the two results into a single table.
The whole query is still served mainly by the SE, which is excellent.
But now, let's add the FILTER() function to the filter.
For this example, I change the measure to filter for two values with the IN operator:
,'Product'[BrandName] IN { "A. Datum", "Adventure Works" }
In this variant, the SE query is similar to the ones before.
The filter is passed directly into the query's WHERE clause.
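For completeness, the full query for this variant could look like this (the measure name [Online Sales Brands] is hypothetical; the measure body is the same as in the earlier examples, only the filter argument changes):

```dax
DEFINE
MEASURE 'All Measures'[Online Sales Brands] =
CALCULATE(
SUMX('Online Sales', ( 'Online Sales'[UnitPrice] * 'Online Sales'[SalesQuantity]) - 'Online Sales'[DiscountAmount] )
,'Product'[BrandName] IN { "A. Datum", "Adventure Works" }
)
EVALUATE
CALCULATETABLE(
SUMMARIZECOLUMNS('Product'[BrandName]
,"Online Sales", [Sum Online Sales]
,"Online Sales Brands", [Online Sales Brands]
)
)
```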
But what happens when I change it to this:
,FILTER('Product'
,'Product'[BrandName] IN { "A. Datum", "Adventure Works" }
)
First of all, the result changes:

The reason is that FILTER() works completely differently.
It retains the existing filter context and adds a new one on top.
I explained this behavior in another article, which I added as the second link in the References section below.
Furthermore, the SE cannot handle this in a single query anymore:

The first two queries retrieve the values for the brand to filter (see the queries marked in pink).
Notice the large number of rows (324 and 2'560) returned by the first two queries. This is the materialization of intermediate results needed to perform the calculation.
The third query uses these intermediate results to filter the data (marked in red).
The result of the third query is just two rows: the two rows we see in the overall result.
As described in my other article, FILTER() should be used with care.
Not only is it considerably slower, but it also works completely differently from a simple filter.
Anyway, I can restore the previous behavior by adding an ALL() inside the FILTER() call:
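One possible form of this fix, sketched here as an assumption since the screenshot does not show the code: ALL('Product') removes the existing filter context on the table before the predicate is applied, which restores the behavior of the simple filter:

```dax
,FILTER( ALL( 'Product' )
,'Product'[BrandName] IN { "A. Datum", "Adventure Works" }
)
```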

I don't want to hide the fact that this example is special, because the applied filter affects the same column as the one used in the query.
When changing the query to filter the country instead, the engine can optimize the execution and use the simple form again:

As you can see, the engine optimizes the execution of the query and falls back to a simple filter when filtering columns that differ from those used in the DAX query. In the blue inset, you see the results.
I see this kind of filtering quite often in DAX measures written by less proficient developers.
Using the FILTER() function looks intuitive, but it can yield incorrect or confusing results and is slower than a simple filter. I strongly recommend reading my article linked below about this function, as well as the dax.guide documentation and the articles linked on SQLBI.com.
Moreover, I have to type much more than when using a simple filter.
As a lazy guy, that is an important reason not to use FILTER() when it's unnecessary.
Add a complex filter
Finally, I want to show what happens when applying a filter using a DAX function, such as CONTAINSSTRING().
EVALUATE
CALCULATETABLE(
SUMMARIZECOLUMNS('Product'[BrandName]
,"Online Sales", [Sum Online Sales]
)
,CONTAINSSTRING('Online Sales'[SalesOrderNumber], "202402252C")
)
Such a query is executed when you use a slicer in your report to filter for a specific order and retrieve the brands of the purchased products.
As the result is not important at this point, let's look directly at the query statistics:

The query took more than 6 seconds to complete, and 99.6% of that time was spent by the FE executing the CONTAINSSTRING() function to find matching rows in the data. This operation is very CPU-intensive, as the FE can use only one core. When I execute this query on my laptop, it takes more than 2 seconds longer.
I deliberately chose a slow function to demonstrate its effects.
But the SE was still able to serve the request with a single SE query. However, the positive effect of this fact is negligible in this case.
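For comparison, if an exact match on the order number is all that is needed (which is not always the case, as CONTAINSSTRING() performs a substring search), a simple filter keeps the work in the SE's WHERE clause. This rewrite is a hypothetical alternative, not an equivalent query:

```dax
EVALUATE
CALCULATETABLE(
SUMMARIZECOLUMNS('Product'[BrandName]
,"Online Sales", [Sum Online Sales]
)
,'Online Sales'[SalesOrderNumber] = "202402252C"
)
```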
Conclusion
While it just isn’t my intention to present you advice on what to do and what to not do, I wanted to point out you the implications of the various ways to jot down DAX code and apply filters in your measures or queries.
The DAX engine(s) are very efficient in optimizing the queries, but they’ve limitations.
Subsequently, we must all the time take care when writing our DAX code.
If the performance is poor or the code written by another person looks strange, we must always analyze it to find out easy methods to improve it.
I wanted to point out you easy methods to do it and what to search for when analyzing your DAX code.
Remember:
- The Storage Engine (SE) can use multiple CPU cores.
- The more work is done by the SE, the better.
- The SE can execute only simple aggregations and basic math operations (like +, -, ×, and /).
- Try to reduce the workload on the Formula Engine (FE).
- The FE can use only one CPU core.
- Try to reduce the materialization of data (the Rows column in the query statistics).
- Try to reduce the number of SE queries.
I know that requirements will sometimes force us to write DAX code that is not optimal.
Even worse, report designers might add logic to the report that causes poor performance.
In such cases, remove that logic and check the response time again. It might be worth exploring creating a dedicated measure for such cases. Remember that it is possible to create local measures in a report that is connected to a semantic model via a live connection.
But most importantly: take your time when writing DAX code. You will save time by avoiding the need to optimize DAX code that was written in a rush. I speak from experience. It is a very bad feeling.
I hope you learned something new.
References
To learn the details about how to interpret the results of the Server Timings in DAX Studio, read this piece:
Are you interested in how to use the FILTER() function correctly? Read this:
Another DAX function that can harm performance is KEEPFILTERS(). To learn more about the KEEPFILTERS() function, read this piece:
Here is the previously mentioned piece about date filters:
An interesting blog post by Data Mozart about the Storage Engine:
As in my previous articles, I use the Contoso sample dataset. You can download the ContosoRetailDW dataset for free from Microsoft here.
The Contoso data can be used freely under the MIT License, as described in this document. I modified the dataset to shift the data to contemporary dates.
