Among the many announcements from the recent FabCon Europe in Vienna, one that may have gone under the radar concerns the enhancements in performance and cost optimization for Dataflows Gen2.
Before we dive into how these enhancements impact your current Dataflows setup, let's take a step back and provide a brief overview of Dataflows. For those of you who are new to Microsoft Fabric: a Dataflow Gen2 is the no-code/low-code Fabric item used to extract, transform, and load the data (ETL).
A Dataflow Gen2 provides numerous advantages:
- Leverage 100+ built-in connectors to extract data from a myriad of data sources
- Leverage the familiar Power Query GUI to apply dozens of transformations to the data without writing a single line of code, a huge benefit for many citizen developers
- Store the output of the data transformation as a Delta table in OneLake, so that the transformed data can be used downstream by various Fabric engines (Spark, T-SQL, Power BI…)
However, simplicity often comes with a price. In the case of Dataflows, the cost was significantly higher CU consumption compared to code-first solutions, such as Fabric notebooks and/or T-SQL scripts. This was already well explained and examined in two great blog posts written by my fellow MVPs, Gilbert Quevauvilliers (Fourmoo): Comparing Dataflow Gen2 vs Notebook on Costs and value, and Stepan Resl: Copy Activity, Dataflows Gen2, and Notebooks vs. SharePoint Lists, so I won't waste time discussing the past. Instead, let's focus on what the present (and future) brings for Dataflows!
Changes to the pricing model
Let's briefly examine what's displayed in the illustration above. Previously, every second of a Dataflow Gen2 run was billed at 16 CUs. A CU stands for Capacity Unit and represents a bundled set of resources (CPU, memory, and I/O) used together to perform a specific operation. Depending on the Fabric capacity size, you get a certain number of capacity units: F2 provides 2 CUs, F4 provides 4 CUs, and so on.
Going back to our Dataflows scenario, let's break this down by using a real-life example. Say you have a Dataflow that runs for 20 minutes (1,200 seconds)…
- Previously, this Dataflow run would have cost you 19,200 CUs: 1,200 seconds * 16 CUs
- Now, this Dataflow run will cost you 8,100 CUs: 600 seconds (first 10 minutes) * 12 CUs + 600 seconds (after the first 10 minutes) * 1.5 CUs
The longer your Dataflow needs to execute, the larger the potential savings in CUs.
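To make the arithmetic easy to reuse, here is a small Python sketch of the two billing formulas described above (my own illustration of the calculation, not an official pricing API):

```python
def old_model_cost(duration_s: float) -> float:
    """Previous pricing model: every second of the run is billed at 16 CUs."""
    return duration_s * 16

def new_model_cost(duration_s: float) -> float:
    """New pricing model: 12 CUs/second for the first 10 minutes, 1.5 CUs/second afterwards."""
    first_10_minutes = min(duration_s, 600)
    remainder = max(duration_s - 600, 0)
    return first_10_minutes * 12 + remainder * 1.5

# The 20-minute (1,200-second) example from above:
print(old_model_cost(1200))  # 19200 CUs
print(new_model_cost(1200))  # 8100.0 CUs
```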
This is amazing on its own, but there is still more to it. I mean, it's nice to be charged less for the same amount of work, but what if we could turn those 1,200 seconds into, let's say, 800 seconds? That wouldn't save us just CUs, but also reduce the time-to-analysis, since the data would have been processed faster. And that's exactly what the next two enhancements are all about…
Modern Evaluator
The new preview feature, named Modern Evaluator, enables a new query execution engine (running on .NET Core version 8) for running Dataflows. As per the official Microsoft docs, Dataflows running on the modern evaluator can provide the following advantages:
- Faster Dataflow execution
- More efficient processing
- Scalability and reliability

The illustration above shows the performance differences between various Dataflow "flavors". Don't worry, we will challenge these numbers soon in a demo, and I'll also show you how to enable these new enhancements in your Fabric workloads.
Partitioned Compute
Previously, the Dataflow logic was executed in sequence. Hence, depending on the logic's complexity, it could take a while for certain operations to finish, so other operations in the Dataflow had to wait in the queue. With the Partitioned Compute feature, a Dataflow can now execute parts of the transformation logic in parallel, thus reducing the overall time to complete.
At this moment, there are certain limitations on when partitioned compute will kick in. Namely, only the ADLS Gen2, Fabric Lakehouse, Folder, and Azure Blob Storage connectors can leverage this new feature. Again, we'll explore how it works later in this article.
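Conceptually (and only conceptually, this is not how Fabric implements the feature), the benefit is the same one you get from processing independent partitions of the source data concurrently rather than one after another; here is a rough Python sketch with hypothetical file names:

```python
from concurrent.futures import ThreadPoolExecutor
import csv

def process_partition(path: str) -> int:
    """Hypothetical per-partition work: count the data rows in one CSV file."""
    with open(path, newline="") as f:
        return sum(1 for _ in csv.reader(f)) - 1  # exclude the header row

# Hypothetical partition files, e.g. one per source CSV.
partitions = [f"orders_{i:02d}.csv" for i in range(50)]

# Sequential execution: each partition waits for the previous one to finish.
# total_rows = sum(process_partition(p) for p in partitions)

# Partitioned execution: independent partitions are processed concurrently.
with ThreadPoolExecutor(max_workers=8) as pool:
    total_rows = sum(pool.map(process_partition, partitions))

print(f"Total rows across all partitions: {total_rows:,}")
```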
3, 2, 1…Action!
Okay, it's time to challenge the numbers provided by Microsoft and check if (and to what degree) there is a performance gain across the various Dataflow types.
Here is our scenario: I've generated 50 CSV files that contain dummy data about orders. Each file contains roughly 575,000 records, so there are ca. 29 million records in total (roughly 2.5 GB of data). All of the files are already stored in a SharePoint folder, allowing for a fair comparison, as Dataflow Gen1 doesn't support a OneLake lakehouse as a data source.
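For anyone who wants to build a comparable test set, a minimal Python sketch along these lines would do; the column names, value ranges, and output file names below are my own assumptions, not the exact schema used in this test:

```python
import csv
import random
from datetime import date, timedelta

NUM_FILES = 50
ROWS_PER_FILE = 575_000  # roughly matches the files used in this test

for file_idx in range(NUM_FILES):
    with open(f"orders_{file_idx:02d}.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["OrderID", "OrderDate", "CustomerID", "ProductID", "Quantity", "Amount"])
        for row_idx in range(ROWS_PER_FILE):
            writer.writerow([
                file_idx * ROWS_PER_FILE + row_idx,                         # unique order ID
                date(2024, 1, 1) + timedelta(days=random.randint(0, 364)),  # random order date
                random.randint(1, 50_000),                                  # customer
                random.randint(1, 1_000),                                   # product
                random.randint(1, 10),                                      # quantity
                round(random.uniform(5, 500), 2),                           # amount
            ])
```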

I plan to run two series of tests: the first one includes Dataflow Gen1 in the comparison. In this scenario, I won't be writing the data into OneLake using Dataflows Gen2 (yeah, I know, it defeats the purpose of Dataflow Gen2), as I want to compare "apples to apples" and exclude the time needed for writing data into OneLake. I'll test the following four scenarios, in which I perform some basic operations to combine and load the data, applying some basic transformations (renaming columns, etc.):
- Use Dataflow Gen1 (the old Power BI dataflow)
- Use Dataflow Gen2 without any additional optimization enhancements
- Use Dataflow Gen2 with only the Modern evaluator enabled
- Use Dataflow Gen2 with both the Modern evaluator and Partitioned compute enabled
In the second series, I'll compare the three flavors of Dataflow Gen2 only (points 2-4 from the list above), with writing the data to a lakehouse enabled.
Let’s start!
Dataflow Gen1
The entire transformation process in the old Dataflow Gen1 is fairly basic: I simply combined all 50 files into a single query, split columns by delimiter, and renamed columns. So, nothing really advanced happens here:

The same set of operations/transformations has been applied to all three Dataflows Gen2.
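For readers who want a feel for what these steps amount to outside of Fabric, here is a rough pandas equivalent of the combine / split / rename logic (a local sanity check only, not the M code the Dataflows actually run; the folder path and column names are hypothetical):

```python
from pathlib import Path
import pandas as pd

# 1) Combine: read every source file in the (hypothetical) "orders" folder as raw text lines.
lines = []
for path in sorted(Path("orders").glob("*.csv")):
    lines.extend(path.read_text().splitlines()[1:])  # skip each file's header row

combined = pd.DataFrame({"Line": lines})

# 2) Split the single text column by the delimiter (a comma in this case).
split = combined["Line"].str.split(",", expand=True)

# 3) Rename the resulting columns to friendly names (hypothetical names).
split.columns = ["OrderID", "OrderDate", "CustomerID", "ProductID", "Quantity", "Amount"]

print(split.head())
print(f"{len(split):,} rows combined")
```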
Please keep in mind that with Dataflow Gen1 it's not possible to output the data as a Delta table in OneLake. All transformations are persisted within the Dataflow itself, so when you need this data, for example, in a semantic model, you need to account for the time and resources required to load/refresh the data in an import mode semantic model. But, more on that later.
Dataflow Gen2 without enhancements
Let's now do the same thing, but this time using the new Dataflow Gen2. In this first scenario, I haven't applied any of the new performance optimization features.

Dataflow Gen2 with Modern Evaluator
Okay, the moment of truth: let's now enable the Modern Evaluator for the Dataflow Gen2. I'll go to the Options, and then under the Scale tab, check the "Allow use of the modern query evaluation engine" box:

Everything else stays exactly the same as in the previous case.
Dataflow Gen2 with Modern Evaluator and Partitioned Compute
In the final example, I'll enable both new optimization features in the Options of the Dataflow Gen2:

Now, let's proceed to testing and analyzing the results. I'll execute all four dataflows in sequence from a Fabric pipeline, so we can ensure that each of them runs in isolation from the others.

And here are the results:

Partitioning obviously didn't count for much in this particular scenario, and I'll investigate how partitioning works in more detail in one of the following articles, with different scenarios in place. Dataflow Gen2 with the Modern Evaluator enabled outperformed all the others by far, achieving 30% time savings compared to the old Dataflow Gen1 and ca. 20% time savings compared to the regular Dataflow Gen2 without any optimizations! Don't forget, these time savings also translate into CU savings, so the final estimated CU cost for each of the used solutions is the following:
- Dataflow Gen1: 550 seconds * 12 CUs = 6,600 CUs
- Dataflow Gen2 with no optimization: 520 seconds * 12 CUs = 6,240 CUs
- Dataflow Gen2 with Modern Evaluator: 368 seconds * 12 CUs = 4,416 CUs
- Dataflow Gen2 with Modern Evaluator and Partitioning: 474 seconds * 12 CUs = 5,688 CUs
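For completeness, plugging the measured durations into the flat 12 CU/s rate (all four runs stayed under the 10-minute threshold, so that rate applies for the full duration) reproduces the figures above:

```python
durations_s = {
    "Dataflow Gen1": 550,
    "Dataflow Gen2 (no optimization)": 520,
    "Dataflow Gen2 (Modern Evaluator)": 368,
    "Dataflow Gen2 (Modern Evaluator + Partitioning)": 474,
}
for name, seconds in durations_s.items():
    print(f"{name}: {seconds * 12:,} CUs")  # e.g. 550 * 12 = 6,600 CUs
```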
However, I wanted to double-check and make sure that my calculation is accurate. Hence, I opened the Capacity Metrics App and took a look at the metrics captured by the App:

Although the overall result accurately reflects the numbers displayed in the pipeline execution log, the exact number of used CUs in the App is different:
- Dataflow Gen1: 7,788 CUs
- Dataflow Gen2 with no optimization: 5,684 CUs
- Dataflow Gen2 with Modern Evaluator: 3,565 CUs
- Dataflow Gen2 with Modern Evaluator and Partitioning: 4,732 CUs
So, based on the Capacity Metrics App, a Dataflow Gen2 with the Modern Evaluator enabled consumed less than 50% of the capacity compared to the Dataflow Gen1 in this particular scenario! I plan to create more test use cases in the coming days/weeks and provide a more comprehensive series of tests and comparisons, which will also include the time to write the data into OneLake (using Dataflows Gen2) versus the time needed to refresh an import mode semantic model that uses the old Dataflow Gen1.
Conclusion
Compared to other (code-first) options, Dataflows were (rightly?) considered "the slowest and least performant option" for ingesting data into Power BI/Microsoft Fabric. However, things are changing rapidly in the Fabric world, and I love how the Fabric Data Integration team keeps making constant improvements to the product. Honestly, I'm curious to see how Dataflows Gen2's performance and cost develop over time, so that we can consider leveraging Dataflows not just for low-code/no-code data ingestion and transformation requirements, but also as a viable alternative to code-first solutions from a cost/performance standpoint.
Thanks for reading!
Disclaimer: I don't have any affiliation with Microsoft (except being a Microsoft Data Platform MVP), and I haven't been approached/sponsored by Microsoft to write this article.
