Skip to content

Tag: Kusto

Using Power BI Incremental Refresh with Azure Data Explorer

Azure Data Explorer (ADX) is a great platform for storing large amounts of transactional data. The Incremental Refresh feature (now available for Pro users!) in Power BI makes it much faster to keep data models based on that data current. Unfortunately, if you follow the standard guidance from Microsoft for configuring Incremental Refresh, you’ll quickly bump into a roadblock. Luckily, it’s not that difficult to get around.

Incremental Refresh works by setting up data partitions in the dataset in the service. These partitions are based on time slices. Once data has been loaded into the dataset, only the data in the most recent partition is refreshed.

To set this up in Power BI Desktop, you need to configure two parameters, RangeStart, and RangeEnd. These two parameters must be set as Date/Time parameters. Once set, the parameters are used to filter the Date/Time columns in your tables accordingly, and once published to the service, to define the partitions to load the data into.

When Power Query connects to ADX, all Date/Time fields come in as the Date/Time/Timezone type. This is a bit of a problem. When you use the column filters to filter your dates, the two range parameters won’t show up because they are of a different type (Date/Time). Well, that’s not a big problem, right? Power Query lets us change the data column type simply by selecting the type picker on the column header.

Indeed, doing this does in fact allow you to use your range parameters in the column filters. Unfortunately, data type conversions don’t get folded back to the source ADX query. You can see this by right-clicking on a subsequent step in the Power Query editor. The “View Native Query” option is greyed out, which indicates that the query cannot be folded.

Query folding is critical to incremental refresh. Without it, the entirety of the data is brought locally so that it can be filtered vs having the filter occur at the data source. This would completely defeat the purpose of implementing Incremental Refresh in the first place.

The good news is that you can in fact filter a Date/Time/Timezone column with a Date/Time parameter, but the Power Query user interface doesn’t know that. The solution is to simply remove the type conversion Power Query step AFTER performing the filter in the Power Query UI.

Alternatively, if you’re comfortable with the M language, you can simply insert something like the following line using the Advanced Editor in Power Query (where CreatedLocal is the name of the column being filtered).

#"Filtered Rows" = Table.SelectRows(Source, each [CreatedLocal] >= RangeStart and [CreatedLocal] < RangeEnd),

If the filtration step can be folded back into the source, Incremental Refresh should work properly. You can continue setting up Incremental Refresh using the DAX editor. You will likely see some warning messages indicating that folding can’t be detected, but these can safely be ignored.

Leave a Comment

Connect to Application Insights and Log Analytics with Direct Query in Power BI

Application Insights (AI) and Log Analytics (LA) from Microsoft Azure provide easy and inexpensive ways to instrument applications. Using just an instrumentation key, any application can send operational data to AI which can then provide a rich array of tools to monitor the operation of the application. In fact, the blog that you are reading uses an Application Insights plugin for WordPress that registers each view of a page into an instance of AI in my Azure tenant.

Application Insights data can be queried directly in the Azure portal to provide rich insights. In addition, the data can be exported to Excel for further analysis, or, it can be queried using Power Query in either Excel or Power BI. The procedure for using Power Query can be found in this article. The approach for doing so, uses the Web connector in Power Query, which can be automatically refreshed on a regular basis. The Web connector does not however support Direct Query, so the latency of the data in this scenario will be limited by the refresh schedule configured in Power BI. Any features that depend on Direct Query (Aggregations, Automatic Page Refresh) will also not work.

If you’ve worked with AI or LA, and dropped down to the Query editor, you’ve been exposed to KQL – The Kusto Query Language. This is the language that is used by Azure Data Explorer (ADX), or as its code name, “Kusto”. This is of course not a coincidence, as the Kusto engine powers both AI and LA.

Power BI contains a native connector for ADX, and if you can configure an ADX cluster for yourself, populate it, and work with it in Power BI for both imported and Direct Query datasets. Given that ADX is what powers AI and LA, it should be possible to use this connector to query the data for AI and LA. It turns out that the introduction of a new feature known as the ADX proxy will allow us to do just that.

The ADX proxy is designed to allow the ADX user interface to connect to instances of AI and LA and run queries from the same screens as native ADX clusters. The entire process is described in the document Query data in Azure Monitor using Azure Data Explorer. The document explains the process, but what we are particularly interested in is the syntax used to express an AI or LA instance as an ADX cluster. Multiple variations are described in the document, but the ones that we are most interested in are here:

For LA: https://ade.loganalytics.io/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>/providers/microsoft.operationalinsights/workspaces/<workspace-name>

For AI: https://ade.applicationinsights.io/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>/providers/microsoft.insights/components/<ai-app-name>

By substituting in your subscription ID, resource group name, and resource name, you can treat these resources as if they were ADX clusters, and query them in Power BI using Direct Query. As an example, a simple query on this blog can be formed using the ADX connector:

And the result will appear as:

The precise query is provided in the query section of the connector above.

Once the report is built, it can be deployed to the PowerBI service, and refreshed using AAD credentials.

It is important to note that this method does NOT require you to configure an ADX cluster of your own. We are simply utilizing the cluster provided to all instances of AI and LA. We therefore do not have any control over performance levels, as we would have in a full ADX cluster. However, if the performance is adequate,(and the queries designed appropriately), this can be a good approach to work with AI and LA data that has low latency (near real time) requirements.

25 Comments