A low cost approach for paginated report subscriptions in Power BI

Paginated reports in Power BI offer a rich set of capabilities for printing, and the generation of report documents. Paginated reports can be exported to multiple document formats, and those exports can be scheduled and delivered via email. Unfortunately, for the moment at least, paginated reports require the use of a dedicated capacity, which can be cost prohibitive for some. This article describes a pattern that will help minimize the cost of using paginated reports for subscriptions.

Dedicated capacities

The term “Premium” is often used to describe the resource (as opposed to user) based licensing option in Power BI. More accurately, this option is called “dedicated capacity”, and there are a number of different ways to use dedicated capacity. Premium is certainly one of those ways, but the Azure (A) SKU and the EMbedding (EM) SKU are two others.

The different SKUs have different features and capabilities, but for the purposes of this article, they are functionally equivalent. For a detailed discussion of the different dedicated capacity SKUs, see my older post Understanding the Power BI capacity based SKUs.

The “A” SKU is of particular interest, because unlike the other two SKU types, it is billed in hourly increments as opposed to monthly. If the capacity can be started, perform a task, and then stopped, then you will only be billed for the time used to perform the task in question. Matt Allington has a post that outlines this concept in great detail: Affordable Power BI Premium for Small Business .

The process

In our scenario, we have a paginated report that uses a published Power BI dataset as a data source. as of this writing, there is no API call available to render a paginated report on demand, so we will rely on the scheduled subscription capability. in order to minimize the cost of the solution, we want the dedicated capacity to run as little as possible.

The solution will consist of an Azure logic app, and Power BI paginated report scheduling. An Azure logic app uses the same set of actions that a Flow in Power automate does, but is a little more flexible in its permissions model.

The overall process flow is as follows:

  1. The Logic App runs on a schedule, and starts the dedicated capacity by calling the Power BI API from an HTTP action
  2. The Logic app uses the “Refresh a dataset” Power BI action to initiate a dataset refresh. Note that this is only necessary if the report’s data souce is a Power BI dataset.
  3. On a schedule that allows for the dataset above to be refreshed, the report schedule runs and delivers the report to the destination addresses.
  4. Another Logic App runs on a schedule that allows for both the refresh of the data in step 2, and the report to be rendered in step 3, that pauses the capacity. Ideally this should be less than a hour after the capacity was started in step 1, given the hourly billing increments.

Optionally, further automation can run on the destination inbox to deliver the report to alternate locations, such as a SharePoint library, etc.

Create the capacity

Prior to doing any of this, you will need to create a Power BI dedicated capacity in Azure. In Azure, the service is called “Power BI Embedded” and detailed instructions for creating it can be found in this document.

Although there are 6 possible sizes to choose from, paginated reports in the Power BI service require an A4 capacity at a minimum. It is therefore important to select the “Change size” link and choose a size that is A4 or greater.

Once created, any relevant workspaces can be assigned to this capacity. It should be noted that the capacity must be created in the same tenant as the Power BI service itself.

Starting and Refreshing

Creating an Azure Logic app is relatively straightforward. The steps involved in doing so can be found in this introductory document. Once created, we will add a schedule trigger, an HTTP action, and a “Refresh a dataset” action. The complete Logic app is below.

Logic app to start the capacity and refresh a dataset

The recurrence step will kick off the run on a scheduled basis. You can set the recurrence to a number of time periods, and the run time will be based on that start date/time. In the example above, the process runs every day at 2:50 AM

The HTTP step calls into Azure to start the capacity. The request is a POST request, and all that it requires is a single URL.You must provide 3 configuration values for it to work. The URL itself is:

https://management.azure.com/subscriptions/{SubscriptionGUID}/resourceGroups/{ResourceGroup}/providers/Microsoft.PowerBIDedicated/capacities/{CapacityName}/resume?api-version=2017-10-01

Where:

  • {SubscriptionGUID} is the GUID of the Azure subscription for the capacity
  • {ResourceGroup} is the name of the Resource Group for the capacity
  • {CapacityName} is the name of the capacity

The other important item here is the Authentication setting. There are a number of options here, and your choice will match your requirements. For this example, we are using the relatively new “Managed Identity” in Azure. Managed identity allows one service (the Power BI capacity) to grant direct access to another (our Logic app). In this way, the Logic app is granted permission to start and stop the capacity.

More information on managed identities can be found in this Microsoft document.

The final action in this app is the “Refresh a dataset” action. Once added, you provide your credentials, select a workspace, and the the dataset to be refreshed. The logic app waits for the HTTP action to finish before it tries to refresh the dataset, so you do not need to worry about adding pauses.

Subscription

The subscription for the paginated report needs to be set to run some time after the dataset above has refreshed if using a dataset as a data source, or after the capacity has started if not. There is nothing unique about subsctriptions for this process, but you need to be aware of the timing. The subscription must run and complete in the time between the starting and refresh of any datasets, and when the companion logic apps fires to pause the capacity.

As of this writing, subscriptions can only be delivered to mailboxes within the same tenant as the Power BI service itself.

To create a subscription, select the “Subscribe” button in the upper right of the report toolbar.

Next, complete the subscription options. Note that in this example, the subscription is set to run at 3:15 AM, which is 25 minutes after the start capacity Logic app fires, allowing for plenty of time to start the capacity, and to refresh the dataset that this report connects to.

Pausing the capacity

A second Logic app is required to pause the capacity. It needs to be scheduled to allow for the subscription to be sent, ideally within an hour of the capacity starting so that only one hour of usage is billed.

The Logic app to pause the capacity is very similar to the one to start it, with a different URL being called.

In our example, the Recurrence trigger is set to run daily at 3:45 AM, which is 55 minutes after the start app.

The HTTP action settings are identical to those in the startup logic app, with the exception that the suspend method is called instead of the resume method. The full URL is below.

https://management.azure.com/subscriptions/{SubscriptionGUID}/resourceGroups/{ResourceGroup}/providers/Microsoft.PowerBIDedicated/capacities/{CapacityName}/resume?api-version=2017-10-01

Where:

  • {SubscriptionGUID} is the GUID of the Azure subscription for the capacity
  • {ResourceGroup} is the name of the Resource Group for the capacity
  • {CapacityName} is the name of the capacity

Summary

While paginated reports are currently a “Premium” offering, it is possible to use automation techniques, along with the Power BI Embedded Azure service to only run the service for an hour a day, turning a cost of hundreds per day into several dollars per day.

By taking advantage of the report rendering capabilities that paginated reports affords, and by building the reports on top of pre-existing Power BI dataset, paginated reports can become a print engine for the analytical reports.

Keep your Power BI SharePoint reports current with Microsoft Flow

Power BI is without question the best way to report on data in SharePoint lists. The query tools available in Power Query make working with SharePoint data relatively painless, an the cached dataset means that reports are run against an optimized copy of the list data, not the data itself.

This latter distinction, while removing the performance issues of systems that query lists directly, also introduces problems with data latency. The report will never be fully “up to date”, as it needs to be refreshed on a periodic basis.

Consider the following scenario. A Power BI report has been built that uses data from a SharePoint list. That report has been embedded on a SharePoint page in the same site. A user adds an item to the list, and then navigates to the page to see the updated report. Unfortunately, that report won’t get updated until the next scheduled refresh.

This has been a significant problem, until the recent release of the new “Refresh a dataset” action in Microsoft Flow.

It is a relatively simple procedure to add a simple 1 step flow to any SharePoint list that is triggered when an item is created, updated, or deleted. This flow simply needs to add the “Refresh a dataset” action, that is configured for the relevant dataset, and these embedded reports will be updated very shortly after the data is modified.

Alternatively, the flow can be triggered by a timer, allowing you to create your own schedule (every 5 minutes, etc) that is not hardwired to run at the top or bottom of any given hour.

A few caveats should be kept in mind when using this action however.

While this action gives us much finer grained control over when refreshes happen, all of the current license restrictions remain in place. For datasets located in the shared capacity, only 8 refreshes per day are allowed.

For datasets in dedicated capacities (Premium), there are no limits to the number of refreshes. The limit of 48 per day is a UI restriction, not a licensing restriction. However, refresh can utilize significant resources, particularly memory, so you’ll want to ensure that you have significant resources to support the update frequency.

Finally, the load on the source data system should be considered. Refresh will pull a significant amount of data every time it is run.

Caveats aside, this new flow action is a welcome relief to those that need greater control of how their reports are updated.

Fixing Power BI Report Builder Connection Errors

Power BI Report Builder is Microsoft’s design tool for building Paginated reports in Power BI. It is based on Microsoft Report Builder (formerly SQL Server Reporting Services Report Builder), but has been optimized for the Power BI service.

One of the most important capabilities of Power BI Report Builder is the ability to connect to datasets that have been published to the service. If you have done this, and spent any significant amount of time building reports, you may have come across some puzzling connection errors that are caused by the same thing.

After initially creating a connection and building a “Paginated dataset” (not to be confused with a Power BI service dataset), and then spending some time designing your report, when you select the “Run” option from the ribbon, you may be presented with the “Failed to preview report” error shown at the top of this article. Selecting the details button reveals more information:

A similar error can be found under the same conditions when editing a Paginated dataset’s query with the Query designer tool. Selecting this tool can result in the error “Unable to connect to data source xxxxxxxx”, and the details button reveals another “Unauthorized” error.

What’s worse in this case is that when you select OK, a dialog box appears prompting you to enter a set of credentials.

There are no combination of credentials that you can enter that will fix the connection to the data source. This dialog box was designed for classic paginated connections, not for connections to published data sets. You should select cancel if you see this dialog box.

What is happening in both of these cases is that the token acquired from the Power BI service has expired, and Report Builder does not automatically fetch a new one. There are a couple of ways to deal with this problem.

If you have saved the RDL file to a local file system, you can close Report Builder and reopen it. That will re-establish the data connection. You could also choose to save the RDL directly into the Power BI workspace. This will also re-establish the connection. You can do this by selecting File – Save as and selecting Power BI Service.

You can then choose which workspace to save the file in. This also removes the need to upload the file into the service when you want to publish it – saving and publishing are the same thing in this scenario.

If you are editing a file directly in the service, these errors will still appear after periods of no data retrieval activity, but the connection can re-established simply by saving the report. You can look at the errors as a way of prompting you to save your work .

Connecting Power BI workspaces and SharePoint sites

Power BI V2 workspaces recently (May 2019) entered into general availability. The biggest difference between a V1 and V2 Power BI workspace is the fact that a V2 workspace is not backed by an Office 365 group, and a V1 workspace is. One area that this change affects a great deal is the “Get data” experience in the Power BI service (browser). This post outlines the differences, and describes the configuration options.

Data connections to files stored in SharePoint and OneDrive have certain unique characteristics when they are created in the browser. For example, these connections are automatically refreshed hourly unless that option is disabled.

V1 workspaces automatically offer the connection to the Documents library in the underlying SharePoint site. V2 workspaces do not automatically offer this option, as there is no underpinning SharePoint site. However, any V2 workspace can be connected to any Modern SharePoint site, and in this way, the option is more flexible. For the sake of clarity, a Modern SharePoint site is one that is backed by an Office 365 Group, and has an email address.

Let’s explore the 4 possible experiences when using “Get Data” and then choosing “Files” in the Power BI service. There are 4 possible experiences, depending on the type and configuration of the workspace;

  1. Personal workspace
  2. V1 workspace
  3. V2 workspace not connected to a site
  4. V2 workspace connected to a site

In each example below, the options are reached by selecting “Get Data” and then choosing “Files”. The type of files that can be imported are CSV, Excel, PBIX (Power BI Desktop files) and RDL (paginated reports).

Personal workspace

The personal workspace is the only workspace available using the free Power BI license. It is not connected to any SharePoint sites, and provides 4 options for importing.

“Local File” can be used for importing files from a local file store. Files imported in this manner are not automatically refreshed, and without the use of a gateway, cannot be. This option is available for every workspace type and will not be discussed further. “Learn about importing files” is a simple help link, likewise available to all workspace types.

OneDrive – Business connects to the currently logged in user’s OneDrive for Business storage. This is the OneDrive that is associated with “School or Organization” account which is stored in Azure Active Directory.

OneDrive – Personal connects to a user’s personal, or consumer OneDrive account. This is the type of OneDrive that is accessed using a “personal” account (otherwise known as a Microsoft account, or MSA). The personal workspace is the only type of workspace that allows a connection to personal OneDrive content.

SharePoint – Team Sites allows files stored in any SharePoint Online library to be loaded. Files stored in SharePoint on-premises can be loaded into Power BI, but only through Power BI Desktop. This method is online only.

Data imported in this fashion will be updated hourly with the exception of “Local File”. This will also be true of any OneDrive or SharePoint source referenced below.

V1 workspace

A Power BI V1 workspace is connected to an Office 365 Group, and therefore backed by a SharePoint site. This is reflected in the Files experience in the service.

Here we see 3 import options. Local File, SharePoint – Team Sites, and “Learn about..” are exactly the same as with personal workspaces. However, both OneDrive options from there are unavailable. The “OneDrive – XXXX” option is different, and bears some explanation.

In the image above, “Demos” is the name of the V1 workspace. Selecting this option will open the SharePoint library named “Documents” in the SharePoint site that is associated with this workspace and Office 365 group.

In my opinion, this option is poorly named, which leads to confusion. This container truly has nothing to do with OneDrive – it is a SharePoint library. We already have enough different “OneDrives” to keep track of, but I digress.

V2 workspace (not connected to a site)

The V2 workspace is not associated with a SharePoint site, and therefore, there is no Documents library to connect to. The option is instead replaced with the ability to connect to the user’s OneDrive for Business (OneDrive – Business) storage, as in the personal workspace. In essence, this experience is identical to the personal workspace experience minus the ability to connect to personal OneDrives.

V2 workspace (connected to a site)

Although a V2 workspace is not inherently connected to a SharePoint site, it can be manually connected to one. This restores the capability missing from V1 workspaces, while being more flexible. The workspace is no longer bound to a specific site, but can be configured to work with any Modern SharePoint site. In addition, the same site can be bound to multiple workspaces.

The “Modern” distinction above is important. The SharePoint site itself must be backed by an Office 365 group, as that is how it is identified in Power BI.

Associating a workspace with a SharePoint site

With V2 workspaces, site connection is now a property of the workspace. To edit workspace properties, select either the workspace settings button in the ribbon, or the ellipsis beside the workspace in the workspace list.

The connection setting is in the advanced section, and is identified as the “Workspace OneDrive”.

The important thing to note here is that you do NOT enter the URL of the SharePoint site in this field. This field is expecting the address of it in email format (ie demos@xxxx.com). All Modern Sharepoint sites are bound to an Office 365 group, and the email address is the address of that group.

Get Data – File options for a V2 connected site

Once connected, the “Get Data” – “File” options will be much the same as with an unconnected workspace, but with the “OneDrive – SiteName” option added.

I still take exception with the name presented above, in my opinion it should be “Site – SiteName” or “SharePoint – SiteName site” and use a SharePoint option. However, once connected files in the connected site can be imported easily into the Power BI service.

Usage

It is important to understand what the connected site is used for in Power BI. Connecting a site allows for files stored in a SharePoint library to be either imported into the service (all supported file types), or connected to (Excel files). This feature does NOT allow Power BI content to be stored in a SharePoint library

Connect to Power BI dataflows and datasets using Power BI Desktop with multiple accounts

Image result for multiple accounts

Power BI datasets and dataflows are the two native data sources for Power BI reports. Connecting to a datasets allows a report to be built against an existing Power BI dataset in place, and dataflows represent a source of data that has had transformations applied to it. When connecting to Power BI dataflows, data is imported, into a data model but the connection to a Power BI dataset is a direct connection.

The two sources handle identity in drastically different ways, and this can lead to confusion when dealing with multiple accounts and tenants. This post is an attempt to help clarify this confusion

Connecting to a dataset

To connect to a Power BI dataset, select the “Get Data” button from the ribbon, select the “Power BI” tab, select Power BI dataset, and finally the “Connect” button.

Connecting to a Power BI dataset

Next, if a user is signed into Power BI Desktop a list of workspaces and their datasets are presented. If not, the user is prompted to sign in. The dataset can then be connected.

Selecting a Power BI dataset

The important thing to notice here is the list of workspaces itself. The list presented is a list of the workspaces available in the tenant belonging to the currently signed in user. It is the same list of workspaces that can be chosen as a publishing target. It should also be noted that the identity of the user is displayed in the upper right corner of the dialog box, and the identity can be changed directly from there.

Connecting to a dataset in a different tenant

The signed in user can be changed by selecting “Sign in” at the upper right of the Power BI Desktop client, or within the connection dialog itself. If the user signs into a different account (in a different tenant), a different list of workspaces and datasets will then be offered. The dataset source is hard linked to the currently signed in user. In this way, the Power BI dataset source behaves differently than all other data sources, which maintain connection credentials separately.

Connecting to a Dataflow

Connecting to a dataflow follows the same steps as a dataset, with the exception that the “Power BI dataflows” option is chosen.

Connecting to a Power BI datflow

At this point, a Power BI data connection dialog will be shown.

Signing in to a Power BI dataflow

There is only one authentication option because Power BI dataflows only support one authentication option.

Unlike datasets, dataflows are NOT linked to the currently signed in user. The connection is authenticated, not the current user. The “Sign In” button must be selected, and authentication completed to connect to a Power BI dataflow.

Once signed in, selecting the “Connect” button will display a list of workspaces that contain dataflows. Expanding the workspace and then the dataflow will expose a list of entities that can be imported into the Power BI data model.

Selecting a Power BI dataflow

The connection information for the dataflow is cached with Power BI Desktop, and subsequent connections to dataflows will not require the user to sign in. The same authentication credentials will be used.

it should be noted that unlike the dataset connection dialog, this one does not show the current credentials and does not allow those credentials to be changed. This makes the process of changing credentials to use dataflows in multiple tenant somewhat less than intuitive.

Connecting to a dataflow in a different tenant

With datasets, changing the currently signed in user will result in a different set of datasets being presented when the dataset option is chosen. This is different with dataflows. No matter what user is currently logged in, the cached credentials will be used.

This behaviour can be confusing when multiple tenants need to be accessed. With most other data sources, the cached credentials are linked with the specific data source. For example, when two different SQL databases are connected, Power BI caches two different sets of credentials.

To connect to dataflows in a different tenant, the current connection information needs to be cleared. This can be done with any data source, but it is particularly important to dataflows as it is the only way to switch connection credentials.

To clear the credentials for the dataset, select “File”, “Options and Settings” and the “Data Source Settings”. The Data source settings dialog will then be presented.

Power BI Data source settings
Clearing the credentials for a Power BI dataflow

Unlike most other data sources which can have multiple entries in the list, one for each unique data source, there will only be one source for dataflows. It is named “Power BI dataflows. For example, if the current instance of Power BI Desktop has authenticated to 3 different SQL servers, there will be three SQL Server connections in this list, but there will only be one for dataflows, no matter how many tenants that have been connected.

To switch tenants, the current credentials must be either cleared, or edited. The cached credentials can be fully removed by selecting “Clear Permissions” or they can be changed by selecting “Edit Permissions”. If cleared, the user will be prompted for credentials the next time the dataflow option is selected. If edited, the new credentials will be stored.

Conclusions, and recommendations

It is possible to work with multiple tenants for both connected datasets and dataflows. However, the methods for doing so are completely different for either option. This can obviously lead to some confusion.

It is my opinion that this behaviour should be changed, and that the behaviour or connected datasets is the more intuitive. If the credentials for the currently logged in user were user for both types of connection, it would be much more intuitive, and also easier to user for report designers.