Category: Power BI

Limitations for Power Query OData Feeds in Power BI

One of the features available in Power BI is the ability to take any defined data source and expose it as an OData feed. This is a very simple and quick way to make your existing data available through OData, as it involves nothing more than selecting a check box. Complete instructions on setting this up can be found here. There are, however, a few limitations that you should be aware of before you head down this path.

Intranet Only

The OData feed feature works through the Data Management Gateway, which is normally used to keep data models stored in the cloud updated regularly with new on-premises data. When a data source is registered, an “enable OData feed” option is made available which, when checked, creates an OData feed URL.

When this feed is used, the OData client first connects to the Power BI service, which then redirects communication to the Data Management Gateway. This matters because the actual data connection does not go through the Power BI service: the client machine needs to be able to communicate directly with the machine hosting the Data Management Gateway. This means that the OData feed only works on the intranet – it can’t be shared publicly. For now at least.
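Once a feed URL exists, any OData client on the intranet can query it. The following is a minimal Python sketch of building such a request. It assumes the feed honours the standard OData $select and $top query options; the host name, path and entity set are placeholders, not values from a real gateway.

```python
from urllib.parse import quote, urlencode


def build_odata_query(feed_url, entity_set, select=None, top=None):
    """Return an OData query URL for one entity set of a feed.

    feed_url and entity_set are hypothetical placeholders supplied by the
    caller; nothing here is specific to the Data Management Gateway.
    """
    base = feed_url.rstrip("/") + "/" + quote(entity_set)
    options = {}
    if select:
        # $select limits the columns returned by the feed.
        options["$select"] = ",".join(select)
    if top is not None:
        # $top limits the number of rows returned.
        options["$top"] = str(top)
    if not options:
        return base
    # Keep "$" and "," unescaped so the URL stays readable.
    return base + "?" + urlencode(options, safe="$,")


url = build_odata_query(
    "https://gateway.example.local/odata/",  # placeholder intranet host
    "Customers",                             # placeholder entity set
    select=["Name", "City"],
    top=5,
)
print(url)
```

Because the host in the URL is the gateway machine, a request like this can only succeed from a client that can reach that machine on the internal network.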

Data Types

Neither the Data Management Gateway nor, by extension, the Power BI service supports all of the data types supported by SQL Server or Oracle. If your table or view uses any field with an unsupported data type, the entire table will be unavailable for use in an OData feed. The table will appear greyed out when the list of tables to use for OData is being configured.

image

In the above case, the DistrictMaps table contained a geography field, which is unsupported. A complete list of supported data types can be found here. If you are using unsupported data types, you may want to consider creating views that do not contain these fields, and exposing those.
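The view workaround can be sketched as a small Python helper that generates the CREATE VIEW statement for you. This is an illustration only: the column names are invented, the "_OData" view suffix is an arbitrary choice, and the set of unsupported types contains just the one type confirmed above (geography), so extend it from the official list.

```python
# Assumption: only "geography" is confirmed as unsupported in this post.
# Add further types from the official supported-types list as needed.
UNSUPPORTED_TYPES = frozenset({"geography"})


def view_without_unsupported(table, columns, view_name=None,
                             unsupported=UNSUPPORTED_TYPES):
    """Build a CREATE VIEW statement that omits unsupported columns.

    columns is a list of (column_name, sql_type) pairs.
    """
    kept = [name for name, sql_type in columns
            if sql_type.lower() not in unsupported]
    if not kept:
        raise ValueError("no supported columns left to expose")
    view_name = view_name or f"{table}_OData"  # hypothetical naming scheme
    column_list = ", ".join(f"[{name}]" for name in kept)
    return f"CREATE VIEW [{view_name}] AS SELECT {column_list} FROM [{table}];"


# Hypothetical column layout for the DistrictMaps table.
district_columns = [
    ("DistrictId", "int"),
    ("Name", "nvarchar"),
    ("Boundary", "geography"),
]
print(view_without_unsupported("DistrictMaps", district_columns))
```

The resulting view, rather than the base table, is then the object you would select when configuring the OData feed.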

Data Sources

Up until recently (version 1.2), the Data Management Gateway only supported performing data refreshes from two on-premises data sources – SQL Server and Oracle, which constrained its value somewhat. Version 1.2 brought support for a wide variety of Power Query data sources, which really changed the game. Now, since OData feeds utilize the Data Management Gateway, we should be able to expose all sorts of data sources as OData feeds, right?

Wrong. Well, not quite at least. I received a question from Hrvoje Kusulja, who was trying to expose DB2 data as an OData feed through Power Query, but the OData feed option was disabled. After some testing, and communication with Microsoft, I was able to determine conclusively that while Power Query queries are supported for OData feeds, the underlying Power Query queries MUST come from either SQL Server or Oracle. This is identical to the Power Query refresh support in version 1.1 of the Data Management Gateway. Unfortunately, we couldn’t find documentation on this anywhere.

One potential workaround, if you need OData support and your data source isn’t supported, would be to use an ETL system (such as Integration Services) to pump the data into SQL Server or Oracle, and create the query from there.

OData feeds are a great little feature, and a nice side benefit of using Power BI and the Data Management Gateway. As with any new product, the feature has limits that will undoubtedly be reduced in the future, but it’s important to know where they are.

Power BI as a Product Today

Recently, I have come across several situations where people are confused about where Power BI fits in a solution scenario. There is a fair bit of confusion as to precisely what the product is and what it does. The problem is that Power BI isn’t really a product at all, but instead a collection of different products and services. Adding to the confusion is the fact that some of these products require a Power BI license, while others do not. In fact some of these products are actually embedded in other products.

Power BI is Microsoft’s cloud-based Business Intelligence solution, billed as “Self service analytics for all of your data”. In reality, it’s a little more than self service; it is also a great solution for team BI, as it’s based on Office 365. That’s all well and good, but what is it really? What does it consist of, and how does it work? If you look at the main product site for Power BI, it’s not immediately obvious what you get when you purchase it, or what you need to run it. This post is an attempt to demystify the product.

To start, let’s break it down by its constituent components. Today Power BI consists of the following parts.

image

Unfortunately, this can be rather confusing from a product perspective. Looking first at the on-premises components, Power Query, Power View, and Power Map are all Excel add-ins. Excel is therefore a prerequisite for Power BI. All of these add-ins also require (or in the case of Power Query, support) the embedded xVelocity data model, and therefore Power Pivot is a prerequisite. Power Pivot is included in Excel 2013 (Professional Plus), but it can also be downloaded for free for Excel 2010.

Also included in Excel 2013 is Power View, and, with Office 2013 SP1, Power Map. Power Query is downloaded separately, but is free. This is where much of the confusion arises. Because these three add-ins are included in the product definition of Power BI, it is often assumed that a Power BI license is required to use them. It is not. These products have a life of their own, and can be fully (or almost fully) used within Excel without any association with a Power BI license.

Power Query contains a few features that will only work with a Power BI tenant, mostly around the creation and maintenance of shared queries. Since this is part of the cloud service, this makes complete sense, but none of the other features of the product are in any way reduced in the absence of a license. Power View is enhanced through a Power BI license, but only because this makes Power View reports available within the mobile client(s). Power Map, by contrast, has no use whatsoever for a Power BI license. Power Maps cannot be viewed at all within a browser – they are a client side feature only. In my opinion, they shouldn’t even be included under the Power BI umbrella, but that’s just my opinion.

Thus far, I have been talking about the modelling and visualization creation aspects of the tools, but what about pure consumption clients? The whole idea of Power BI is that designers can create these models and users can interact with them. The workbooks containing these models are stored within Office 365, so do casual users need a license?

The answer is, of course, maybe. If these users are going to take advantage of any of the services specifically offered by Power BI, then the answer is yes. For example, any user can open a workbook in a browser in Office 365. However, if they want to interact with that model by using a slicer, pivot table, etc., and the model is larger than 10 MB, then they need a license. Obviously, if the user wants to use the Power Q&A features, then the answer is also yes.

For the record, I don’t like this answer. To my mind, designers and content creators should require a license, but consumers should not. This would greatly encourage adoption of the product, so I do hope for some changes in this area.

So, precisely what do you get when you purchase a Power BI license? These are the things that you will absolutely need a Power BI license for.

  • Opening workbooks in a browser with models larger than 30 MB on Office 365
  • Interacting with (slicers, pivot tables, etc.) workbooks in a browser with models larger than 10 MB on Office 365
  • Automatic refresh of on premises data
  • Sharing of Power Query queries
  • Refresh of Power Query queries
  • Power Q&A – Natural language queries
  • Power BI mobile application

and that’s it.
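The list above can be read as a simple decision rule. Here is a rough Python sketch of it, using only the thresholds and features named in this post; the enum and function names are my own invention, and the licensing terms themselves may well change.

```python
from enum import Enum, auto


class Action(Enum):
    OPEN_IN_BROWSER = auto()
    INTERACT_IN_BROWSER = auto()      # slicers, pivot tables, etc.
    AUTO_REFRESH_ON_PREMISES = auto()
    SHARE_QUERY = auto()
    REFRESH_QUERY = auto()
    POWER_QA = auto()
    MOBILE_APP = auto()


# Model size limits (in MB) taken from the list above.
OPEN_LIMIT_MB = 30
INTERACT_LIMIT_MB = 10


def needs_power_bi_license(action, model_size_mb=0.0):
    """Return True if the action requires a Power BI license."""
    if action is Action.OPEN_IN_BROWSER:
        return model_size_mb > OPEN_LIMIT_MB
    if action is Action.INTERACT_IN_BROWSER:
        return model_size_mb > INTERACT_LIMIT_MB
    # Every other listed feature is part of the Power BI service itself.
    return True


print(needs_power_bi_license(Action.INTERACT_IN_BROWSER, model_size_mb=12))
print(needs_power_bi_license(Action.OPEN_IN_BROWSER, model_size_mb=12))
```

A 12 MB model shows the awkward middle ground: an unlicensed user can open the workbook in the browser, but cannot click a slicer in it.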

In fact, if you check out my earlier article “Whither Power Pivot for SharePoint”, you’ll see that many of the features of Power BI are already available in Power Pivot for SharePoint.

To my mind, the product “Power BI” should not include the Excel add-ins, but only list them as a requirement, much like Excel itself is a requirement. This would help to reduce confusion. The next version of Power BI will support their inclusion. If you’re interested in this new version, you can sign up for the preview when it’s ready here. I’ll be writing more about that shortly.

Using Power Query with SharePoint Lists and Lookup Fields

As I’ve explained many times before, querying SharePoint data directly is a bad idea. The SharePoint data storage mechanisms simply aren’t designed for querying of any scale, hence the lookup limitations that have been imposed upon it. The best approach to querying SharePoint list data is to first load it into a data warehouse or data mart of some sort. However, both Reporting Services (SSRS) and Power Query support direct access to SharePoint lists. While I try to strongly dissuade people from doing this with Reporting Services, properly used, Power Query is a totally viable means of querying SharePoint list data.

Why is this? With SSRS, every query goes back to the data source for retrieval. Power Query is different – it’s analogous to SQL Server Integration Services, which is an ETL management product. It loads source data into a repository, in this case an embedded xVelocity (or Power Pivot) model, which can be considered a “personal data warehouse”. Queries against this mini data warehouse are fast, don’t rely on SharePoint retrieval mechanisms, and can be used quite effectively in reports.

There are a couple of subtleties to querying SharePoint list items with Power Query, and I will briefly walk through the process below.

With Excel open, click the Power Query tab, select “From Other Sources” and then select “From SharePoint List”.

image

Next, enter the URL for the SharePoint site (or subsite) that contains the list you wish to query.

image

If it is the first time accessing this site, you will be prompted for credentials. If your site is on Office 365, be sure to enter organizational credentials. If it is on-premises, use Windows credentials.

Once entered, you will be presented with a list of SharePoint lists in the Power Query Navigator window. Select the list that you wish to query, in our case, Announcements. When selected, click the edit button to edit the query.

image

The data, or a subset of it, will load into the query editor window. You will see all of the list item fields expressed as columns, for the most part using the correct data type. At this point you can remove any columns that are unnecessary, or filter out any undesired rows. There are a couple of SharePoint field types that bear special mention.

Lookup fields are a lookup into another SharePoint list. Internally, the SharePoint item stores this as an ID and display value, but Power Query gives you access to all of the properties of the related item as a one-to-one relationship. Essentially, what you can do is to flatten that relationship by incorporating the related item’s attributes.

If you scroll to a column of this particular type, you will see the value expressed as a hyperlink with the value “Record”. Clicking on it will drill down to one related record, but that’s not what we want to do. We want to expand the properties for all items in the list. The way to do this is to click on the expand icon in the column header. In our case, we want to expand the “CreatedBy” field. CreatedBy is a standard list field, of the Person type. Person fields are actually a special case of a lookup field, so they exhibit this behaviour.

image

Here, we are interested in retrieving the user’s name and mobile phone, so we deselect all of the other fields. A new column will be created for every expanded field in the format sourcefieldname.attributename.
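To make the resulting shape concrete, here is a small Python sketch of what expanding a “Record” column does to the rows. It only mimics the outcome and is not how Power Query is implemented; the attribute names (Name, MobilePhone, Email) and the sample values are invented for illustration.

```python
def expand_record_column(rows, column, fields):
    """Flatten a one-to-one lookup column into 'column.field' columns."""
    expanded = []
    for row in rows:
        record = row.get(column) or {}
        # Keep every column except the nested record itself.
        flat = {key: value for key, value in row.items() if key != column}
        for field in fields:
            # New columns follow the sourcefieldname.attributename format.
            flat[f"{column}.{field}"] = record.get(field)
        expanded.append(flat)
    return expanded


announcements = [
    {
        "Title": "Office closed Friday",
        "CreatedBy": {"Name": "Pat Example", "MobilePhone": "555-0100",
                      "Email": "pat@example.com"},
    },
]
flat_rows = expand_record_column(announcements, "CreatedBy",
                                 ["Name", "MobilePhone"])
print(flat_rows)
```

Only the selected attributes survive; everything else on the related item (Email here) is dropped from the result.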

image

Attachments are another special case. There can be multiple attachments for a single list item, which is a one-to-many relationship. The hyperlink is therefore “Table”. Clicking the expand icon in the header of this column brings up a similar dialog, but with an important difference: options are available to either expand or aggregate the related items.

image

Selecting expand will create a new source record for each related item, and the only columns that will differ will be the items selected from the related table (Name in our case). Aggregate will not create any new records, but will summarize the related fields. For numeric fields, they can be totalled or averaged, and for text fields they can be counted.
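The difference between the two options is easiest to see side by side. The Python sketch below imitates the outcome of each on a tiny list; it is not Power Query's implementation, the output column names are illustrative, and the handling of items with no attachments (one row with an empty value) is an assumption on my part.

```python
def expand_table_column(rows, column, field):
    """One output row per related item (one-to-many expand)."""
    out = []
    for row in rows:
        base = {key: value for key, value in row.items() if key != column}
        related = row.get(column) or []
        if not related:
            # Assumption: an item with no related rows is kept, with None.
            out.append({**base, f"{column}.{field}": None})
        for item in related:
            out.append({**base, f"{column}.{field}": item.get(field)})
    return out


def aggregate_table_column(rows, column):
    """One output row per source item, with a count of related items."""
    out = []
    for row in rows:
        base = {key: value for key, value in row.items() if key != column}
        out.append({**base, f"{column}.Count": len(row.get(column) or [])})
    return out


items = [
    {"Title": "A", "Attachments": [{"Name": "a.pdf"}, {"Name": "b.pdf"}]},
    {"Title": "B", "Attachments": []},
]
print(expand_table_column(items, "Attachments", "Name"))
print(aggregate_table_column(items, "Attachments"))
```

Expand turns two list items into three rows, while aggregate keeps two rows and reports the counts 2 and 0.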

Once ready, click “Close and Load” from the Query Editor ribbon, and the list data will load to either your model, or your workbook, depending on what your preferences are. Of course, I always recommend that you load to the model only.

Once loaded, any visualizations and queries will work against the model. The data can be refreshed at any point either manually, or automatically if using the Data Management Gateway. Keep in mind however that refreshes will operate against the source list.

Power BI Data Management Gateway 1.2 Changes the Game

Last week, a new version of the Power BI Data Management Gateway was released. If you’re unfamiliar with it, it is the Power BI component that allows for workbooks stored in the cloud to be refreshed on a regular basis with data that exists on-premises, or outside of the hosting center.

I’ve been using the gateway since its initial availability in preview form, and in my opinion, this is the most significant functionality change yet. Until this release there were a grand total of three possible data source types that could be refreshed. With this release, the total increases to 18 by my count (you could argue 22, but that’s plenty).

With past versions, I would write up a quick post on how it is configured, but that has been done, along with the complete list of supported data sources and a helpful video, in this blog post by the Power BI team. In addition, a very comprehensive (although amazingly already in need of update) white paper on hybrid data scenarios has just been published by Microsoft here.

The big change here is that this multitude of data sources is supported only for data retrieved by Power Query, and NOT by Power Pivot natively. There is absolutely nothing wrong with this, but it does require us to change our approach to using the Data Management Gateway a bit.

As I first mentioned in a post almost a year ago on Using the Data Management Gateway, and in a number of posts since, data connection strings needed to line up with Power Pivot connections. At the time, the only supported sources were SQL Server and Oracle (for the gateway) and Power Query wasn’t supported at all. Version 1.1 of the Gateway brought Power Query support, but only for those two supported data sources. With this release, the Power Query support includes not only all of the new data sources, but also the three original Power Pivot data sources (note: Power Pivot data connections can be found in the Power Pivot add-in UI, while Power Query connections are available on the data tab in Excel).

image

image

As of this release of the Data Management Gateway, there is almost no reason to use native Power Pivot connections any longer. My recommendation is therefore that unless there is a good reason for not doing so, you should try to use Power Query for all data acquisition tasks. It is quite clearly the way forward, and will only gain in supported capabilities. My suspicion is that Power Pivot connections will be retained for backward compatibility reasons only.

With that said, there are a couple of good reasons for using Power Pivot connections directly. One of these reasons is if your data source is online, whether it is SQL Azure, SharePoint Online, or Project Server online. With these data sources, a Data Management Gateway is not required for refresh to work from an embedded Power Pivot connection. However, if Power Query is used to access these sources, it is.

What this means is that for Power Pivot connections to these sources, a refresh allows the Power BI service in the cloud to directly access these data sources in the cloud. However, because ALL Power Query connections require the Data Management Gateway for refresh operations, a Power Query refresh operation will require all of the data to be first downloaded to the on-premises gateway, and then sent back up to the Power BI service in the cloud. While functional, this is hardly the most efficient approach.
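The routing described above can be summarised in a few lines of Python. This is just a restatement of the rules in this post as code; the string labels are my own, and I am assuming that a Power Pivot connection to an on-premises source always goes through the gateway.

```python
def refresh_path(connection, source_location):
    """Return the hops data takes during a refresh.

    connection: "power_pivot" or "power_query"
    source_location: "cloud" or "on_premises"
    """
    if connection not in ("power_pivot", "power_query"):
        raise ValueError(f"unknown connection type: {connection}")
    if source_location not in ("cloud", "on_premises"):
        raise ValueError(f"unknown source location: {source_location}")

    direct = ["source", "power_bi_service"]
    via_gateway = ["source", "gateway", "power_bi_service"]

    # ALL Power Query connections are refreshed through the gateway.
    if connection == "power_query":
        return via_gateway
    # Power Pivot connections to online sources skip the gateway.
    return direct if source_location == "cloud" else via_gateway


print(refresh_path("power_pivot", "cloud"))
print(refresh_path("power_query", "cloud"))
```

The second call is the inefficient case: cloud data is pulled down to the on-premises gateway and then sent back up to the service.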

Apart from this one small caveat, this version of the Data Management Gateway spells the way forward. Additional data source support should come fast and furious, and the Power Query focus means that we can start to rely on its powerful transformational capabilities without having to sacrifice refreshability (if that’s even a word….).

Changes to Data Loading Features in Power Query – May 2014

The Power BI team continues to deliver new features at a rapid pace. The May 2014 release of Power Query is no exception. There are quite a few new features packed into this release, which you can read about at your leisure, but I’m particularly interested in the ones pertaining to data loading, as I’ve discussed several of the limitations in this area in the past.

This release is major indeed. There are three significant changes to the data loading features in this build.

Configurable defaults for data loading

I’ve just posted an article on how to do this and why. To put it simply, the default data load behaviour is to load data into a worksheet in most cases. This leads to larger than necessary workbooks, to the point where they may not work properly with Power BI. Now you can change this default behaviour, which will be welcome for anyone doing serious data modeling.

Worksheet Size Warning

If you do decide that you want to load data to the worksheet, or you’re simply unaware of the issues, you will be prompted to consider loading to the data model once your data hits 10 MB, which is the maximum non-model workbook size in Power BI. Prior to this update, the user wouldn’t know that there was a problem until after they tried to use the workbook in Office 365.

Data Model Preservation

Prior to this update, it wasn’t possible to modify the query without losing all changes to the data model, or formatting in worksheet tables. This release of Power Query remedies this situation. You can now go back and make changes to your query without having to recreate the model.

Power BI is being built in a surprisingly collaborative way, with a large amount of input from the community. I’ve never seen this done to this extent at Microsoft, and it’s very good to see. I know that all of these features have been asked for and discussed by the community in the last few months, and here they are. Kudos to everyone involved, and keep it up. This product keeps getting better and better.
