Last week, a new version of the Power BI Data Management Gateway was released. If you’re unfamiliar with it, it is the Power BI component that allows for workbooks stored in the cloud to be refreshed on a regular basis with data that exists on-premises, or outside of the hosting center.
I’ve been using the gateway since its initial availability in preview form, and in my opinion, this is the most significant functionality change yet. Until this release there were a grand total of three possible data source types that could be refreshed. With this release, the total increases to 18 by my count (you could argue 22, but that’s plenty).
With past versions, I would write up a quick post on how it is configured, but that has been done, along with the complete list of supported data sources and a helpful video on this blog post by the Power BI team. In addition, an very comprehensive (although amazingly already in need of update) white paper on hybrid data scenarios has just been published by Microsoft here.
The big change here is that this multitude of data sources is supported for data retrieved by Power Query, and NOT by Power Pivot natively. The catch is that they’re only supported for Power Query queries. There is absolutely nothing wrong with this, but it does require us to change our approach a bit to using the data management gateway.
As I first mentioned in a post almost a year ago on Using the Data Management Gateway, and in a number of posts since, data connection strings needed to line up with Power Pivot connections. At the time, the only supported sources were SQL Server and Oracle (for the gateway) and Power Query wasn’t supported at all. Version 1.1 of the Gateway brought Power Query support, but only for those 2 supported data sources. With this release, the Power Query support includes not only all of the new data sources, but also the three original Power Pivot data sources (note: Power Pivot data connections can be found in the Power Pivot add-in UI, while Power Query connections are available on the data tab in Excel).
As of this release of the Data Management Gateway, there is almost no reason to use native Power Pivot connections any longer. My recommendation is therefore that unless there is a good reason for not doing so, you should try to use Power Query for all data acquisition tasks. It is quite clearly the way forward, and will only gain in supported capabilities. My suspicion is that Power Pivot connections will be retained for backward compatibility reasons only.
With that said, there are a couple of good reasons for using Power Pivot connections directly. One of these reasons is if your data source is online, whether it is SQL Azure, SharePoint Online, or Project Server online. With these data sources, a Data Management Gateway is not required for refresh to work from an embedded Power Pivot connection. However, if Power Query is used to access these sources, it is.
What this means is that for Power Pivot connections to these sources, a refresh allows the Power BI service in the cloud to directly access these data sources in the cloud. However, because ALL Power Query connections require the Data Management Gateway for refresh operations, a Power Query refresh operation will require all of the data to be first downloaded to the on-premises gateway, and then sent back up to the Power BI service in the cloud. While functional, this is hardly the most efficient approach.
Apart from this one small caveat, this version of the Data Management Gateway spells the way forward. Additional data source support should come fast a furious, and the Power Query focus means that we can start to rely its powerful transformational capabilities without having to sacrifice refreshability (if that’s even a word….).