Working with Excel Files in Power BI V2

In the beginning, Excel was at the center of Power BI. In the “V1” version of Power BI, the entire product was all about enabling analytics in the cloud, and Excel/Excel services was the delivery vehicle for those analytics. The authoring tools were all Excel add-ins, and the service revolved around updating data models in Excel, and allowing you to work with data models greater than 10 MB. Power BI “V2” changed that focus completely, completely removing that dependency on Office 365 and Excel. However, all of the Excel goodness that was within the “V1” product is still there, it just may have moved around a bit, and it works a little differently now.

In the early days of Power BI V1, I wrote up a post about the limitations of the product, specifically the file size limitations. Reviewing this article recently, I was actually struck by how little has changed. The maximum data model size remains 250 MB. In fact, 250 MB is the maximum size of any data model in Power BI, whether or not it originated in Excel. What has changed, rather drastically is the means of enabling this 250 MB limit for an Excel file, which we’ll get to below.

Let’s have a look at some of the major differences between Power BI “V2” and Power BI “V1” as it pertains to Excel.

More than just pretty face

In V1, Excel was almost always used as a means of presenting data. It was possible to use Excel as a data source, but the presentation of that data would inevitably be through another Excel file.

In V2, Excel is far more commonly used as a data source. The Power BI Designer, or the web interface can connect to Excel files, and then import the data into data models stored in the service. Native Power BI visualizations are then used for data presentation. These Excel files can be local, in OneDrive, or in OneDrive for Business. If the files are in OneDrive, or OneDrive for Business, the data models can be automatically refreshed when the source workbooks change. However, we can continue to work with Excel as a presentation mechanism.

If the Excel file is stored in OneDrive for Business (as opposed to OneDrive), you can connect the Power BI interface to the workbook in place. Once connected, the workbook can be viewed and interacted with through Excel Services right within the Power BI interface. To do this, from the dashboard, select “Get Data”, choose Files, select the OneDrive for Business option, select the desired workbook, and finally, select the “Connect” button”. You will then be presented with two options.

Importing data from Excel vs connecting to a workbook in place

Note that these two options will ONLY appear if the repository for the workbook is OneDrive for Business. Selecting “Import” will import the data contained in the file into a service based data model. This is the operation that will occur for all other repositories, and it uses Excel as a data source. Selecting “Connect” however connects to the workbook in place, and it adds the workbook as a report to the Power BI user interface.

Excel Workbook in the Power BI section

The workbook appears in the Reports section with a small Excel icon beside it. To view the workbook in Excel Services, click on the ellipsis to the right of the name, and select View. The workbook should load in a new window in full fidelity, and allow interaction.

No artifacts are created in the Datasets or Dashboards sections, the workbook is a self-contained unit. This is important because personal sharing can only be done through Dashboards. Therefore, Excel Services based reports can only be shared through Office 365 Groups.

Office 365 Groups

One of the biggest changes that Power BI “V1” users will notice (and need to deal with) is the fact that in order to work with Power BI in the same manner, Excel workbooks MUST be stored in a OneDrive for Business repository. With “V1”, a workbook could be stored within any SharePoint Online repository, and enabled for use with Power BI. Therefore, all “V1” users will need to move these workbooks into OneDrive repositories before the “V1” service is deprecate (Dec 31, 2015).

Every Office 365 user gets a OneDrive repository by default, and these repositories work just fine for personal use, but most current users that are using SharePoint Online will have workbooks in shared libraries in a collaboration environment. Office 365 Groups also each have their own OneDrive for Business repository, and all group members have access to that repository. Power BI V2 fully supports Groups, so this is the logical place to store all of the “V1” workbooks for collaboration purposes.

Connecting to a workbook in a Groups OneDrive is identical to the process above, but first, you need to navigate to the Group’s context in the Power BI UI. This is done by clicking on the My Workspace button (and not entirely obvious).

Groups Selector

Simply select the group and you will be working in that Group’s context.

Increased File Size

One of the biggest benefits of the original Power BI “V1” was the ability to work with workbooks that contained data models larger than 10 MB. The way that this was done was by “enabling” the workbook for Power BI as I outlined in this article. The enablement mechanism no longer exists, but the benefits are still there. The process of connecting a workbook outlined above intrinsically enables the workbook for Power BI and increases the maximum model size from 10 MB to 250 MB. In addition, the workbook can also be opened and interacted with directly from OneDrive. It is also possible to share that workbook with others in your organization, but in order to open those large workbooks in a browser, those users will need a Power BI licence.

Workbook Refresh

The ability to refresh workbooks stored in the cloud from data stored on-premises was, at the time of its introduction, the most important feature of Power BI. Each workbook would be enabled for refresh through an administrative interface, and if the data source matched a registered data source, the service would call an on-premises Data Management Gateway, which would facilitate the refresh of the workbook. This worked, but was somewhat difficult to get set up. Power BI “V2” has simplified the process tremendously. This does however mean that the process has changed.

Most refreshes are performed on a Dataset, but as mentioned above, an Excel report is self-contained, so the refresh options are available from the report itself. Simply click on the ellipsis to the right of the report and select “Schedule Refresh”.



Report refresh options

The first time this is done, you will need to enter the Data Source credentials. Once entered these will be used for subsequent refreshes. This will also need to be completed before the report can be refreshed on demand. You can also schedule the refresh time here.

If the data source is a supported cloud source, no further steps will be necessary. If it is on-premises, then it is necessary to install the Power BI Personal Gateway. This does not require administrative permissions to run (although it’s best if you do), nor does it require any special permission on the service side. The Personal Gateway is meant to be just that – personal and easy to use.

Hopefully this covers most of the major differences of working with Excel workbooks in V2 of the service. It looks like a big change, but most things are still possible, and some things significantly enhanced.

Advertisements

The New Power BI Personal Gateway – Do I Need It?

Last week, Microsoft released the Power BI Personal Gateway. The Personal Gateway lets you keep dashboards created in the new Power BI Dashboard service updated with data from your on-premises data sources. This is important – nobody wants to manually refresh data all of the time. However, the service already updates many data sources updated automatically – when is this tool necessary? Also, there is already a refresh tool available for Power BI called the Data Management Gateway – what’s the difference between these two tools, and when would I use one versus the other? This post is an attempt to answer these and a few other questions.

To set the stage, we need to distinguish between the original Power BI service (V1) and the Power BI service released on July 24 2015 (V2 or Power BI Dashboards). The V1 service runs (or ran, depending on when you read this) as an add-on to Office 365. Among other things, this service allows Excel files with embedded Power Pivot data models to be used from Office 365. The Data Management Gateway can be connected to the service to keep those workbooks refreshed with data on a periodic basis. The new V2 version of the service removes the dependency on Office 365 and Excel. It allows users to connect directly to their data, and to use Power Query, Power View, and (essentially) PowerPivot to transform it, visualize it, and create dashboards from it. In this new model. Office 365 is simply a repository for Excel files, which become a source for both data and reports, depending on how they are connected.

In addition to refresh capabilities, the new Power BI (V2) service supports direct querying of on premises and cloud data sources. This is significantly different than data refresh. In a live connection scenario, dashboard interactions are sent back to the data sources in real time where they are executed, and the visualizations are updated through the service in real time accordingly. In a refresh scenario, a data model that exists in the Power BI service is updated from a source on a periodic basis. This refresh has been the job of the Data Management Gateway, and is also the job of the new Personal Gateway.

Architecturally, the two services can be viewed as follows:

With this in mind, let’s answer a few anticipated questions

Will I need the Power BI Personal Gateway to do live query of on premises data?

No. The Personal Gateway, like the DMG, performs a data refresh of a model that is stored within the Power BI service. Live queries are executed against on-premises data models, so in this scenario, the Personal Gateway plays no part.

If I have the DMG, do I need the Personal Gateway?

The answer to this is that it depends. Although related, the two products do different things. The DMG is responsible for keeping the data models contained within Excel workbooks and stored in SharePoint online up to date. The refresh process in this case is the equivalent of opening the Excel workbook, selecting the Refresh All Connections button, saving it back and allowing he service to update the model stored in the service. The Personal Gateway has no workbook to update, it only updates the service based model. Therefore, if you do need to keep workbooks refreshed in Office 365, you will need to use the DMG. However, if instead you upload your workbook to the new “V2” service, you will need to use the new Personal Gateway.

Can I install the Personal Gateway and the DMG on the same machine?

No. The Personal Gateway is really an evolution of the original DMG and uses the same underlying code base. The two are incompatible and cannot be installed on the same machine. An attempt to do so will result in the following error:

If I have the SSAS Connector, do I still need the Personal Gateway to refresh data?

Yes, The SSAS Connector is a service that is installed on-premises to allow the Power BI service to perform live queries on SSAS servers. In order to keep data in a Power BI model up to date from an on-premises data source, the Personal Gateway is necessary. However, it is not currently possible to install both the Personal Gateway and the SSAS Connector on the same machine. In fact, if you attempt to do so, you will receive precisely the same error as above. The SSAS Connector is another variant of the original DMG.

Do I need to use Power BI Designer to create a refreshable model in Power BI “V2”?

No. While Power BI designer is one tool for doing this, it is not the only one. Models refreshable from on-premises data can also be created by using the Power BI user interface and connecting to Excel workbooks.

Will any data model created in Excel or Power BI Designer work with the Personal Gateway?

(note – this answer has been updated from it’s original to correct some inaccuracies. Thanks to Derek Rickard for pointing this out)

No. In order for a model to be refreshable by the Personal Gateway, it must have been created from a refreshable data source. This is a similar to the DMG which could also refresh some direct on premises data sources, but the difference is that Power Query was required to refresh anything but SQL Server or Oracle data sources. In Excel, a model can be created using PowerPivot, Power Query, or by the selection of appropriate options when importing data.

The following data sources are currently supported.

  • SQL Server
  • Oracle
  • Teradata
  • IBM DB2
  • PostgreSQL
  • Sybase
  • MySQL
  • SharePoint List
  • File (CSV, XML, Text, Excel, Access)
  • SQL Server Analysis Services Tabular models
  • Folder
  • Custom SQL/native SQL

Do I need the Personal Gateway to refresh data sources from the cloud?

No. As with Power BI “V1”, cloud based data sources can be refreshed directly from the service, with no need for a gateway. However, if your model contains data sources from both on premises and the cloud, a gateway will obviously be required. Also, as mentioned above, Power Query must have been used to acquire the data. Supported cloud data sources are:

  • Azure SQL Database
  • Azure Blob Storage
  • Azure Table Storage
  • Azure HDInsight
  • Azure Marketplace
  • Dynamics CRM Online
  • Facebook
  • Google Analytics
  • Salesforce Objects/Reports
  • OData feeds
  • Web (HTML & Web APIs)

Do I need to be an Administrator to run the Personal Gateway?

No. This is a major departure from the DMG. The DMG installed as a service, which requires administrator level permissions to do. In addition, Configuration of data sources at the service level required special permissions. The DMG was designed to be run by administrators. The new personal BI “V2” is designed to meet the needs of both individual users, and enterprises, and correspondingly, the Personal Gateway can be run by anyone. I suppose that the word “personal” in the name should be a bit of a hint.

At install time, the system is interrogated to determine the current user’s permissions. If the permissions are sufficient, the Personal Gateway installs itself as a service, allowing full unattended operation. If permissions are insufficient, the gateway installs itself as an application. When installed in this manner, the application must be running in order for any refreshes to occur. Obviously, the user must also be logged in.

 

I’ll add more Q&A to this post as needed over the coming weeks. The coming release of the new Power BI service promises to be exciting. For more details, check out the Personal Gateway release announcement.

Power BI Data Management Gateway 1.4 – Where is it heading?

I received a notice from my main Power BI tenant last night that a new version of the Data Management Gateway was available. The previous (1.2) version contained some very significant changes so I was understandably eager to have a look. I installed it, observed a relatively attractive setup interface, then opened the release notes to find out what else was new. Only four items were listed (from the release notes).

  • Unified binary that supports both Microsoft Azure Data Factory and Office 365 Power BI services
  • Refine the Configuration UI and registration process
  • Azure Data Factory – Azure Ingress and Egress support for SQL Server data source
  • Office 365 Power BI – Bug fixes

I had already observed number two, the new setup experience. Bug fixes, while absolutely necessary, aren’t necessarily something to write about, but I think that the other two items are. While they may not have immediate impact, my bet is that they will in very short order.

The key point here is that the gateway now supports the Azure Data Factory. There are many, many things that the data factory enables (Hadoop anyone), but the one that I feel is most relevant to Power BI today is the ability to connect directly to on premises data sources. That’s not quite how it’s been done until now.

Power BI for Office 365

In the context of Power BI as we’ve come to know it today, on-prem data refreshes are handled by the Data Management Gateway. On a periodic basis (daily at most) the service contacts the gateway, which in turn reruns all relevant queries. The resultant data is then uploaded to the service.

The service in turn packages the data and updates the host Excel workbook, and the model is transferred into a back end analysis server. Every transaction goes through the host Excel workbook.

Power BI Dashboards

If you’ve had a chance to see the preview of Power BI Dashboards, you may have noticed that it is not dependent on Office 365 or Excel at all. When you add a data source, you take the date and add it to a cloud based data model directly (presumably backed by SQL Server Analysis Services). All visualization work against these models, with one very important exception. If you connect to a SQL Server Analysis Services Data source you are actually connecting directly to a model hosted on an on-prem SSAS server in real time.

SNAGHTML5339485

How is this done? The connection is made through the “Analysis Services Connector”, which is a separate bit of software installed on prem to facilitate connection between the Power BI Dashboards service and the On-Prem SSAS server. It’s available directly from the dashboards portal.

image

After installing it, a process that establishes for dashboard and SSAS credentials, it can be reconfigured by running the “Power BI Analysis Services Connector” tool.

image

However, installation also adds another piece of software to the host machine. The Microsoft Data Management Gateway. This version of the DMG establishes the connection between the SSAS server and the Power BI service in real time. Up until now, the DMG didn’t work this way, so which version is it?

image

Until now, the most recent version of the DMG was 1.2, so this Dashboards preview contained a glimpse into the next generation Data Management Gateway that provided some intriguing new capabilities.

Coming Together

Checking into the latest version of the Data Management Gateway from Office 365, we see:

image

This version is newer that that included in the Dashboards Preview, and presumably includes everything from it. The key phrase in the release notes to me is therefore “Unified Binary”. One gateway to rule them all, if you will. Does this mean that we’ll be able to connect to on-prem data in real time from Office 365 as well as from the Power BI preview? I don’t know how, but I bet that the building blocks are now there.

The latest version may not include support for any new data sources, or any new bells and whistles, but it’s likely worth setting up for new capabilities that will hopefully show up sooner rather than later.

Limitations for Power Query OData Feeds in Power BI

One of the features available in Power BI is the ability to take any defined data source and expose it as an OData feed. This is a very simple and quick way to get your existing data available through OData, as it involves a simple check box selection. Complete instructions on setting this up can be found here. There are however a few limitations to what you are able to do that you should be aware of before you head down this path.

Intranet Only

The OData feed feature works through the Data Management Gateway, which is normally used to keep data models stored in the cloud updated regularly with new on-premises data. When a data source is registered, an “enable OData feed” option is made available which when checked, creates an OData feed URL.

When this feed is used, a connection is made directly from the OData client to the Power BI service, which then redirects communication to the Data Management Gateway. The reason that this is important is that because the actual data connection does not go through the Power BI service, the client machine needs to be able to communicate directly with the machine hosting the Data Management Gateway. This means that the OData feed only works on the intranet – it can’t be shared publicly. For now at least.

Data Types

The Data Management Gateway, and therefore the Power BI service don’t support all of the data types supported by SQL Server, or Oracle. If your table or view uses any field that is an unsupported data source, the entire table will be unavailable to use in an OData feed. The table will appear as greyed out when the list of tables to use for OData is being configured.

image

In the above case, the DistrictMaps table contained a geography field, which is unsupported. A complete list of supported data types can be found here. If you are using unsupported data types, you may want to consider creating views that do not contain these fields, and exposing those.

Data Sources

Up until recently (version 1.2), the Data Management Gateway only supported performing data refreshes from two on-premises data sources – SQL Server and Oracle, which constrained its value somewhat. Version 1.2 brought support for a wide variety of Power Query data sources, which really changed the game. Now, since OData feeds utilize the Data Management Gateway, we should be able to expose all sorts of data sources as OData feeds, right?

Wrong. Well, not quite at least. I received a question from Hrvoje Kusulja,  who was trying to expose DB2 data as an OData feed through Power Query, but the OData feed option was disabled. After some testing, and communication with Microsoft, I was able to determine conclusively that while Power Query queries are supported for OData feeds, the underlying Power Query queries MUST come from either SQL Server or Oracle. This is identical to the Power Query refresh support in version 1.1 of the Data Management Gateway. Unfortunately we couldn’t find documentation on this anywhere.

One potential workaround if you need OData support and your data source isn’t supported would be to use an ETL system (Integration Services) to pump data into SQL or Oracle, and create the query from there.

OData feeds is a great little feature, and a nice side benefit from using Power BI and the Data Management Gateway. As with any new product, it has limits that will undoubtedly be reduced in the future, but it’s important to know where they are.

Power BI as a Product Today

Recently, I have come across several situations where people are confused about where Power BI fits in a solution scenario. There is a fair bit of confusion as to precisely what the product is and what it does. The problem is that Power BI isn’t really a product at all, but instead a collection of different products and services. Adding to the confusion is the fact that some of these products require a Power BI license, while others do not. In fact some of these products are actually embedded in other products.

Power BI is Microsoft’s cloud based Business Intelligence solution billed as “Self service analytics for all of your data”. In reality, it’s a little more than self service, it also is a great solution for team BI as it’s based on Office 365. That’s all well and good, but what is it really? What does it consist of, and how does it work? If you look at the main product site for Power BI, it’s not immediately obvious at what you get when you purchase it, or what you need to run it. This post is an attempt to demystify the product.

To start, let’s break it down by its constituent components. Today Power BI consists of the following parts.

image

Unfortunately, this can be rather confusing from a product perspective. Looking first at the on-premises components, Power Query, Power View, and Power Map are all Excel plug ins. Excel is therefore a prerequisite for Power BI. All of these add ins also require (or in the case of Power Query, support) the embedded xVelocity data model, and therefore Power Pivot is a prerequisite. Power Pivot is included in Excel 2013 (Professional Plus), but it can also be downloaded for free for Excel 2010.

Also included in Excel 2013 is Power View, and, with Office 2013 SP1, Power Map. Power Query is downloaded separately, but is free. This is where much of the confusion arises. Due to the fact that these three add ins are included in the product definition of Power BI, it is often assumed that a Power BI license is required to use them. It is not. These products have a life of their own, and can be fully (or almost fully) used within Excel without any association with a Power BI license.

Power Query contains a few features that will only work with a Power BI tenant, mostly involved around the creation and maintenance of shared queries. Since this is part of the cloud service, this makes complete sense, but none of the other features of the product are in any way reduced in the absence of a license. Power View is enhanced through a Power BI license, but only because this makes Power View reports available within the mobile client(s). Indeed, Power Map has no use whatsoever of a Power BI license. Power Maps cannot be viewed at all within a browser – they are a client side feature only. In my opinion, they shouldn’t even be included under the Power BI umbrella, but that’s just my opinion.

Thus far, I have been talking about the modelling and visualization creation aspects of the tools, but what about pure consumption clients? The whole idea of power BI is that designers can create these models and users can interact with them. The workbooks containing these models are stored within Office 365, so do casual users need a license?

The answer is of course maybe. If these users are going to take advantage of any of the services specifically offered by Power BI, then the answer is yes. For example, any user can open a workbook in a browser in Office 365. However, if they want to interact with that model, by using a slicer, pivot table, etc, and that model is larger than 10 MB, then the answer is yes. Obviously, if the user wants to use the Power Q&A features, then the answer is also yes.

For the record, I don’t like this answer. To my mind, designers and content creators should require a license, but consumers should not. This would greatly encourage adoption of the product, so I do hope for some changes in this area.

So, precisely what do you get when you purchase a Power BI license? These are the things that you will absolutely need a Power BI license for.

  • Opening workbooks in a browser with models larger than 30 MB on Office 365
  • Interacting with (slicers, pivot tables, etc) workbooks in a browser with models larger that 10 MB on Office 365
  • Automatic refresh of on premises data
  • Sharing of Power Query queries
  • Refresh of Power Query queries
  • Power Q&A – Natural language queries
  • Power BI mobile application

and that’s it.

In fact, if you check out my earlier article “Whither Power Pivot for SharePoint”, you’ll see that many of the features of Power BI are already available in Power Pivot for SharePoint.

To my mind, the product “Power BI” should not include the Excel add ins, but only list them as a requirement, much like Excel itself is a requirement. This would help to reduce confusion. The next version of Power will support their inclusion. If you’re interested in this new version, you can sign up for the preview when it’s ready here. I’ll be writing more about that shortly.