Tag Archives: Excel

The Difference Between Reporting and Analytics is 42

In his novel “The Hitchhiker’s Guide to the Galaxy”, Douglas Adams envisioned a giant supercomputer named “Deep Thought” that was built to solve the answer to the ultimate question of life, the universe and everything. For the 5 people out there that are unfamiliar with the story, I’ll relate the important bits here. Deep Thought was commissioned by a race of pan-dimensional beings and required seven and a half million years to complete its calculations. When it was finally complete, Deep Thought informed the ancestors of the original creators that the answer was 42. The receivers were understandably disappointed with this response, and when they questioned Deep Thought further, the computer postulated that perhaps the problem was that they never really knew what the question was.

Undeterred, the race then commissioned a second computer (which happened to be the Earth) that would calculate the ultimate question. After a couple of 10 million year attempts, the ultimate question was determined to be “What do you get when you multiply six by nine”. Of course, Adams never claimed that the universe made sense.

To my mind, this is an excellent demonstration of the difference between reporting and analytics. The accurate answer (report) provided a result, but not meaning. Further analytics were necessary to determine context.

Like many information technology terms (Big Data, machine learning, CRM) Business Intelligence (BI) is one of those umbrella terms that many people use regularly without fully understanding its meaning. BI is comprised of many tools that help to glean information and insights from raw data. Thus, an ETL package that moves data from one location to another is just as much a BI tool as is a fancy looking infographic. Combine this lack of clarity with the overloading of the term “reporting, and we wind up with some real confusion in this space.

Reporting is the process of using data to highlight things or trends that have already happened. This can be contrasted with monitoring, which does the same for things that are happening now, and predictive analytics, which tries to predict what will happen in the future based on the same data. The difference between reporting and monitoring is only one of data latency, and as such, monitoring is often referred to as real time reporting, which further muddies the water. However, for the purposes of this article, I want to focus on historical reporting.

Reports are typically one of two types, either operational or analytical. Tools that are good at producing one type are typically not so good at producing the other. What’s the difference? Operational reports are designed to provide information that we know we need, and analytical reports are designed to help us discover things that we didn’t know, or to help answer unanticipated questions. Operational reports are typically designed to be printed. They are typically well paginated, pixel perfect, and provide a single view of the data within any given report. Analytical reports are just the opposite. They are designed with visuals as a starting point, but allow for the ability to pivot on or drill down into the data as appropriate to answer ad-hoc questions. Printing is typically a weakness for analytical reports, whereas drilldown is a weakness for operational reports.

Both report types have their place but they both have very different design point. The data that backs an operational report should ideally be relatively flat, as that best reflects the report layout and helps with performance. Conversely, cubes and data models exist simply because a flat data structure does not adequately support analytical reporting. With analytical reporting, a user may at any point decide to view quantitative data (a measure) through the lens of a different facet (dimension). This difference is so great, that we need a different type of engine to support it. OLAP cubes and tabular models are both examples of this.

Another difference is the data that is necessary to support both report types. Operational reports tend to concern themselves with various levels of subtotals per the predefined facets. In a case like that, the data mart that backs the report only needs to store those subtotals. The granularity, or resolution of the data stored in the data mart does not need to exceed that of report that references it. Analytical reporting is different. Since users will be expected to drill down on data, from on dimension to another, or to filter the data according to increasingly granular facets, it is critical to store all of the data in the data mart backing the data model. We don’t know the level of resolution the analyst will need; therefore, all detail is required.

As a simple example of this, consider the case where we want to analyze some server log data over a period of time. We can pre-aggregate the data in the data model such that it stores the total of the log entries of various entries on a daily basis. There would need to be a total based on each dimension, but the overall data storage would be less than for the raw data. Such data would allow an analyst to spot trends over several days, but the decrease in resolution means that it will be impossible to spot any usage trends within a given day. If daily trends will never be necessary, then this doesn’t matter, but the nature of analytical reports means that the designer can never be sure.

The more that the source data for the report is pre-aggregated, the less that report becomes analytical in nature, and the more it approaches operational. This is regardless of the tool used; you can build either report type with any tool, it’s just that it may not be optimal.

The issue here is one of semantics. Semantics however are important in knowing what you are getting if reports are being provided to you. Calling something “Analytics” does not make it so. If you spin up a content pack in Power BI, and find that the underlying data model provides just enough dimensions and measure to construct the provided report, and that you can’t deconstruct the data in any meaningful way, what you have is a report, not analytics, no matter what the platform. As with anything, there is a trade-off between complexity and power. Given the nuances of this topic, it’s important to look under the hood to know what you are getting.

The answer “42” is perfectly acceptable if you already knew that the question was “what is 6×9?”. But if you want to know why, that takes a little more digging. You’d also know that there might be a data problem…

Power BI Analyze in Excel – The beginning of a beautiful thing?

One of the announcements made at the Microsoft Data Summit in this past March 2016 was the availability of Analyze in Excel. This feature allows an Excel workbook to connect to a data model that is stored in the Power BI service, and to use it to analyze the data contained within. With this approach Excel is not importing data, or at least it is not importing any more data than the query results. It is exactly like connecting data to SQL Server Analysis Services data sources, something that Excel users have been doing for years. Well, to be completely accurate, it’s not LIKE connecting to SSAS, it IS connecting to SSAS. The only difference is that in this case SSAS is in the cloud. This feature significantly enhances the utility of the Power BI Service, and is important for several reasons that may not be all that obvious. I’d like to walk through a few of them, but let’s start with the obvious.

Excel is a very powerful analytical tool

As nice, and as attractive as Power BI visuals are, Excel still rules the roost when it comes to doing advanced analytics. Excel has been doing this for years and is very mature. It supports features such as pivot tables, pivot charts, and drill through to data, where Power BI reports still do not. The lack of these features can be a blocker for Power BI on its own, but if the data models in Power BI can be analyzed with Excel, suddenly a move to Power BI is not an either/or decision – you can have it both ways. You can deploy models and reports into Power BI and take advantage of all the goodness there, but you can also connect with Excel when the deep analysis is needed. With Analyze in Excel, you can have it both ways.

A wider audience for your data models

Very often, the person that builds the data model is the same person that does the analysis. This is the nature of self-service analytics. In the past when the only multidimensional analysis tools were OLAP cubes and connected Excel workbooks, cube design was a specialized skill. The cubes were published and users would use them as is. The advent of the data model (Power Pivot) and self service analytics lowered the skills bar so that analysts could acquire data, model it and analyze it, making the entire process much quicker and responsive. However, this still puts the model out of reach for those with no data modeling skills or interest.

Analyze in Excel provides the best of both worlds. Analysts can build models and reports in Power BI, and users that need more in depth analysis can connect to them with Excel without having to reinvent the wheel. This in effect provides the same capability that PowerPivot for SharePoint does on premises. One model can now reach a much wider audience of users. This has many of the benefits of an on-premises SSAS deployments without all of the organizational overhead of getting them up and running.

Uses the Analysis Services OLEDB Driver

The connection that is made from Excel to the Power BI services uses the latest version of the classic Analysis Services OLEDB driver. This is the driver that has always been used to allow Excel (and other tools) to communicate with Analysis Services, and this new version has been updated in order to work with the cloud based SSAS service. In fact, in order to use the feature, you must first download and install the updated driver. Therefore, in theory, any tool that uses this driver should be able to communicate with Power BI models as if Power BI was one great big SQL Server Analysis Services server (because it is).

It really is Analysis Services in the cloud

The Power BI service itself is backed by tabular mode SSAS. Until now, it was necessary to go through the service to access it. Analyze in Excel is the first instance that I know of that a client application communicating directly with that SSAS instance. While this connection is really using the Power BI API, it does beg the question – can a fully Platform as a Service version of Analysis Services be very far away?

Claims based authentication and Power BI API

None of the products in the SQL Server suite currently supports claims authentication. This is true even for the yet unreleased SQL Server 2016. Even SQL Azure, a cloud based version of SQL Server, requires SQL authentication only (although Azure Active Directory authenticated databases are currently in preview). However, looking at the connection string contained in the ODC file used by the Analyze in Excel feature reveals some interesting things. Here’s one connection string:

<odc:ConnectionString>Provider=MSOLAP.7;Integrated Security=ClaimsToken;Identity Provider=AAD;Data Source=https://analysis.windows.net/powerbi/api;;Initial Catalog=xxxxxxx; ……..

The value MSOLAP.7 for the provider indicates that this is the next version of the SSAS OLEDB Driver. No surprises there, but this does hint at future compatibility (see SharePoint below). The value for Integrated Security, and Identity Provider (ClaimsToken and AAD) indicate that it is leveraging Azure Active Directory Claims authentication. We therefore have a version of SSAS that can use Claims based authentication. This isn’t available to on-premises installations, but given that the capability has been built, I imagine that it is not all that far away.

Finally, the Data source indicates that the Power BI API is being used to marshal all communication with the back end API service. I think that it is reasonably to conclude that any API for a PaaS based version of SSAS would be based on, or strongly resemble the Power BI API. They may even be one and the same.

Excel Online in SharePoint

As anyone that has set up PowerPivot for SharePoint can tell you, SharePoint supports the configuration of new OLEDB drivers. This support carries forward into the Office Online Server in the world of SharePoint 2016. Given that both SharePoint and OOS utilize claims based authentication, it should theoretically possible to create a workbook that uses the Analyze in Excel feature, store it in SharePoint, and have it work for multiple users from within a browser. I imagine that more plumbing is needed at this point, but it would be an interesting way of integrating Power BI in the cloud with SharePoint both on premises and Online.

Reusing the Excel Files, and Limitations

In the same vein as discussed with SharePoint, Power BI itself allows Excel files to be interacted with in the service in exactly the same manner that Excel Online does. Theoretically, one should be able to use Analyze in Excel to build a workbook, then connect it to Power BI and have it work for interaction. While it is possible to connect it, all interactions fail at the moment. It appears that the Power BI service (or the backing Office Online service) does not yet support the new OLEDB driver.

Another current limitation of this feature is that data sources using Direct Query (this includes SSAS sources) or sources created by the Power BI API cannot be used with Analyze in Excel. At least not yet.

Analyze in Excel is another useful tool in the Power BI arsenal, but as outlined above, I think that it’s a harbinger of even greater things to come.

Analyze in Excel – Setup and initial issues

Analyze in Excel is an extremely important new Power BI feature for reasons that I’ll outline in more depth in an upcoming post. If you want to try this feature for yourself, there’s a good possibility that you’ll hit one of the errors that I ran into. This post will hopefully help other intrepid pioneers past the hurdles and get working with this fantastic new feature.

Initialization of the Data Source failed

In order to enable the Analyze in Excel feature, Excel must use the Microsoft AS OLE DB Provider from SQL Server 2016. This driver supports claims based authentication for SSAS, which is what Power BI uses. Observant readers will notice that I just used “claims based authentication” and “SSAS” in the same sentence, but I digress. What’s that you say? SQL Server 2016 has not yet reached RTM? That’s OK because you can use the driver from the Release Candidates, which can be obtained from the SQL Server 2016 Feature Pack directly. More commonly though, you’ll get it from Power BI user interface. There it can be acquired in one of two ways.

You can proactively download the updates from the Power BI “Download Center”, which is the little down arrow in the upper right of the toolbar:

Clicking the “Analyze in Excel updates” button will download the installer for the driver, which you can run right away.

Running the file takes you through the installation of the driver, which is simple and wizard driven.

Once installed, you may then navigate to a data source and launch “Analyze in Excel”.

At this point you will be prompted for two things. Firstly, you will be prompted to download and launch an ODC. This is the connection file that will be used to connect your Excel client to the SSAS service that your data source is housed in. If you’ve already installed the new driver, you can just go ahead and run it, and Excel will launch. If you haven’t you can take advantage of the second (simultaneous) prompt, which is to download the driver. This prompt will appear whether or not you have already installed it.

This approach is a bit different. This dialog has a big yellow box that just begs to be clicked, and it will install the 32-bit version of the driver. You can also take the high road and install the 64-bit version which of course you’ll need if your Excel is 64 bit. After all, anybody serious about doing data in Excel is using 64 bit Excel, right? In any event that’s the difference between the second approach and the first. The first approach does not give the option, it just goes ahead and uses the 32-bit version.

I of course originally opted for the first option, and whenever the ODC launched, and Excel opened, I received the error “Initialization of the Data Source failed”, which is hardly intuitive.

The issue of course is a mismatch between the bit level of the driver, and the bit level of Excel. The solution to this problem is quite simple. The 32-bit version needs to be removed, and the 64-bit version installed. You’ll find the offending package in listed in Programs and Features as “Microsoft AS OLE DB Provider for SQL Server 2016”, but it makes no mention of the bit level.

If you happen to have both installed (you can) the only way to tell the difference between the two is that the 64-bit version is that the 64-bit version is about twice as large. This distinction also holds for the installer files – they are named the same but the 32-bit version is approximately 29 MB, and the 64-bit version weighs in around 62 MB.

Excel cannot find OLAP cube Model

About a week ago I completed a small IoT project that takes data from several weather stations and pumps it into a number of sources, one of them being Power BI. Writing data directly to a Power BI data model through the API (which is what Azure Streaming Analytics does) introduces a number of idiosyncrasies into the mix, one of them being that the data can only be updated from the API. Apparently another is that you can’t use the Analyze in Excel feature with it. The error that you get when you try to do so is:

You also cannot use Analyze in Excel with any Direct Query data sources or on-premises SSAS through a gateway. The solution to this is apparently patience – it’s currently not supported. I can only presume that it is coming soon, but for now, you can stop beating your head against the wall, it won’t work.

The HTTP server returned the following error: Forbidden

This one sounds pretty ominous, doesn’t it? It also doesn’t make sense. You needed to be logged in in order to find the “Analyze in Excel” button in the first place, but this error is indicating that you don’t have access. This problem occurs when you have multiple AAD (Organizational) or Microsoft accounts, and you are maintaining a connection to one that does not have access to the data source. It’s particularly galling, and difficult to remedy if you have an AAD account and a Microsoft account that use exactly the same email address. I know, because I do. In order to fix this, you need to force the connection to logout, but the only way to do that currently is to modify the ODC file.

An ODC file is simply an XML file that can be edited with any text editor. Once open, you search for the connection string (<odc:ConnectionString> and add the desired user ID to it by adding “User ID=account” as in the example below:

Launching it after the edit will force the logout of the previous connection and you should be presented with a login screen for the correct one. Unfortunately, in the case like mine where the two accounts are named identically, this does not work. What you must do is to use a different account first. This will force the logout. Once that is done, you can add the correct account, or remove the User Id section altogether. Subsequent launched will force the login, where you can choose the correct login type, and the feature will work.

These few tips should help you get up and running with Analyze in Excel, and shortly I’ll be discussing the reasons that I think this feature is such a big deal.

Working with Excel Files in Power BI V2

In the beginning, Excel was at the center of Power BI. In the “V1” version of Power BI, the entire product was all about enabling analytics in the cloud, and Excel/Excel services was the delivery vehicle for those analytics. The authoring tools were all Excel add-ins, and the service revolved around updating data models in Excel, and allowing you to work with data models greater than 10 MB. Power BI “V2” changed that focus completely, completely removing that dependency on Office 365 and Excel. However, all of the Excel goodness that was within the “V1” product is still there, it just may have moved around a bit, and it works a little differently now.

In the early days of Power BI V1, I wrote up a post about the limitations of the product, specifically the file size limitations. Reviewing this article recently, I was actually struck by how little has changed. The maximum data model size remains 250 MB. In fact, 250 MB is the maximum size of any data model in Power BI, whether or not it originated in Excel. What has changed, rather drastically is the means of enabling this 250 MB limit for an Excel file, which we’ll get to below.

Let’s have a look at some of the major differences between Power BI “V2” and Power BI “V1” as it pertains to Excel.

More than just pretty face

In V1, Excel was almost always used as a means of presenting data. It was possible to use Excel as a data source, but the presentation of that data would inevitably be through another Excel file.

In V2, Excel is far more commonly used as a data source. The Power BI Designer, or the web interface can connect to Excel files, and then import the data into data models stored in the service. Native Power BI visualizations are then used for data presentation. These Excel files can be local, in OneDrive, or in OneDrive for Business. If the files are in OneDrive, or OneDrive for Business, the data models can be automatically refreshed when the source workbooks change. However, we can continue to work with Excel as a presentation mechanism.

If the Excel file is stored in OneDrive for Business (as opposed to OneDrive), you can connect the Power BI interface to the workbook in place. Once connected, the workbook can be viewed and interacted with through Excel Services right within the Power BI interface. To do this, from the dashboard, select “Get Data”, choose Files, select the OneDrive for Business option, select the desired workbook, and finally, select the “Connect” button”. You will then be presented with two options.

Importing data from Excel vs connecting to a workbook in place

Note that these two options will ONLY appear if the repository for the workbook is OneDrive for Business. Selecting “Import” will import the data contained in the file into a service based data model. This is the operation that will occur for all other repositories, and it uses Excel as a data source. Selecting “Connect” however connects to the workbook in place, and it adds the workbook as a report to the Power BI user interface.

Excel Workbook in the Power BI section

The workbook appears in the Reports section with a small Excel icon beside it. To view the workbook in Excel Services, click on the ellipsis to the right of the name, and select View. The workbook should load in a new window in full fidelity, and allow interaction.

No artifacts are created in the Datasets or Dashboards sections, the workbook is a self-contained unit. This is important because personal sharing can only be done through Dashboards. Therefore, Excel Services based reports can only be shared through Office 365 Groups.

Office 365 Groups

One of the biggest changes that Power BI “V1” users will notice (and need to deal with) is the fact that in order to work with Power BI in the same manner, Excel workbooks MUST be stored in a OneDrive for Business repository. With “V1”, a workbook could be stored within any SharePoint Online repository, and enabled for use with Power BI. Therefore, all “V1” users will need to move these workbooks into OneDrive repositories before the “V1” service is deprecate (Dec 31, 2015).

Every Office 365 user gets a OneDrive repository by default, and these repositories work just fine for personal use, but most current users that are using SharePoint Online will have workbooks in shared libraries in a collaboration environment. Office 365 Groups also each have their own OneDrive for Business repository, and all group members have access to that repository. Power BI V2 fully supports Groups, so this is the logical place to store all of the “V1” workbooks for collaboration purposes.

Connecting to a workbook in a Groups OneDrive is identical to the process above, but first, you need to navigate to the Group’s context in the Power BI UI. This is done by clicking on the My Workspace button (and not entirely obvious).

Groups Selector

Simply select the group and you will be working in that Group’s context.

Increased File Size

One of the biggest benefits of the original Power BI “V1” was the ability to work with workbooks that contained data models larger than 10 MB. The way that this was done was by “enabling” the workbook for Power BI as I outlined in this article. The enablement mechanism no longer exists, but the benefits are still there. The process of connecting a workbook outlined above intrinsically enables the workbook for Power BI and increases the maximum model size from 10 MB to 250 MB. In addition, the workbook can also be opened and interacted with directly from OneDrive. It is also possible to share that workbook with others in your organization, but in order to open those large workbooks in a browser, those users will need a Power BI licence.

Workbook Refresh

The ability to refresh workbooks stored in the cloud from data stored on-premises was, at the time of its introduction, the most important feature of Power BI. Each workbook would be enabled for refresh through an administrative interface, and if the data source matched a registered data source, the service would call an on-premises Data Management Gateway, which would facilitate the refresh of the workbook. This worked, but was somewhat difficult to get set up. Power BI “V2” has simplified the process tremendously. This does however mean that the process has changed.

Most refreshes are performed on a Dataset, but as mentioned above, an Excel report is self-contained, so the refresh options are available from the report itself. Simply click on the ellipsis to the right of the report and select “Schedule Refresh”.



Report refresh options

The first time this is done, you will need to enter the Data Source credentials. Once entered these will be used for subsequent refreshes. This will also need to be completed before the report can be refreshed on demand. You can also schedule the refresh time here.

If the data source is a supported cloud source, no further steps will be necessary. If it is on-premises, then it is necessary to install the Power BI Personal Gateway. This does not require administrative permissions to run (although it’s best if you do), nor does it require any special permission on the service side. The Personal Gateway is meant to be just that – personal and easy to use.

Hopefully this covers most of the major differences of working with Excel workbooks in V2 of the service. It looks like a big change, but most things are still possible, and some things significantly enhanced.

Where Did Power View Go in Excel 2016?

If you’ve been using the Excel 2016 Preview or just Excel 2016 (depending on when you read this), you may have noticed that there is no longer an option to insert a Power View report into o workbook. The reason is that it has been removed from the default ribbon in Excel 2016. It used to be on the Insert tab in the Reports Section, right beside Power Map.

Power View in Excel 2013

However, opening the Insert tab in Excel 2016 reveals it to be missing.

Power View Missing in Excel 2016

Did Microsoft remove Power View from Excel? What’s going on? Power View is still very much a part of Excel; the only change is that now it is no longer a default ribbon option. The good news is that it’s simple enough to add it back in. To do so, we need to edit the ribbon. Click on File-Options, and then select Advanced Options. The ribbon editor will appear. We can add Power View to any tab that we would like, or even create a new one, but here we’re just going to add it back to the Insert menu. To do so, expand the Insert menu. Each command must be added to a group, so we need to click the “New Group” button. Next, because I don’t think anyone will want their group named “New Group”, we want to rename it. In this case, we’ll rename it to “Reports”, the way that it used to be.

Adding a new group to the Insert tab

Next, we need to add Power View into the group. The easiest way to do this is to select “Commands Not in the Ribbon” from the “Choose commands from” dropdown. It’s a long list of items to choose from, and you’ll be tempted to look under “P” for Power View. You will be disappointed. The correct command is actually to be found in the “I”s, and it is “Insert a Power View Report”. Select that option, and click the “Add” button.

Once this is complete, Power View should once again appear in the Insert tab, in the Reports section.

I have no idea why Power View has been removed from the ribbon by default. It may just be temporary given that we’re not yet at release, but it could signal some other change. In any event, if you work with both Power View and Excel 2016, you can continue to do so.