
The Power BI Premium Pricing Model – The Good, The Bad and the Ugly

On May 3, Microsoft announced sweeping changes to the pricing of Power BI by introducing a new “Power BI Premium” SKU. The announcement itself can be found here, and there are a number of other related resources worth reviewing that I am listing here for convenience:

Distribute to large audiences with Power BI apps
Changes to Power BI embedded
Power BI Premium White Paper
Power BI Pricing
Premium pricing calculator
Introducing Power BI Report Server

Power BI Premium is intended to address deficiencies in the current pricing model primarily with respect to sharing content. In my opinion, the new model succeeds in this goal for the most part, but it leaves a significant number of customers behind, and it also leaves many unanswered questions. These problems need to be addressed for Power BI to succeed in its goal of bringing BI to the masses. Overall, I like what Microsoft is trying to do with this new pricing model, and with a few tweaks, I think that it can resonate.

First, we need to understand the new model, and to do that, we need to understand the former model and the need for Premium. Given that the former model (consisting of free and Pro licenses) has not been replaced (although it is changing significantly), we will refer to it as the original model, and when Premium is added to it, we will refer to that as the Premium model. The original model is still completely relevant moving forward.

The original model and the need for change

The original model is relatively simple, and relatively unique to the industry. Power BI users are licensed for either free or Pro features. If a report contains any Pro capabilities, any consuming user requires a Pro license. A free user can create a report that uses Pro features, but that same user will not be able to consume that report in the free service. This is a very important distinction to understand. The author of a report (using Power BI Desktop) could be a free user, use a Pro feature, and after deploying the report to the service, be unable to use it in the service.

The difference between free and Pro from a feature standpoint is no longer (as of this writing) available on the Power BI pricing page; however, prior to June 1, 2017, it was as listed below.

Therefore, if a report is configured to be refreshed more than once per day, or even if the time of day is specified, or if the report uses on-premises data, then all users accessing that report require a Pro license. Given that Power BI is all about bringing Business Intelligence to the masses, when each one of those masses needs to pay $10/month, it tends to constrain adoption, particularly if a report’s audience is organization wide, and you are in a very large organization.

Report sharing is also relatively limited. Reports can be shared anonymously, which is insecure. Dashboards and their constituent reports can be shared either internally or externally, but they are read only. Finally, both dashboards and reports can be shared through Group workspaces (now app workspaces). Currently, Group workspaces do not allow for external sharing, but they are the preferred means of sharing. However, they too require Pro licenses, which constrains adoption. For the free user, anonymous and dashboard sharing are the only real options.

New model

The introduction of Power BI Premium aims to solve some of the sharing issues listed above, and therefore to help drive adoption. Premium capacity is an add-on to a Power BI tenant (organization), and is different from free or Pro licenses, which are assigned to users. An organization can purchase Premium capacity, and then a Pro user (this is restricted to Pro) can move or publish content to the Premium capacity. Once the content is in Premium storage, all users can utilize all the features in the dashboards and reports. Premium effectively removes all feature barriers from the reports.

Premium storage also brings many performance enhancements, such as the ability to refresh data up to 48 times per day (vs the previous 8), and the effective removal of data model size limits.

Without Premium, there are also several changes to the original licensing model. According to the May 3 Announcement FAQ on the Power BI community site:

“Beginning June 1, the free service will have capabilities equivalent to Power BI Pro. This includes the same 1 GB workbook size limit, up to 8 daily scheduled refreshes for datasets, and maximum 1 million rows/hour streaming data rate. We’re also providing access to all data sources, including those available through the on-premises data gateway. Peer-to-peer dashboard sharing, group workspaces (now called app workspaces), export to PowerPoint, export to CSV/Excel, and analyze in Excel with Power BI apps are capabilities limited to Power BI Pro.”

Therefore, after June 1, 2017, Pro features are effectively an addition to the free features, and the feature differences should be as below:

From the May 3, 2017 announcement:

“Going forward, we will improve the free service to have the same functionality as Power BI Pro, but will limit sharing and collaboration features to only Power BI Pro users.”

The only features that Pro will have that free will not are those that are related to sharing. The above feature list reflects that.

Power BI Embedded

Power BI Embedded is the way that developers can embed Power BI in their applications. Until now, developers using Power BI Embedded have built reports, deployed them to their Azure instance, and called them from their applications. End users do not need any sort of Power BI license, and the developers are charged per report “render session”. This charging model has been one of the criticisms of Power BI Embedded, in that it is very difficult to predict costs. ISVs are at the mercy of the end users viewing reports, and any measure that is put into place to curb these render sessions is by definition a disincentive to adoption.

The fact that Embedded runs in a different namespace than the core Power BI service is another criticism, leading to differences between the capabilities of Power BI Embedded and the core Power BI service. For example, the current iteration of Power BI Embedded cannot use the On-Premises Data Gateway, which can be quite restrictive.

Power BI Embedded is changing to use the new Premium capacity model. ISVs will purchase Premium capacity, and serve reports to their end users from that space. There will only be a single namespace for all Power BI content.

What’s Good

Power BI Premium solves the sharing problem for organizations that want to distribute their BI assets across the organization. For organizations that access on-premises data, a key feature of Power BI for enterprises, the Pro license requirement has discouraged adoption. With Premium capacity, a report publisher can share content with as many users as necessary without worrying about licensing the target users. Even better, those target users can be external, further extending the reach of that content.

For large enterprises, this has the potential to turn Power BI from a niche solution to something that is used by everyone.

The changes to the original model also make things clearer for report designers and publishers. These publishers can work with the full range of Power BI features while the report is being built, and while they are themselves using it. When it comes time to share the report with a wider audience, they can publish it to Premium capacity where anyone can access it. If the organization has not purchased Premium, then the original model applies, and all recipients will still require a Pro license.

On the Power BI Embedded side, switching to Premium capacity completely eliminates the unpredictability of the current model. The fact that the reports will be rendered from the core Power BI service means that it will be fully on par with other Power BI reports, and developers will be able to take advantage of the full spectrum of Power BI features as they appear in the service.

What’s not so good

If you are a large company, there is very little not to like with this new model. It was large organizations that felt most of the pain with the original model, and it is they that benefit most from the Premium model. In fact, in my opinion, they are the only ones that benefit from the Premium model. Well, they and organizations that have no sharing requirements. The issue here is cost.

The Premium pricing estimator can be found online, but at present, it boils down to this. The smallest block of capacity that can be purchased by an organization is “P1”. To publish content to Premium capacity, you must also have a Power BI Pro license. Therefore, the minimum cost of entry is $4,995 (P1) plus $9.99 (Pro) for a total of $5,004.99 per month. This is well out of the reach of most small to medium sized organizations. In fact, an organization needs to be larger than 500 users (and those would be active Power BI users) for Premium to begin to make sense from a licensing perspective. The model size limit removal and the increased refresh frequency are also compelling reasons to move to Premium, but it’s easy to see that Premium is only for larger organizations.
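
The break-even math is simple: $4,995 ÷ $9.99 works out to roughly 500 Pro licenses per month. Below that number of active users it is cheaper to simply buy everyone a Pro license; above it, Premium capacity starts to pay for itself.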

Compounding this issue for small to medium sized organizations is the fact that as of June 30, dashboard sharing has been removed from the free SKU of the original pricing model. Any dashboards that had previously been shared broadly to free users will cease to function as of the cut-off date. If Premium does not make sense for these organizations, then they do have the option of purchasing Pro for the consumer. To help ease this transition, Microsoft is offering a year’s worth of Power BI Pro to all active free users that signed up prior to May 3, 2017.

However, dashboards can be shared with external users, and it’s a pretty tall order to expect an external user to subscribe to Pro just to be able to read your report.

With Power BI Embedded switching to the Premium model, the ISV now needs to buy Premium capacity. Given that the entry price for Premium is so high, it is (in my opinion) out of reach of most of the services that would rely on it, not to mention those developers that simply want to get up to speed on it or do some testing. There have recently been some indications on the forums that the barrier to entry won’t be as high for developers, but even a figure as low as $600/month may still be too high for many to swallow.

Conclusions

Overall, I think that the Premium pricing model solves a problem that desperately needed to be solved. This approach opens the door to Power BI truly democratizing Business Intelligence and becoming almost as ubiquitous as Excel. The opening up of features to the free SKU and focusing the Pro SKU on sharing means less confusion for report designers.

Unfortunately, for the moment, price stands in the way of that goal for many small to medium sized businesses. These businesses may be small in stature, but they are many in number. The removal of sharing from the free SKU actually represents a step backward for them. The floodgates have been opened for large businesses, but the stream has been dammed for smaller ones.

Fortunately, pricing is a simple problem to solve. My hope is that the entry point for Premium comes down to something that would make sense for even a 10-person company, and that the cost for developers using Embedded could scale with far more elasticity, starting at $0 to encourage investment. These changes would, in my opinion, truly set the stage for Power BI dominance.

Use Power BI to Help Manage Your SharePoint Sites

Note – This article first appeared on April 12, 2017 on the Microsoft Partner Network

When it comes to Business Intelligence, SharePoint is most often used as a platform to access dashboards and reports. With the recent availability of the Power BI web part, Power BI joins SQL Server Reporting Services and Excel as a go-to reporting tool within SharePoint.

Occasionally, list data is used as a data source for these reports. This doesn’t work for large amounts of data, but for smaller lists, this is perfectly adequate. Given that Power BI has native connectors for both SharePoint lists and libraries, it is perfectly suited for this sort of task. Combining these two results in some interesting possibilities, as the following article demonstrates.

We work extensively with modern Groups in Office 365. Each group gets its own SharePoint site, and within that, its own OneDrive, or “Shared Documents” library. Depending on the usage of the group, the storage in that library can grow quickly, and it’s not always easy to spot where all the content is being stored. By building a Power BI report that uses the OneDrive as a data source, we can create a report of storage allocation by file and folder, and then show that report on the home page of the SharePoint site.

There are several steps to building this report. It all starts with Power BI Desktop.

Get the Data

To start with, we launch Power BI Desktop and select “Get Data”. Then we select the “SharePoint Folder” file source, and enter the URL of the SharePoint site. Even though we are prompted for the URL of the folder, we must enter the URL of the site itself. The query editor can be used later to filter out any unwanted folders. Only user created document libraries and folders will be returned.

The query will return a number of columns that are irrelevant to this report, and they can be removed. We need to create a column for the URL to the files themselves. The attributes column can be expanded to get the size of any files in bytes. We also use the split function to split the folder path by the “\” delimiter which will allow us to create a folder hierarchy. Finally, we set the appropriate data types on columns, and give them user friendly names.

The scope of this article does not allow for a complete step by step walkthrough of the query editor, but the code below can be pasted into the advanced editor (after replacing the URLs appropriately).

let
  Source = SharePoint.Files("https://yoursharepointsiteurl", [ApiVersion = 15]),
  #"Removed Columns" = Table.RemoveColumns(Source,{"Content"}),
  #"Added Custom" = Table.AddColumn(#"Removed Columns", "Folder", each [Folder Path]),
  #"Added Custom1" = Table.AddColumn(#"Added Custom", "URL", each [Folder Path] & [Name]),
  #"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Folder Path"}),
  #"Replaced Value" = Table.ReplaceValue(#"Removed Columns1","https://unlimitedviz.sharepoint.com/sites/Presentations/Shared Documents/","",Replacer.ReplaceText,{"Folder"}),
  #"Renamed Columns" = Table.RenameColumns(#"Replaced Value",{{"Folder", "FolderBase"}}),
  #"Added Custom2" = Table.AddColumn(#"Renamed Columns", "Custom", each Text.Trim([FolderBase],"/")),
  #"Renamed Columns1" = Table.RenameColumns(#"Added Custom2",{{"Custom", "Folder"}}),
  #"Removed Columns2" = Table.RemoveColumns(#"Renamed Columns1",{"FolderBase", "Date accessed"}),
  #"Expanded Attributes" = Table.ExpandRecordColumn(#"Removed Columns2", "Attributes", {"Size"}, {"Size"}),
  #"Changed Type" = Table.TransformColumnTypes(#"Expanded Attributes",{{"Size", Int64.Type}}),
  #"Renamed Columns2" = Table.RenameColumns(#"Changed Type",{{"Size", "Size (bytes)"}}),
  #"Added Custom3" = Table.AddColumn(#"Renamed Columns2", "Size (KB)", each [#"Size (bytes)"] /1024),
  #"Changed Type1" = Table.TransformColumnTypes(#"Added Custom3",{{"Size (KB)", type number}}),
  #"Added Custom4" = Table.AddColumn(#"Changed Type1", "Size (MB)", each [#"Size (KB)"] /1024),
  #"Changed Type2" = Table.TransformColumnTypes(#"Added Custom4",{{"Size (MB)", type number}, {"Date created", type datetime}, {"Date modified", type datetime}}),
  #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type2","Folder",Splitter.SplitTextByDelimiter("/", QuoteStyle.Csv),{"Folder.1", "Folder.2", "Folder.3"}),
  #"Changed Type3" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Folder.1", type text}, {"Folder.2", type text}}),
  #"Renamed Columns3" = Table.RenameColumns(#"Changed Type3",{{"Folder.1", "Folder"}, {"Folder.2", "Subfolder 1"}, {"Folder.3", "Subfolder 2"}})
 in
  #"Renamed Columns3"

Build the Report

When the query is complete, we load the data into the model. We don’t need to do a lot of model editing for this report; it’s relatively straightforward. There is only one table, and the Date Created field gives us enough time intelligence that we don’t need to create a date table. There are two edits to the model that I used that bear mention.

One thing that I wanted to show was the accumulation of storage over time. With the size of the file and the create date, I could show the total size that was added for a given day, month or year, but that doesn’t show the accumulation. To do that we need to create a calculated measure, “Cumulative Size”. The formula below calculates a running total of file size based on date:

Cumulative Size (MB) =
 CALCULATE (
  SUM ( Files[Size (MB)] ),
  FILTER (
  ALL ( Files[Date modified] ),
  Files[Date modified] <= MAX ( Files[Date modified] )
  )
 )

It’s not strictly necessary, but it’s convenient to create a folder hierarchy by dragging Subfolder 1 onto Folder, and then dragging Subfolder 2 to the bottom of it. That allows all levels of the folder hierarchy to be managed as one.

Finally, we add our visual elements to the report. The report itself can be seen above. In this case, the Size by Folder chart uses the folder hierarchy as the x axis so that clicking on a data bar (while in drill down mode) will open a lower level folder. Marking the data category of the URL field will cause the report to display a clickable URL in any tabular visuals, and setting the “URL icon” property (in the Values section) of the table will display a link icon instead of the long URL. Doing this will allow the user to open any of these files directly from the report. The Growth Rate chart uses the Cumulative Size calculated measure created above.

Embed the Report

Once completed, we publish the report into Power BI. It is important to select the correct workspace for this. Since we will be embedding the report into a SharePoint page, it is important to ensure that all viewers will have access to the report. By publishing the report to the same Power BI workspace that is used by the SharePoint site in question, this will be automatic. In this case, we are reporting against the “Presentations” team site that is associated with the “Presentations” group, so we publish this report to the “Presentations” Power BI workspace.

Once published, we need to get the embed URL for SharePoint. This can be determined by opening the report in Power BI, and selecting File – Embed in SharePoint Online.

Once we have the URL, we navigate to the SharePoint site and edit the home page (note – the home page needs to be a modern SharePoint page). Once in edit mode we add a new web part, and select the Power BI web part. When prompted, we enter the embed code retrieved above. Once the page is published, all is complete.

Finally, the data source in Power BI will need to be set up to refresh on the frequency required.

With a few simple steps, we have not only gained insights into the storage patterns of our team sites, but we have made those insights available to all members of the site in a highly interactive fashion, without making them open another application.

The Difference Between Reporting and Analytics is 42

In his novel “The Hitchhiker’s Guide to the Galaxy”, Douglas Adams envisioned a giant supercomputer named “Deep Thought” that was built to solve the answer to the ultimate question of life, the universe and everything. For the 5 people out there that are unfamiliar with the story, I’ll relate the important bits here. Deep Thought was commissioned by a race of pan-dimensional beings and required seven and a half million years to complete its calculations. When it was finally complete, Deep Thought informed the ancestors of the original creators that the answer was 42. The receivers were understandably disappointed with this response, and when they questioned Deep Thought further, the computer postulated that perhaps the problem was that they never really knew what the question was.

Undeterred, the race then commissioned a second computer (which happened to be the Earth) that would calculate the ultimate question. After a couple of 10 million year attempts, the ultimate question was determined to be “What do you get when you multiply six by nine”. Of course, Adams never claimed that the universe made sense.

To my mind, this is an excellent demonstration of the difference between reporting and analytics. The accurate answer (report) provided a result, but not meaning. Further analytics were necessary to determine context.

Like many information technology terms (Big Data, machine learning, CRM), Business Intelligence (BI) is one of those umbrella terms that many people use regularly without fully understanding its meaning. BI is comprised of many tools that help to glean information and insights from raw data. Thus, an ETL package that moves data from one location to another is just as much a BI tool as is a fancy looking infographic. Combine this lack of clarity with the overloading of the term “reporting”, and we wind up with some real confusion in this space.

Reporting is the process of using data to highlight things or trends that have already happened. This can be contrasted with monitoring, which does the same for things that are happening now, and predictive analytics, which tries to predict what will happen in the future based on the same data. The difference between reporting and monitoring is only one of data latency, and as such, monitoring is often referred to as real time reporting, which further muddies the water. However, for the purposes of this article, I want to focus on historical reporting.

Reports are typically one of two types, either operational or analytical. Tools that are good at producing one type are typically not so good at producing the other. What’s the difference? Operational reports are designed to provide information that we know we need, and analytical reports are designed to help us discover things that we didn’t know, or to help answer unanticipated questions. Operational reports are typically designed to be printed. They are typically well paginated, pixel perfect, and provide a single view of the data within any given report. Analytical reports are just the opposite. They are designed with visuals as a starting point, but allow for the ability to pivot on or drill down into the data as appropriate to answer ad-hoc questions. Printing is typically a weakness for analytical reports, whereas drilldown is a weakness for operational reports.

Both report types have their place, but they have very different design points. The data that backs an operational report should ideally be relatively flat, as that best reflects the report layout and helps with performance. Conversely, cubes and data models exist simply because a flat data structure does not adequately support analytical reporting. With analytical reporting, a user may at any point decide to view quantitative data (a measure) through the lens of a different facet (dimension). This difference is so great that we need a different type of engine to support it. OLAP cubes and tabular models are both examples of this.

Another difference is the data that is necessary to support both report types. Operational reports tend to concern themselves with various levels of subtotals per the predefined facets. In a case like that, the data mart that backs the report only needs to store those subtotals. The granularity, or resolution, of the data stored in the data mart does not need to exceed that of the report that references it. Analytical reporting is different. Since users will be expected to drill down on data, from one dimension to another, or to filter the data according to increasingly granular facets, it is critical to store all of the data in the data mart backing the data model. We don’t know the level of resolution the analyst will need; therefore, all detail is required.

As a simple example of this, consider the case where we want to analyze some server log data over a period of time. We can pre-aggregate the data in the data model such that it stores daily totals of the various types of log entries. There would need to be a total for each dimension, but the overall data storage would be less than for the raw data. Such data would allow an analyst to spot trends over several days, but the decrease in resolution means that it will be impossible to spot any usage trends within a given day. If intra-day trends will never be necessary, then this doesn’t matter, but the nature of analytical reports means that the designer can never be sure.

The more that the source data for the report is pre-aggregated, the less that report becomes analytical in nature, and the more it approaches operational. This is regardless of the tool used; you can build either report type with any tool, it’s just that it may not be optimal.

The issue here is one of semantics. Semantics, however, are important in knowing what you are getting if reports are being provided to you. Calling something “Analytics” does not make it so. If you spin up a content pack in Power BI, and find that the underlying data model provides just enough dimensions and measures to construct the provided report, and that you can’t deconstruct the data in any meaningful way, what you have is a report, not analytics, no matter what the platform. As with anything, there is a trade-off between complexity and power. Given the nuances of this topic, it’s important to look under the hood to know what you are getting.

The answer “42” is perfectly acceptable if you already knew that the question was “what is 6×9?”. But if you want to know why, that takes a little more digging. You’d also know that there might be a data problem…

Analyzing Your WordPress Site with Power BI and Google Analytics

I was recently asked by Christian Buckley what my top 2016 blog posts were. No problem, I thought; I just went back to my output for the past year, pulled out the posts that I knew had generated a lot of discussion or impact, and forwarded them on. At that point he asked how many views each of those pages had received. Being a data guy, I suddenly felt like the shoemaker noticing that his children had been going barefoot.

I monitor my blog traffic with the built-in WordPress JetPack tools, StatCounter, and Google Analytics. They all work slightly differently, with StatCounter and JetPack being the most alike. I tend to rely on StatCounter for immediate stats (how many hits today, what’s popular today) and the Google stats for a longer time frame. StatCounter doesn’t persist my stats beyond a day, as I don’t have the pro version, and the JetPack stats don’t seem very extensible. Google Analytics seemed like the best place to begin, particularly because there is a pre-existing content pack for Power BI.

The Google Analytics Content Pack

I have used the Google Analytics (GA) content pack casually and for demonstration purposes since it was introduced with the Power BI launch in July 2015. It hasn’t changed much. Actually, as far as I can tell, it hasn’t changed at all. To use the content pack, you simply log into the Power BI service, select “Get Data”, select the “Services” tile, and select Google Analytics.

After you enter your credentials using OAuth2, Power BI will import your GA data into a data model and populate a pre-configured report. The report consists of several pages, mostly focused on visitors to the site.

There are some interesting visuals out of the box, and there are more metrics available in the data model if you want to customize the out of the box reports. At the moment, any customizations that are made in this way are not portable, and with the content pack, data is only retained for 180 days, which means that year over year comparisons are not possible. The visuals don’t appear to have been updated since initial release, which means that many of the new Power BI UI enhancements are not there, but they too can be added through customization.

Generally, if you’re going to do a lot of customization, the best tool to use is Power BI Desktop. Reports can then be reused easily and are highly portable. Luckily, in addition to the content pack, Google Analytics also exists as a data source for Power BI Desktop.

Using the Google Analytics Data Source in Power BI Desktop

When Power BI Desktop imports data from GA, it imports all the data that GA has. There seems to be no agreement on how long Google will retain this data, but in practice, GA seems to retain all data since it was originally configured. In my case, that’s a little over two years now, which is fine for my analysis. The first step is to connect to and import the correct data. Start Power BI Desktop, select “Get Data”, choose the Online Services tab and choose “Google Analytics”.

Once you authenticate, you’ll be presented with all of the sites that are monitored by Google Analytics. You’ll want to drill down and open “All Web Site Data”. GA captures an awful lot of information, and the trick is to know what to grab. Grabbing everything won’t work, as the connector only allows for 8 dimensions and measures in a single import. In my case, I am interested in the PageViews and Unique PageViews measures, and the Page, Page Title and Landing Page dimensions (under the “Page Tracking” section). In addition, I want the Date, Hour, and Minute dimensions from the Time section.

Once selected, we select OK, and edit the query, giving it a good name like “GA Data”. Finally, we select “Close and Apply” and the data will be added. This procedure can take a little while depending on the quantity of data.

Once loaded, we need to do a little bit of work in the data model. We imported the dates from GA, but we’ll want to do year/month/day drilldowns, as well as use textual values for month names, day names, etc. For that, the tried and true method has been to build a date table. Power BI itself will actually do some of this automatically for you behind the scenes, but a custom table gives us the ultimate in flexibility. DAX (the Power BI modelling language) makes this very easy. We create a new table by first selecting the “Modeling” tab, and then the New Table button. This allows us to create a calculated table in the formula bar. First, change the name from “Table” to something meaningful like “View Dates”, and then add the following formula:

ADDCOLUMNS (
 CALENDAR (DATE(2010,1,1), DATE(2025,12,31)),
 "Date As Integer", FORMAT ( [Date], "YYYYMMDD" ),
 "Year", YEAR ( [Date] ),
 "Month Number", FORMAT ( [Date], "MM" ),
 "Year Month Number", FORMAT ( [Date], "YYYY/MM" ),
 "Year Month Short", FORMAT ( [Date], "YYYY/mmm" ),
 "Month Name Short", FORMAT ( [Date], "mmm" ),
 "Month Name Long", FORMAT ( [Date], "mmmm" ),
 "Day Of Week Number", WEEKDAY ( [Date] ),
 "Day Of Week", FORMAT ( [Date], "dddd" ),
 "Day Of Week Short", FORMAT ( [Date], "ddd" ),
 "Quarter", "Q" & FORMAT ( [Date], "Q" ),
 "Year Quarter", FORMAT ( [Date], "YYYY" ) & "/Q" & FORMAT ( [Date], "Q" )
 )
 

Adjust the beginning and end dates to suit the data in question, click the check mark, and voila, instant date table. There will be a record for every date between the beginning and end dates. It’s a good idea to adjust the properties of some of the resultant columns for display; we want to sort the Month Name Long and Month Name Short columns by Month Number, and the Day Of Week column by the Day Of Week Number column. Any additional customizations can be made as necessary.

The next step is to establish the relationship between the Date column in the GA table and the Date field in the new calculated date table. Simply click on the relationship builder icon, then drag and drop the Date column from one table to the corresponding column on the other.

At this point, we can create a visual that shows traffic over time. We create a column chart and add Pageviews as the Value, then we add Year Month Short (which should be sorted by Year Month Number) to the axis, and we should see all site traffic over time. Adding Date to the axis and stripping out all the hierarchy levels except Day allows us to drill down on days for a selected month.

Although we can see our site traffic by month, we still can’t answer Christian’s original question, which was “what were the most frequently viewed posts written in 2016”. Google Analytics has no clue when the pages were created. It’s possible to try to infer it from the earliest viewed date for a given page, but the created date is available directly in WordPress. We just need to get the WordPress data into the data model. Thankfully, that is possible through the WordPress REST add-on.

Using the WordPress REST Add-On

REST support is available for WordPress as an add-on. The “WP REST API” is available in the add-on catalog, and on Github here. Once installed, all WordPress content (including posts) is available through a simple http GET request. This is something that’s fully supported by Power BI, and therefore all the relevant post data can be loaded into Power BI through this add-on.

From the Power BI Home tab, select Get Data, then “Web”, and then use the URL required to retrieve posts. For the blog that you’re reading, it’s https://whitepages.unlimitedviz.com/wp-json/wp/v2/posts. The query will return a list of records. However, there will only be as many records as WordPress shows by default. We need all of them. The add-on allows you to specify the number of posts per page by adding the “per_page” parameter. Therefore, in our case, it’s https://whitepages.unlimitedviz.com/wp-json/wp/v2/posts?per_page=50 where 50 is the desired number of items per page.

The per_page parameter is all that you need if the number of posts to analyze is fewer than 100, but that is the limit of this parameter. There is another parameter that can be added to the query, page=, that specifies the page number. With this and the posts per page parameter, it’s possible to get all the posts. There are a couple of ways to implement this in Power BI.

The ideal way is to use an “M” function. With a function, you build up a query normally, and then you wrap it in another parameterized query using the advanced editor, passing in the page number as a parameter that is used in the underlying query. The function can then be called from each record of another table, thereby returning all the posts, which is exactly what we need. This approach works perfectly well in Power BI Desktop. Unfortunately, once the model and report are deployed to the Power BI service, it stops working. The Power BI service currently cannot refresh any query that uses replaceable parameters as part of the query.
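
As an illustration only, such a function would look roughly like the following (the function and step names here are mine, and the URL is this blog’s posts endpoint):

let
  // Hypothetical paging function: fetch one page of up to 100 posts from the WordPress REST API
  GetPostsPage = (pageNumber as number) as table =>
    let
      Source = Json.Document(Web.Contents("https://whitepages.unlimitedviz.com/wp-json/wp/v2/posts?per_page=100&page=" & Number.ToText(pageNumber))),
      // Convert the returned list of post records into a single-column table for later expansion
      AsTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error)
    in
      AsTable
in
  GetPostsPage

It is the dynamically constructed Web.Contents URL in a sketch like this that the service refuses to refresh.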

The other way that this can be handled is to generate multiple queries that explicitly use the page= parameter. The number of queries necessary will be equal to the number of posts divided by 100, rounded up to the next whole number. In my case, I have 230 posts, and therefore need 3 queries. Once created, all 3 queries can be merged into a single table. This approach is messy, and will require occasional maintenance, but it’s the only one that works for now. Let’s walk through the process.

We’ll start with the first query. As above, we use Get Data, select the Web source and enter the URL for page 1 and 100 posts per page. For this blog the URL is https://whitepages.unlimitedviz.com/wp-json/wp/v2/posts?page=1&per_page=100. The query should show a list of 100 records. Next, we need to turn the list into a table so that it can be expanded. Click the “To Table” button in the ribbon.

Click OK to accept the defaults, and then click the small expand button in the column header (Column1). Be sure to deselect the “Use original column name as prefix” before clicking OK.

At this point, all the post metadata from WordPress should be available. You can choose to keep all or only some of the columns, but the ones that we want to be sure to keep are date, slug, and title. Title needs to be expanded, so we should go ahead and do that – the procedure is the same as the step above, except that only the “rendered” field of the title record is returned. It’s a good idea to rename it to Title. Also, it’s a good idea to set the data type of the date field to Date/Time here.
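
For reference, the finished query ends up looking something like the following in the advanced editor (a sketch only – the exact steps depend on which columns you chose to keep; the link column is included here because it is used later):

let
  Source = Json.Document(Web.Contents("https://whitepages.unlimitedviz.com/wp-json/wp/v2/posts?page=1&per_page=100")),
  // Turn the list of post records into a table, then expand the fields we care about
  #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"date", "link", "slug", "title"}, {"date", "link", "slug", "title"}),
  // The title field is itself a record; keep only its "rendered" value and call it Title
  #"Expanded title" = Table.ExpandRecordColumn(#"Expanded Column1", "title", {"rendered"}, {"Title"}),
  #"Changed Type" = Table.TransformColumnTypes(#"Expanded title",{{"date", type datetime}})
in
  #"Changed Type"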

Once the query is the way we want it, we’ll want to name it something like “Posts1-100”, and then we need to set its data load properties so that it does not load into the report. We don’t want this query to load data into the report because it will only be one of three merge sources, and we don’t want to store the data redundantly. To do that, we right click on the query, select properties, and deselect “Enable load to report”. Then click OK.

We now need to duplicate this query for page 2. The easiest way to do this is by copying all the M script generated by the query builder into a new blank query, and then editing it. From the Home tab, we click on “Advanced Editor”, then select and copy all the text in the dialog box. We then close the dialog box, then select New Source – Blank Query. Once opened, we again select “Advanced Editor”, remove the default content and paste the copied text into the box. Finally, “page=1” in the URL is replaced with “page=2”.

We then save the query, name it and set the properties not to load as with the first query. We then repeat all these steps for page 3. At this point we are ready to merge the queries into our “master” query.

To merge the three queries into one, we select the “Append Queries” dropdown from the ribbon, and select “Append Queries as New”. We then select “Three or more tables”, add the three tables and select OK. Finally, we give this new query the name “Posts”, but this time we do not prevent the data from loading. This is our master table. At this point, we are ready to Close and Apply, and return to the main design surface.
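
Under the covers, “Append Queries as New” simply generates a Table.Combine call over the three source queries, roughly as follows (assuming the second and third queries were named Posts101-200 and Posts201-300):

let
  // Stack the three page queries into a single Posts table
  Source = Table.Combine({#"Posts1-100", #"Posts101-200", #"Posts201-300"})
in
  Source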

This Posts table has a Date column, but it’s actually a Date/Time column. To use a date table, we need to create a new calculated column with just the date portion. With the Posts table selected, we select the Modeling tab, and then “New Column”. We then give the column a name (PostDate) and a formula based on the Date column that strips out the time portion.
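
A minimal version of that formula (assuming the imported column kept the name date) looks like this:

PostDate = DATE ( YEAR ( Posts[date] ), MONTH ( Posts[date] ), DAY ( Posts[date] ) )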

We also want a calculated measure to indicate the number of posts. The process is like that for a new column. We select “New Measure”, and add the following formula to the formula bar:

Posts = CountA(Posts[id])

We will be relating records in the Posts table to records in the GA table, so we need another date table to keep the relationships clean. We could calculate another table as we did above, but it’s even easier to calculate the new one based on the one already created. We simply select “New Table” and use the following formula:

PostDates = ViewDates

Next, we create the relationship between the Posts table and the PostDates table the same way that we did it for the GA table above. Now that both tables are date sliceable, we need to relate them together. In the Posts table, the Link column uniquely identifies the page, but the GA table uses the relative address of the page in the Landing Page column. In our case the solution is simple: we need to prepend the main part of the site address in question (in our case https://whitepages.unlimitedviz.com) to the Landing Page. We do that by creating a new column, URL, with the following formula:

URL = "https://whitepages.unlimitedviz.com" & 'Google Analytics Views'[Landing Page]

Finally, we relate the URL column in the GA table to the Link column in the Posts table.

At this point the model is ready for use in reports.

Building a Report

How to build a report is not the focus of this article, so I’ll just explain the steps taken here. To prepare our data model, we first need to flag the Link column in the Posts table as a URL field. To do that, select it in the fields list, then select the Modeling tab. Use the Data Category drop-down control and select “Web URL”.

Next, add a new table visual to the report, and in the Format section, select Values, and set the “URL icon” setting to “On”.

This has the effect of displaying any column that has been flagged with the Web URL attribute as a link icon with a live hyperlink, instead of the entire, often long URL itself.

Next, we add the Title and Link fields from the Posts table, and the Pageviews field from the GA table, and then sort the table by Pageviews. Next, we add two slicer controls to the report – one bound to the Year column of the PostDates table, and the other bound to the Year column of the ViewDates table. Now, by selecting 2016 from the ViewDates slicer and 2016 from the PostDates slicer, we can see, in order and with precise numbers, which posts authored in 2016 were most frequently viewed in 2016. With this, I was now able to give Christian an answer.

An answer today is one thing, but an answer next year is another altogether. This report was worth sharing, so it was worth sprucing up a bit. By taking advantage of some of the new table formatting capabilities in Power BI, and importing the chiclet slicer custom control, we are able to make a more visually appealing report. I will also occasionally use a column chart in a report and use it like a slicer when appropriate. With a little bit of formatting work, we wind up with a report that looks something like the following:

Publishing and Sharing

We’re now ready to publish this report. The easiest approach is to simply select the “Publish” button from Power BI desktop. Select the destination, most likely your personal workspace. When publishing is complete, we can select “View in Power BI” to see the report in the service.

Having the report is one thing, but we want this report to be kept up to date. To do this, we go to the datasets section and select our dataset. In the Data source credentials section, we need to set the credentials for both Google Analytics and our WordPress connection (which will display as “Web”). Even though the Web source is anonymous, we have to configure it that way in the Power BI service. Once the connections are configured, they should appear in the Data source credentials section with no notices.

When we configured the WordPress data import above, we used 3 queries. That’s good for 300 posts, and my blog is currently at 238, which should be fine for a while. However, once I hit 300, I’m going to need another query. What I’m really hoping for is that by that time the Power BI service will support parameterized data sources for refresh, but either way I’ll need to modify the data source. I’m likely to forget this need about a week after I publish this post, so a reminder is a good idea. Luckily, Power BI supports data driven alerts, which is exactly what we need here.

Alerts are set on dashboard tiles for card data. Our report has a data card showing the number of total posts. Once that card has been pinned to the dashboard, an alert can be set on it for when it reaches a specific threshold. Simply hover over the dashboard card and click on the ellipsis, then the small bell icon.

In our case, we want to be notified when the number of posts is approaching 300, so we set the condition to be above 297. Once blog post 298 is published, I will receive an email and can then act on it.

Finally, I want to share this report with Christian so that the next time he has questions about my blog, he can just look it up for himself. When I tell him this, I’ll say that it’s so he can keep me honest, but really, I just want him to stop bugging me…

We don’t work at the same company and we use different Azure AD tenants. I could share the dashboard externally with him, but it’s even easier to share it anonymously, and anonymous sharing of this data is fine with me. Anonymous sharing is relatively straightforward. From the report interface, select File – Publish to web. A dialog will open asking for confirmation, and once confirmed will provide a URL that can be shared publicly. In the case of this blog’s report, you can simply click here to get the full report in a dedicated window. I can just email that report to Christian, and he’ll have the answers that he’s looking for. The beauty of anonymous sharing is that you are also given an embed code that can be added to any web page. As an example, the fully interactive report for this blog can be seen below.

So, You Can Disable Office 365 Groups After All

After my recent post “You Can’t Disable Office 365 Groups”, I received feedback from a few people, specifically Elaine Van Bergen, Martina Grom and Joe Stocker, that controls have since been added that allow Group creation to be disabled in the Office 365 tenant, and that these controls affect all of the user interfaces that can create groups. The procedure is outlined here, and Martina offers her insight on it here. I was a little disappointed that I had missed these newer controls earlier, but quite happy about the discussion that the original article started. It brought to light some of the confusion around this feature. In addition, it also highlights the fact that Office 365 Groups are about far more than just conversations; they are the bedrock of all Office 365 services moving forward.

Having said that, and having tested these new controls, I have a few observations to make about disabling Groups.

Much of the feedback that I received on my original article was “Good, they shouldn’t be disabled anyway, they’re too important”. To be sure, the other side of that argument was heard from as well, but I tend to side with the former. For me at least, the Group construct represents real value. It is a trade-off between ease of use and control to be sure, but as a container, it’s easy to understand, and relatively easy to work with for end users. The concerns around Groups are focused on governance, and those concerns are valid. If anyone can create a group at any time, and there is no process for organizing or classifying them in place, they can quickly get out of hand, producing islands of information all over.

The new management controls allow for a single security group (not an Office 365 group) to define those that can create Groups. Groups created by these members are available to all, but only these members can create new Groups. Only one security group can be flagged for group creation, so it’s an all or nothing proposition for these group members.

The article above walks through the process of creating these controls through PowerShell with the Microsoft Azure Active Directory Module for Windows. There are a couple of quirks when walking through this process. I found that the article itself contains a typo, the PowerShell command “Get-MsolCompanyInfo” should actually be “Get-MsolCompanyInformation”. In addition, when downloading the module itself, the 1.1.130.0 Preview version is required.

One would think that the GA version (1.1.166.0) would include everything necessary, but one would be wrong. I made the mistake of trying to use that version and I hit a wall. You need the preview version.

The Azure Active Directory management area in the new Azure portal also allows for the management of group creation rights. I was unable to use the interface to initially set these controls, but once set, the controls were reflected in the user interface, and it’s possible to manage them. Azure Active Directory management is still in preview in the new portal, so presumably this will improve at GA. The controls can be found in the Azure Active Directory blade under Users and Groups – Group Settings.

Like their predecessor, these controls don’t remove the option to create a group from the client interfaces. Once the “Create” option is selected, the user is usually notified that this will not be possible. In one case, it simply fails. The following are the different messages that users will receive when they try to create a new Group but are prevented from doing so.

Outlook Web Access

SharePoint

Planner

Power BI

Microsoft Teams

Ideally, the create option would simply be removed from the user interface, but at least these interfaces prevent the user from filling out details before failing with one notable exception. When a new Group Workspace is created in Power BI, the operation simply fails, and the user isn’t notified as to why. It almost seems as if the Power BI team wasn’t notified that these new controls exist.

The remaining workload that is (ok – will be) integrated with Groups is Yammer. With Yammer, when a Yammer Group is created, a corresponding Office 365 group will be created, and kept in sync with the Yammer group construct. This will ultimately be where Yammer notes and files are stored (via OneNote and OneDrive – basically SharePoint) as well as the group calendar (in Exchange). However, according to this Microsoft support article, if Office 365 Group creation is disabled, then the Yammer groups will not be Office 365 connected.

It is therefore now possible to prevent users from creating Office 365 Groups. This will be important to large organizations while they formulate an adoption strategy for Groups, but formulate it they should. Just because Groups can be disabled, it doesn’t mean that they should be. Groups are by their very nature a compromise between usability and manageability, and centralizing creation tips the scales on the side of manageability. We’ve had this for a long time with classic SharePoint, and the usability of Groups is what’s so exciting from an adoption standpoint. Almost all innovations in the Office 365 space are now centered on Groups – they are the new foundational unit, and by ignoring them, you miss out on many of the future enhancements.

Caution is certainly advised, but it’s a good idea to move forward with a Groups strategy. Now.