
Category: Power BI

Ignite 2015 Impressions

I don’t normally do conference summaries, but Ignite was just so big, and there was so much information, that I felt the need to record my thoughts around it and decided to share them. Ignite was very much cross-product, which is in line with where Microsoft seems to be headed – a focus on the function, not the tooling. With around 24,000 people in attendance, the conference, and the logistical issues that came with it, was too big for my taste, but the amount of information was excellent, and I imagine that I’ll be digesting it for some time to come. For now, here’s how I interpreted it all.

Azure and Office 365

Cloud services are killing it.

Between Azure’s Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) offerings and Office 365’s Software as a Service (SaaS), Azure Active Directory is already sporting over 450 million active users. Azure Active Directory is what Office 365 uses for identity, and the accounts within it are otherwise known as Organizational Accounts. It’s an important metric because I believe that the Microsoft strategy is to own identity online. It makes sense when you look at what they seem to be doing.

For years, they absolutely dominated operating systems. Nothing to this day has ever really touched them on the desktop, but Apple changed the base with mobile, and developers flocked there. Google tried to do the same thing to Apple, and has been quite successful, but not fully so. While Android is in the majority in the mobile space, iOS is still quite strong, and shows no signs of diminishing. Windows isn’t really a factor in mobile, but it still dominates the desktop, which remains significant (about 300 million units/year), and it is a factor on tablets. Microsoft got flanked by Apple and Android, and is holding the fort, but not conquering any new territory.

Microsoft now seems to be focusing on cloud services, and they don’t care what platform is being used to consume them. I think that at the core of this strategy is cloud identity – whether it is consumer (Microsoft Account) or enterprise (Azure Active Directory). With this identity strategy, Microsoft is attempting to again change the base – to outflank both Apple and Google and make the operating system almost irrelevant. Every app they’re putting out now usually ships for iOS first, then Android, then Windows Phone. The new Universal app platform likely means that Windows versions (desktop, phone, whatever) will arrive at initial launch alongside iOS, but the bottom line is that an awful lot of effort is going into supporting all platforms all the time. If the apps work well across platforms, then the choice of operating system simply becomes one of personal preference, not of features. It gets marginalized, and Microsoft owns the back-end service. That’s why I think that so much effort has gone into this strategy.

Another thing that I sensed at the show was that in the past, all of the talk around identity and federation (ADFS) was about bringing your on-premises identities into the cloud to support a few new services. Now, there seems to have been a real shift, and the reason for adopting ADFS is to bring the Azure Active Directory identities back down on-premises to where legacy applications can use them. It’s a subtle shift, but a discernible one.

One of the more interesting products introduced into Azure recently is Logic Apps. As far as I can tell, Logic Apps are the cloud manifestation of BizTalk, which is an excellent product with a steep learning curve. Logic Apps remove the learning curve and allow you to quickly connect and flow data through multiple systems. The session on Logic Apps can be seen here:

SharePoint 2016

In the past, SharePoint announcements would warrant their own post, but now SharePoint is probably best seen as part of a greater whole. Details on SharePoint 2016 were first announced at Ignite, and I feel that the most informative session was Bill Baer’s on Wednesday morning, where he outlined the major architectural changes:

Not surprisingly, this release will be very much about hybrid SharePoint/Office 365 scenarios. Some of the notable items from the talk are:

  • SharePoint Server 2016 will require 64-bit Windows Server 2012 R2 or Windows Server 10, and SQL Server 2014 SP1 as a minimum
  • Standalone installations are no longer supported. It will be possible to install SharePoint and SQL Server on the same machine, but full SQL Server will be required, and SQL Express will no longer be supported. This obviously raises questions about whether or not there will be a free SharePoint Foundation SKU with the next release.
  • PerformancePoint will in fact be included with SharePoint 2016. I doubt very much that there will be any investments in it at all, but it will at least be there. I’d view this as legacy support.
  • SharePoint 2016 will support SAML claims as a first-class citizen. That means that it will be possible to log in with Azure Active Directory credentials, which is an example of bringing cloud identities on-prem. However, don’t trash that domain controller just yet – I’m sure that service accounts will still need to be NTLM, since SQL Server needs it.
  • There will be a new role-based installation. It will be much simpler to install and maintain servers with specific roles such as web front end, search, etc. BI will be one of the roles.
  • There will be new boundaries: content databases up to terabyte sizes, a 10 GB file size limit, and list thresholds much greater than 5,000 items (although how much greater was not specified)
  • No more FIM. The user profile engine that we’ve all grown to… deal with from SharePoint 2010 and 2013 is no longer embedded. The full Forefront Identity Manager can be used, but the default profile import mechanism will be the good ol’ user import from SharePoint 2007.
  • Durable, resource-based links. Every object in SharePoint will receive its own resource-based URL. That means that it can be moved around in the farm, and reference URLs will still work. This is like permalinks in WordPress.
  • While not final, a preview was shown of some operational reporting. This is primarily “speeds and feeds” type information that would interest a farm administrator, although simple usage reporting could be seen.
  • Integration with the Office Graph – see below section on Delve.

SQL Server 2016

The next release of SQL Server was announced at Ignite. It’s chock full of new things, focused primarily on hybrid operation and analytics. One of the more interesting concepts in this version is the ability to “stretch” a database into the cloud. With this, you can take an on-premises database and extend it into Azure SQL, specifying rules to determine which data goes where. Given that online storage is significantly cheaper than on-premises storage, this makes total sense, and they’ve figured out a way to make it work reliably. The overall SQL Server keynote can be found here:

I’m very interested in the analytics capabilities, and the session outlining the improvements to SQL Server BI is found here:

I found the following items particularly notable:

  • A comment was made during the BI session that Microsoft is “Super Committed” to SQL Server Reporting Services (SSRS). Hopefully this helps quell the naysayers. SSRS is receiving a major facelift in this version, bringing a modern design experience. In addition, the parameters pane has received a great deal of attention, adding, among other things, support for cascaded dropdowns.
  • Datazen is a visualization company based in Toronto that was recently acquired by Microsoft. There is a good demo of Datazen in the session, and I highly recommend watching it. It will be included with SQL Server 2016.
  • Datazen has KPIs. It also has “sub-KPIs”. I’m not sure about you, but that sounds a lot like a scorecard to me. This may sound the eventual death knell for PerformancePoint (see the SharePoint section), given that that’s about all it uniquely provides to the BI stack.
  • Tabular models in SSAS (and presumably PowerPivot) will support many-to-many relationships and a host of other new features.
  • Tabular models in SSAS and PowerPivot will have time intelligence built in. No longer will separate time intelligence tables be required. It’s an open question however as to how extensible they will be and when.
  • SharePoint will allow browser editing on PowerPivot embedded workbooks. Currently, you need to launch Excel to edit a PowerPivot embedded workbook.

Office 365 Groups

I attended the roadmap on Office 365 Groups:

(video unavailable as of posting – should be shortly)

During this session, the light really went on for me. Groups was (were? Not sure about the grammar on this… it’s a name) introduced last year and appeared to be a glorified distribution list with SharePoint artifacts. However, it’s about to become the center of the Office 365 collaborative experience. It ties together Azure Active Directory objects, a SharePoint site collection, OneNote, Skype, and OneDrive into a single cohesive, non-customizable experience. It currently uses Exchange exclusively for social conversations, but full Yammer integration is promised. No date was given for the integration, but my guess is that the target is early 2016.

The current user interface is limited – too limited for my own use at the moment – but during the demonstration, a rather useful interface was shown that is coming soon. You can access Groups presently through the Outlook web client in Office 365. I’m running the Office 2016 preview on my laptop, and there is a very nice interface contained there. There was chatter, particularly in the Yammer community, about confusion as to which tool should be used when, but I think that the coming deep integration of Yammer into Groups will render this point moot.

The next UI, demonstrated in the above session, looks really good and offers a lot of benefits. There is also a mobile app coming very shortly for, you guessed it, iOS and Windows Universal, then Android.

One unanswered question from the show is whether Groups would be available on-premises.

Power BI

Power BI content was sort of sprinkled throughout the conference, without specific focus. There was a session on the new DAX features available in Power BI Designer that is worth a watch from a modeling perspective:

One talk that really impressed me was by Lukasz Pawlowski and Josh Caplan entitled Power BI for Developers:

They cover content packs, real-time analytics, and an in-depth analysis of the “how old” app that went viral during Build.

It was also announced in the SQL BI session that SSRS will in fact be included in Power BI shortly, although little detail was provided. Finally, for development, the best place to get started is http://dev.powerbi.com.

It should also be noted that Power BI was at the center of almost any analytics discussion during the conference. This is by no means a little side project.

Delve/Office Graph

Delve is a newer product in Office 365 that provides insights around what content is relevant in an organization, and how people interact with it. It’s available directly from the app launcher in Office 365, and recently, user profiles have moved to the Delve application. It’s powered by the Office Graph, which is in essence an advanced index that contains content from Exchange and SharePoint, and will very shortly be extensible for multiple content types. The roadmap session for Delve/Graph is available here:

During the session, it was stated that “Delve is the evolution of Enterprise Search”. Given that all of the work on Delve and the Graph is coming from Oslo and the former FAST search team, this just makes sense. One of the major announcements around SharePoint 2016 was that SharePoint 2016 content can be crawled by the Office Graph to provide both search and Delve results in Office 365. The reverse will also be true, in that the on-premises crawler will be able to index Office 365 content for search results, but Delve and the Graph will remain in Office 365. The surprise here was that later this year, it will be possible to do the same thing with SharePoint 2013 through a coming enhancement.

Much of this Graph goodness can also now be accessed through the new Office 365 Universal API:

tyGraph

tyGraph is our product that provides advanced analytics for Yammer. It had something of a coming-out party at Ignite, and while we didn’t have a booth or any launch sessions, we were fortunate enough to have several folks – customers and thought leaders – present talks that at least in part featured tyGraph. If you’re interested in analytics for your Yammer network, I recommend that you watch some or all of these sessions:

Enterprise Social, from “Ooh, Shiny” to Business Success – Melanie Hohertz, Cargill

The Microsoft Enterprise Social Journey: How We Did It – Chris Slemp, Microsoft

Gain Organizational Insights with Yammer Data Mining and Analytics – Steve Nguyen, Microsoft and Tammy Young Heck, EY

Yammer Mining: Dig in and “Listen” to What Your Big *Social* Data Is Saying – Richard diZerega, Microsoft


Something Went Wrong When Connecting Power BI to SSAS

One of the more powerful features in the new Power BI preview is the ability to connect your Power BI dashboard to an on-premises instance of SQL Server Analysis Services. This is done by installing a connector on-premises that connects to both the SSAS server(s) and the Power BI service. Installation is straightforward, but quite often the first attempt to connect to the data results in the “oh-so-helpful” error message, “something went wrong”.


The message isn’t very helpful, unless you’re speaking with a support representative.

I have found that the problem in most cases is that the user connecting to the data does not have sufficient rights on the SSAS server, or the server does not understand who the user is. This may very well be the same user that was used to connect from the SSAS Connector back into the service, but that doesn’t matter. The problem is that SSAS does not know about that identity.

To explain, first, we need to consider how the connector is registered.


When the connector is registered, two sets of credentials are provided. One credential is used to connect the connector to the SSAS server. This is a Windows credential (typically in the form DOMAIN\username), and it must be an administrator on the SSAS server. The reason for this requirement is that it will be used to funnel all connector queries to the SSAS server, and it uses the EFFECTIVEUSERNAME feature in SSAS. EFFECTIVEUSERNAME requires admin-level access.

The second credential is used to connect the connector to the Power BI service. This one is used to register the connector with the service so that it can be used by dashboard authors, and isn’t extensively used afterward. This credential will be an Organizational Account (i.e. an Office 365 identity/Azure Active Directory) and needs to have enough rights to register a data source with Power BI.

Once registered, it works as follows.


When a dashboard user interacts with the dashboard, or accesses the data source, a request is sent to the connector with the credentials of the user making the request. The connector then establishes a connection with the SSAS server, using the admin credentials registered with the connector, and issues the query using the EFFECTIVEUSERNAME parameter, which basically means “run this query using the provided user’s credentials”. The user provided is the one making the request. This allows for per-user level security for Power BI, but unfortunately, it is what typically causes the error above.
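To make this concrete, the connection that the connector opens looks something like the following connection string (a sketch – the server, database, and user here are placeholder names for illustration, not values from the product):

    Provider=MSOLAP;Data Source=MYSSASSERVER;Initial Catalog=SalesModel;EffectiveUserName=jsmith@domainxxx.com

If SSAS can’t map the EFFECTIVEUSERNAME value to a Windows identity that it knows about, the query fails, and you get the error above.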

The issue is that SSAS only understands Windows (NTLM and Kerberos) credentials. Without doing anything else, it has no idea what an Organizational ID is. So how can it work at all? There are two ways.

The first is that your domain can be federated with Azure Active Directory – specifically, with the Azure Active Directory that your Power BI (Office 365) tenant is using. Once federation is complete, your Azure AD domain (domainxxx.com) will be registered and trusted with your internal NTLM/Kerberos domain, and your users will be mapped to their Azure AD identities. SSAS will then understand who they are, and if granted permission, they will be able to access SSAS data via the Power BI dashboard. This is the only supported method, and is what should be used in a production environment. There is, however, another way.

If AD federation is not an option in the short term, or you simply need to get a development or demonstration environment spun up, it is possible to “hack” your Active Directory to allow SSAS to understand the organizational IDs. First, the Azure AD domain is registered directly with Active Directory as a UPN suffix, and then the Active Directory users can be set to use that suffix. The key part is that the user name in the internal AD (i.e. DOMAIN\xxx) must match the user portion of the Azure AD account (i.e. xxx@domainxxx.com). The entire procedure is outlined very well by Greg Galloway in this article, and I won’t repeat it here.

Going back to the original error, it would be nice if it could be a little more descriptive. I’d be happy with “Access Denied”. These are early days, and the product is still in preview – I expect this will change. Security also may not be the only cause of this error, but it’s the only one that I’ve seen thus far. 


The New Power BI – Now With Enterprise!

Yesterday Microsoft announced the next step in the evolution of Power BI. It’s getting quite a bit of attention, and rightly so for its aim of bringing Business Intelligence closer to users. Democratizing BI has always proved a challenge – it’s the realm of the gurus in the white coats that hold the keys to the data. Microsoft is aiming to accomplish this democratization through a combination of user focus and, as of yesterday, a drastic change in its pricing model. Power BI just went from about $40 per user per month to free, or $9.99/user/month for advanced capabilities. That’s quite a drop, and arguably the biggest announcement from yesterday – it will have a massive impact. The detailed price breakdown can be found here.

However, all of the focus around personal BI is, in my opinion, missing a key component. Power BI and its components have always focused squarely on both personal and team BI solutions. That is to say the ability for a power user to model data, visualize it quickly and easily and to share it out with fellow team members. While that capability is certainly retained in the new Power BI, this new version contains the first appearance of enterprise grade BI in the cloud for Microsoft.

To fully understand this, it’s necessary to touch on the Microsoft BI stack as it stands today.

Microsoft BI On Premises

The On-Premises BI story from Microsoft may be confusing, and occasionally difficult to understand, but it is very powerful, and relatively complete. In a nutshell, the story is good from a personal, team and enterprise perspective.

On the enterprise side, there are products from both the SQL Server team and the Office team. Data warehousing is served by SQL Server, and ETL duties fall to SQL Server Integration Services (SSIS). Multidimensional analysis storage is served by SQL Server Analysis Services in both OLAP and Tabular modes, and reporting is performed by SQL Server Reporting Services (SSRS). The SQL product line doesn’t have much on the client side for analysis apart from SSRS, but this slack is taken up by the analysis tools available in Excel, and through PerformancePoint Services in SharePoint.

Indeed, SharePoint also provides a platform for SSRS via SSRS SharePoint mode, and for Excel-based analytical workbooks connected to SQL Server and to SSAS through Excel Services.

On the personal BI side, that role has traditionally fallen to Excel. The pitfalls of importing data into Excel workbooks for analysis are well documented and don’t need to be discussed here, but the bulk of those issues were addressed with the introduction of PowerPivot several years ago. PowerPivot allows for massive amounts of data to be cached within the Excel file for analysis without any data integrity concerns. The addition in recent years of analytic visuals (Power View, Power Map) and ETL capabilities (Power Query) has further rounded out the offering.

Taking that Excel workbook and sharing it brings us into the realm of Team BI. This is to say that the analyses are relatively modest in size, and of interest to a targeted group. These models may not require the rigour or reliability associated with enterprise BI models. Once again, the technology involved here is SharePoint. A user can take a workbook with an embedded PowerPivot model, share it through a SharePoint library, and other users can interact with that embedded model using only a browser. This capability requires PowerPivot for SharePoint, which is really a specialized version of SSAS, along with a SharePoint service application.

One thing to note about these seemingly disparate approaches is that a power user can build a Power Pivot data model with Excel, share it to a team via SharePoint, and when it requires sufficient rigour or management, it can be “upgraded” into SSAS in tabular mode. This common model approach is powerful, and is key to understanding Microsoft’s entire BI strategy. You can also see here that SharePoint straddles the two worlds of team and enterprise BI.

Moving to the cloud

The BI workload is one of the last Microsoft workloads to move to the cloud, and with good reason. Massive amounts of data present problems of scale, and security or data sovereignty concerns tend to keep data on premises. However, there is a very real need to provide BI to users outside of the firewall.

SharePoint is the hub of BI on prem, so it’s logical to assume that with SharePoint Online, it could continue to perform that function in the cloud. The big catch here is that on-prem, SharePoint is simply the display platform. In the enterprise scenario, users connect through SharePoint to the back end servers. This isn’t an option in the cloud, so enterprise BI was left off the table.

With the personal and team BI scenarios, data is cached in a Power Pivot data model, which could be supported in the cloud. When Office 365 moved to the SharePoint 2013 code base for SharePoint online, rudimentary support for embedded Power Pivot models was indeed added. Essentially PowerPivot for SharePoint “light” was added. I call it light for two major reasons. Firstly, data models could be no larger than 10 MB. Secondly, there was no way to update the data contained within the Power Pivot cache, outside of re-uploading the Excel workbook. This is still true without a Power BI license. The inability to refresh the data renders team BI almost useless, except in static data scenarios.

The first generation of Power BI changed all of that. With a Power BI license, it was possible to install a Data Management Gateway on premises that would connect to team BI workbooks in Office 365 and update them on a scheduled basis. Yes, the gateway had many limitations (many of which have been removed over time), but finally, the on-prem refresh story was solved. In addition, the model size limit was increased to 250 MB. However, we were still left with a number of problems or limitations.

  1. Daily data refresh schedule. Automatic data refreshes could be daily at their most frequent; manual refreshes could be done anytime.
  2. Capacity. The maximum size of a data model was increased to 250 MB, which is relatively small for enterprise scenarios. In addition, refreshes aren’t differential, which means that the entire model is re-uploaded on every refresh.
  3. Data sensitivity/sovereignty. The refresh problem was solved, but because the data is still cached in the workbooks, there can be reluctance to send it outside of the corporate firewall.
  4. Per-user security. Power Pivot data models have no concept of user security in a workbook (tabular models in SSAS do); security is at the workbook level.
  5. Cost. The initial cost of Power BI was $40 per user per month. A Power BI license was required to interact with any workbook that had a data model larger than 10 MB. Considering that a full Office 365 E3 license was around $25 per user per month, this price tended to limit the audience for sharing.

All of this is to say that Power BI in its first (and as yet current) incarnation is suitable for personal and team BI only. There has been no enterprise cloud BI story.

Power BI V2

The announcements yesterday outlined the next generation of Power BI. Going forward, Power BI will be available as a standalone offering, at the price points mentioned above. Office 365 users will continue to be able to use it from Office 365, but Office 365 will no longer be required to use it. In its early days, Power BI was a SharePoint app, but a careful examination of URLs in the current offering quickly reveals that it’s actually two apps currently, both running on Azure (not in SharePoint).

If you’ve signed up for the new Power BI preview, you may notice that the URL is http://app.powerbi.com/…… so this move isn’t a big surprise.

With the new model, Excel is no longer the central container. Users connect to data and publish it directly to Power BI. Behind the scenes, the service is doing much the same thing it does with Power Pivot models – it’s storing them in SSAS. In fact, the same limits still apply – 250 MB per model (at least for now). Excel can still be used, but now it is as a data source.

Visualizations are performed through Power View, and data is acquired through Power Query. These are no longer add-ons, but available on their own through the Power BI Designer. This decoupling is good for those that have not made an investment in SharePoint Online or Excel.

These changes to the architecture and the cost are great news for adoption, but they don’t address the needs of the enterprise – except for one thing: the SSAS Connector.


One of the data sources available to the new Power BI is the SSAS data connector. This connector is a piece of code that runs on premises (it actually includes the Data Management Gateway). It acts as a bridge between the Power BI service and an on-prem SSAS server.

The biggest distinction worth noting is that with the gateway, data is NOT being uploaded to the service; it remains on-prem. The way that it works is that when a user interacts with a visualization from the cloud, a query is sent to the SSAS server through the gateway. That query is run, its results are sent back to the user’s visualization, and the data is not persisted.

In addition, when the query is sent to the SSAS server, it is run with the permissions of the user making the request. This is accomplished through the EFFECTIVEUSERNAME feature in SSAS. This provides for full user-level security, and since tabular models in SSAS can utilize per-user security, we no longer need to rely on proxy accounts or document-level security.

Finally, because the data is being stored in an on prem SSAS server, it can be refreshed automatically as often as desired. For the same reason, we have no capacity limits – you can grow your own SSAS servers as large as you like.

The SSAS connector removes most of the limitations that prevent cloud based enterprise Business Intelligence, and the new pricing model removes the rest. Certainly there are going to be feature limits in the near term, but it appears to me at least that the back of this thorny problem has finally been cracked.


Power BI Data Management Gateway 1.4 – Where is it heading?

I received a notice from my main Power BI tenant last night that a new version of the Data Management Gateway was available. The previous (1.2) version contained some very significant changes so I was understandably eager to have a look. I installed it, observed a relatively attractive setup interface, then opened the release notes to find out what else was new. Only four items were listed (from the release notes).

  • Unified binary that supports both Microsoft Azure Data Factory and Office 365 Power BI services
  • Refine the Configuration UI and registration process
  • Azure Data Factory – Azure Ingress and Egress support for SQL Server data source
  • Office 365 Power BI – Bug fixes

I had already observed number two, the new setup experience. Bug fixes, while absolutely necessary, aren’t necessarily something to write about, but I think that the other two items are. While they may not have immediate impact, my bet is that they will in very short order.

The key point here is that the gateway now supports the Azure Data Factory. There are many, many things that the Data Factory enables (Hadoop, anyone?), but the one that I feel is most relevant to Power BI today is the ability to connect directly to on-premises data sources. That’s not quite how it’s been done until now.

Power BI for Office 365

In the context of Power BI as we’ve come to know it today, on-prem data refreshes are handled by the Data Management Gateway. On a periodic basis (daily at most) the service contacts the gateway, which in turn reruns all relevant queries. The resultant data is then uploaded to the service.

The service in turn packages the data and updates the host Excel workbook, and the model is transferred into a back end analysis server. Every transaction goes through the host Excel workbook.

Power BI Dashboards

If you’ve had a chance to see the preview of Power BI Dashboards, you may have noticed that it is not dependent on Office 365 or Excel at all. When you add a data source, you take the data and add it to a cloud-based data model directly (presumably backed by SQL Server Analysis Services). All visualizations work against these models, with one very important exception: if you connect to a SQL Server Analysis Services data source, you are actually connecting directly to a model hosted on an on-prem SSAS server in real time.


How is this done? The connection is made through the “Analysis Services Connector”, which is a separate bit of software installed on prem to facilitate connection between the Power BI Dashboards service and the On-Prem SSAS server. It’s available directly from the dashboards portal.


After installing it – a process that establishes credentials for both the dashboards service and SSAS – it can be reconfigured by running the “Power BI Analysis Services Connector” tool.


However, installation also adds another piece of software to the host machine: the Microsoft Data Management Gateway. This version of the DMG establishes the connection between the SSAS server and the Power BI service in real time. Up until now, the DMG didn’t work this way, so which version is it?


Until now, the most recent version of the DMG was 1.2, so this Dashboards preview contained a glimpse into the next generation Data Management Gateway that provided some intriguing new capabilities.

Coming Together

Checking into the latest version of the Data Management Gateway from Office 365, we see version 1.4.


This version is newer than the one included in the Dashboards Preview, and presumably includes everything from it. The key phrase in the release notes to me is therefore “Unified binary”. One gateway to rule them all, if you will. Does this mean that we’ll be able to connect to on-prem data in real time from Office 365 as well as from the Power BI preview? I don’t know how, but I bet that the building blocks are now there.

The latest version may not include support for any new data sources, or any new bells and whistles, but it’s likely worth setting up for new capabilities that will hopefully show up sooner rather than later.


Check-In Tracking with Power Query and Power View

For me, 2014 was a particularly busy year. I was fortunate enough to have been invited to speak at a number of international conferences, and took a few wonderful personal trips as well. When travelling, I have gotten into the habit of doing Facebook check-ins as I go. It occurred to me after describing my year to a friend that, since Power Query supports Facebook, I should be able to visualize my travel year easily enough. It wasn’t only easy, it was also very fast.

Acquiring the data

Getting data from your Facebook feed is really quite easy. Once you have Power Query installed, open a new Excel workbook and select the Power Query tab. Then, from the “Other Data Sources” button, select Facebook.


Unless they’ve already been saved, you’ll be prompted to enter Facebook credentials. Once you’ve done so, you’ll be prompted for a Facebook object, and a Facebook connection (or feed).

There is a wealth of data that can be retrieved, but in our case, we’re interested in check-ins, which are expressed in Facebook as posts. Therefore we select “me” (the currently logged in user) and “Posts” and click OK. The Power Query editor loads with all of our Facebook posts. It should be noted that Power Query is using the Facebook Graph API to perform the query, as can be seen in the function bar.


That import function can be edited, so any valid Facebook graph expression can be used to query data, not just those automatically generated by the Power Query UI.
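For example, the generated step is an M expression along these lines (the Graph API version in the URL is whatever Power Query emits at the time, so treat it as illustrative):

    = Facebook.Graph("https://graph.facebook.com/v2.2/me/posts")

Swapping “me/posts” for any other valid Graph path re-targets the query without touching the UI.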

Unfortunately, “Posts” contains a lot more data than just check-ins, so we need to filter it down. We do that by applying a few transformations. All check-ins are status posts, so we start by filtering the type column, selecting “status” as the filter value.


Not all statuses are check-ins. However, all statuses with a value for location are. Where is location? It’s in a related record in the “place” column. The place column values are records, so first we need to expand the place column. To do so, click the small “expand” icon next to the column title.


Next, select the columns that you want to expand. In this case, all we need is location, so that’s all we expand. When OK is selected, the location data is expanded, and the original place column is replaced with a new place.location column. However, the values for location are themselves records, so they also need to be expanded. We then follow the same procedure that we did for “place” to expand them.


Once the columns are expanded, it is fairly obvious that only some entries have values for longitude and/or latitude. These are our check-ins. We need to filter out all of the other records, and we can do this by filtering latitude or longitude using the same approach that we used on the “type” field earlier. This time, we deselect the null value.


We are left with nothing but check-ins as records.

Now it’s time to clean up the data – we don’t want to load any more data than we need to perform the analysis, so we delete any unnecessary columns. Also, the expanded column names are cumbersome, so we rename those as well. Finally, the two date fields don’t get auto-detected as dates, so we need to change their data types to Date/Time. In our case, the completed query appears as below.
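Here is a sketch of what the finished query might look like in the Advanced Editor. The expanded field names come from the Facebook feed, and the set of columns I keep is specific to my workbook, so treat the details as illustrative rather than canonical:

    let
        // Pull all posts for the logged-in user from the Facebook Graph
        Source = Facebook.Graph("https://graph.facebook.com/v2.2/me/posts"),
        // Check-ins are status posts
        Statuses = Table.SelectRows(Source, each [type] = "status"),
        // Expand the place record, then the location record inside it
        ExpandPlace = Table.ExpandRecordColumn(Statuses, "place", {"location"}, {"place.location"}),
        ExpandLocation = Table.ExpandRecordColumn(ExpandPlace, "place.location",
            {"city", "state", "country", "latitude", "longitude"},
            {"city", "state", "country", "latitude", "longitude"}),
        // Only rows with coordinates are actual check-ins
        CheckIns = Table.SelectRows(ExpandLocation, each [latitude] <> null),
        // Keep only the columns needed for the analysis (your list may differ)
        Trimmed = Table.SelectColumns(CheckIns,
            {"id", "created_time", "updated_time", "city", "state", "country", "latitude", "longitude"}),
        // The date fields arrive as text, so set proper types
        Typed = Table.TransformColumnTypes(Trimmed,
            {{"created_time", type datetime}, {"updated_time", type datetime}})
    in
        Typed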


Visualizing the data

We are now ready to visualize the data, so we select Close and Load. In our case, Power Query has been set to load only into the data model, but your settings may vary. Since we’ll be using Power View, we need the data in the data model. After the data has loaded, we select the Insert tab, and select Power View from the ribbon.


The first thing that we want to see is a map of our check-ins, so we select Country from our Power View fields, which loads a list of the countries. Then we select Map from the ribbon, which loads the map. Next, we resize the map to fit our requirements. Finally, we drag the id field to the Size field (visually showing the number of check-ins per location), and then we drag the city field under country in the Locations field. At this point we can double-click on a data point to drill down to a city.


We want to restrict the data to 2014, so we drag the created_time field into the filters area, select the advanced filters option, and filter to dates greater than Jan 1, 2014 and less than Jan 1, 2015.


Finally, we want to show a list of all of the locations, so we click just below the map visualization and select created_time, Country, state, and city from the fields list.

Normally, I’d finish the post up with a screenshot of the finished product, but this post on the Power BI blog yesterday inspired me to try embedding the actual report. Play around and have a look!

 
