Power BI is without question the best way to report on data in SharePoint lists. The query tools available in Power Query make working with SharePoint data relatively painless, an the cached dataset means that reports are run against an optimized copy of the list data, not the data itself.
This latter distinction, while removing the performance issues of systems that query lists directly, also introduces problems with data latency. The report will never be fully “up to date”, as it needs to be refreshed on a periodic basis.
Consider the following scenario. A Power BI report has been built that uses data from a SharePoint list. That report has been embedded on a SharePoint page in the same site. A user adds an item to the list, and then navigates to the page to see the updated report. Unfortunately, that report won’t get updated until the next scheduled refresh.
This has been a significant problem, until the recent release of the new “Refresh a dataset” action in Microsoft Flow.
It is a relatively simple procedure to add a simple 1 step flow to any SharePoint list that is triggered when an item is created, updated, or deleted. This flow simply needs to add the “Refresh a dataset” action, that is configured for the relevant dataset, and these embedded reports will be updated very shortly after the data is modified.
Alternatively, the flow can be triggered by a timer, allowing you to create your own schedule (every 5 minutes, etc) that is not hardwired to run at the top or bottom of any given hour.
A few caveats should be kept in mind when using this action however.
While this action gives us much finer grained control over when refreshes happen, all of the current license restrictions remain in place. For datasets located in the shared capacity, only 8 refreshes per day are allowed.
For datasets in dedicated capacities (Premium), there are no limits to the number of refreshes. The limit of 48 per day is a UI restriction, not a licensing restriction. However, refresh can utilize significant resources, particularly memory, so you’ll want to ensure that you have significant resources to support the update frequency.
Finally, the load on the source data system should be considered. Refresh will pull a significant amount of data every time it is run.
Caveats aside, this new flow action is a welcome relief to those that need greater control of how their reports are updated.
In addition to all of the obvious business benefits that Microsoft flow brings to the table, one of the things that initially struck me about it was how useful it would be for data acquisition purposes. The first thing that I did with it was to create a flow that queries weather stations from Weather Underground, and stores the data in SQL Azure, and uses Power BI to analyze the weather patterns.
I may blog about that solution in the future, but with the Future of SharePoint event rapidly coming up, my BI Focal fellow collaborator, Jason Himmelstein convinced me that there was something more interesting that we could do with this. How about near real time monitoring of Twitter conversations for the event? All of the pieces were in place.
There are currently essentially three different technologies necessary to make this happen.
Let’s go through each one.
We could store our tweets in a variety of locations (CSV, SharePoint, Excel), and there are already a number of examples out there that demonstrate how to do this. The reason that we want to use a SQL Azure database is twofold. Firstly, Flow has actions for connecting and inserting data into it. That takes care of our storage requirement. The second, and most important part is that SQL Azure databases support DirectQuery in Power BI.
With DirectQuery, Power BI does not cache the data – every interaction results in a query back to the source, in our case the SQL Azure database. This has the effect of making the data available for reporting as soon as it has been delivered by Flow. That’s the theory at least. In reality, Power BI caches certain elements temporarily (dashboard tiles for example), but this is as close to real time as you can get in Power BI without writing data directly to t in the API. Reports are for the most part up to the minute.
You need an Azure subscription to create a database, and the process for creating it is documented in the following video.
We will be using the Twitter trigger with Microsoft flow, and it has several output variables. We want our table to be able to store the values of those variables in a table, so we use the following script to create that table.
CREATE TABLE Twitter( id int IDENTITY(1,1),RetweetCount int,TweetText NVARCHAR(250),TweetedBy NVARCHAR(100),CreatedAt NVARCHAR(100),TweetID NVARCHAR(50),SearchTerm NVARCHAR(50));Goalter table Twitter add primary key (ID)
Once created, we are ready to fill it with tweets.
The recently announced Microsoft Flow is a tool that allows users to automate and integrate processes from different data sources in the cloud. It is based on Azure Logic Apps, and is currently in preview, but already supports a wide variety of actions and triggers. You can sign up for, or access your flows at http://flow.microsoft.com.
Flows consist of two primary objects, triggers and actions. Most triggers, and at the moment all actions, are tied to a data connection. You can register your connections as you go, but you can also view and register them en-masse by selecting your person icon and selecting “My connections”.
Once registered, you can use “Browse” to start from a template, or you can go to “My flows” to start from scratch. That’s what we’ll do. To start, click on “Create new flow”, and you will be presented with the trigger selector.
Most of the available triggers are events, and the first 4 are special cases. The recurrence trigger allows you to schedule your flow. This is what I use for my weather gatherer – it just calls a web page every 5 minutes and passes the result into the next action. The external content source actions are in alphabetical order, so we just scroll down to the Twitter action and select it.
If you have already registered a Twitter account, it will be used by default. If you want to change it, or add a new one, just click on “Change connection”. It’s a good idea to use multiple Twitter accounts if you’re doing multiple queries to avoid running afoul of Twitter’s rate limiting. Finally, just enter the search term in the Query Text box. Any new post of that term on Twitter will launch the flow.
Next, we need to add the “SQL Azure – Insert Row” action. To do so, click on the “+” symbol, click add an action, then click “Load more” at the bottom. Scroll down and select the action.
Again, if you have a database registered, it will be selected by default. If you have multiple databases registered, or want to add more, click on “Change Connection”. Once you have the correct connection selected, you can click on the dropdown and select the correct table (the one created above”). Once selected, the fields will load in to the action.
Populating the fields is a simple matter of selecting the appropriate output variable from the Twitter trigger. The final field, SearchTerm, is used to distinguish between different Twitter searches. Each flow only triggers on one term, but we want to set up multiple flows. We manually enter the value here (in our case “FutureOfSharePoint”). Later, that will be used as a slicer in Power BI.
Once complete, give the Flow a name, click on “Create Flow”, and then “Done”. At that point, you really are done. That’s it, that’s all there is to it. You can query SQL Azure to check for data, and you can also use the information icon to check on the status of Flow runs.
All of these steps are well documented in Jason’s video below:
We want to surface this data with Power BI. We can do this directly from the web interface, but we have a lot more options if we design the report with Power BI Desktop. The next step is to launch Power BI Desktop, Select “Get Data”, select “Microsoft Azure SQL Database” and press the “Connect” button. At this point, you enter in the details about the Azure SQL Server and database, and most importantly, select the DirectQuery option.
The import option will retrieve data from the SQL database and cache it in an embedded model within the report. Once published, the Power BI service can keep it refreshed, but no more than 8 times per day. This is contrasted with DirectQuery, where no data is persisted in the service, and every interaction results in an immediate call back to the data source. For frequent updates, this is what we need.
A word of caution here – we pay a significant penalty from a feature standpoint when using DirectQuery mode. Most of the functions in Power Query and many of the functions in DAX are unavailable to us in this mode. However, with this particular data set, these restrictions are an acceptable tradeoff for the frequent updates.
Again, Jason has done a great job explaining the steps required to build the reports and dashboards in the video below, so I am not going to repeat them here.
Once the report is published, you may want to present it to a wider audience. You can do that through dashboard sharing if your recipients have access to Power BI, or you can publish it anonymously. Given that this is Twitter data, it’s certainly public, and there is no harm in doing so.
To publish the report anonymously, simply open the report in the Power BI service, and select File – Publish to web.
You will then be presented with a dialog box that will give you both a link and an embed code for 3 different possible renditions of the report. Simply select the one you want to use and paste it into the ultimate destination. My report can be seen below, and I will likely update it from time to time to follow current events.
One thing to keep in mind about reports shared anonymously is that even though the report is using DirectQuery, the visuals are only updated approximately every hour. The above report will lag reality by about an hour.
You can see here the power of these tools working together. Flow is an easy to use but yet powerful integration tool. SQL Azure is a rock solid database available in the cloud to other cloud services, and Power BI allows for rapid insights to be built by Power users. No code was harmed in the building of this solution, but regardless, it’s still quite powerful.
From here, I can only see it getting better. My ask from the Flow team? A Power BI action that pumps data directly into a Power BI data model, thus eliminating the need for the Azure DB, and allowing for self updating visuals in Power BI, but that’s a topic for another day.