If you are new to Power BI, or if you’ve worked with Power BI Desktop, you’re familiar with the concept of refreshing data. By default, Power BI caches data which needs to be refreshed on a periodic basis. Reports that use Direct Query datasets do not need to have their caches refreshed, but to see data changes, the report pages themselves need to be refreshed. If the requirement is to have visuals on screen refreshed without any user intervention at all, it is necessary to use a streaming dataset.
Unlike regular datasets, data is not “pulled” into a streaming dataset, rather it is “pushed” in through the Power BI API, Microsoft Flow, Azure Stream Analytics, or third party services such as PubNub. This article aims to explore the various ways of working with streaming datasets.
Creating and populating a streaming dataset
Streaming datasets are created directly in the Power BI service itself or through the Power BI API. Unlike with other dataset types, there is no schema to read in from an external source.
To create a streaming dataset, choose “create” from a workspace menu in the upper right, and then select “Streaming dataset”.
Select the type of the streaming dataset. For both API and flow, choose the API option. then select “Next”.
Streaming datasets contain only a small subset of the data types supported by regular datasets. These types are Text, Number and DateTime. Give the dataset a name and then create all the necessary fields.
It is important to choose the dataset correctly, as there is no opportunity to transform fields into different types within reports or dashboards.
As fields are added, the JSON definition of the dataset is available. This can be copied and used by the data source that is pushing the data into the dataset. Note that Microsoft Flow can read the schema directly, so that copying is not necessary
In order to use the techniques outlined below, it is critical to turn on the “Historic Data Analysis” switch. This switch changes the dataset from a streaming dataset to a push dataset.
With a streaming dataset, data is stored in a cache long enough to display in a dashboard tile and it expires very quickly. a push dataset retains the data permanently up to a limit of 15 MB. In order to create more complex visuals, a report must be created, and a report requires a push dataset.
A push dataset is identified as “Hybrid” in the Power BI dataset list.
The most important option to select in the definition of the dataset is “retain historical data” If this option is not selected dashboard tiles will be able to display current data, but will not be able to display it over any significant time period. Data will be loaded into the dashboard cache for use with dashboard tiles, but when the cache expires, so doe the data. In order to use reports of any kind with a streaming dataset, this option must be selected.
Once created, data can be added through the API, Microsoft Flow, Stream Analytics or PubNub.
It should be noted that data stored in this way will only be available to Power BI, and only until the limits are reached. As the dataset fills, the oldest data will drop off. If there is any requirement to analyze the data over any significant amount of time, it is highly recommended that it also be stored in another location.
Adding a tile
Dashboard tiles can be created directly by Opening a dashboard and selecting “Add tile” from the ribbon menu. Select the Real-Time Data tile, and then select the dataset to use.
Tiles created in this way are limited to several visual types. These types are:
- Line chart
- Clustered bar chart
- Clustered column chart
There are a limited number of configuration options available to these tile, depending on the tile type.Tiles created in this way will display data from the point of creation forward, according to the settings for the visual itself. These values will update in real time as data is added to the dataset, with no user intervention or refreshing required.
In order to display different types of visuals, or to use customize them beyond what is available directly in the dashboard it is necessary to create a report.
Adding a report in the service
Once created, the streaming dataset will appear in the service like any other. As with other datasets, selecting it from the dataset menu will open a new report canvas that can be saved. The report canvas in the service allows any of the Power BI visuals to be used with the streaming dataset.
Visuals on a report do not update automatically as data is pushed into the dataset, but these visuals can be pinned to a dashboard. Once pinned, the dashboard til will update automatically, so in this way, practically any visual can be added to a dashboard and updated in real time. All that is necessary is first create a report.
Creating a report in the service allows full fidelity access to the report canvas and all of the available visual types, but it does not allow for any editing of the data model. If things like calclated measures and columns are needed, it is necessary to create a report using Power BI Desktop.
Adding a report with Power BI Desktop
Power BI Desktop is able to connect directly to datasets in the Power BI service and push datasets are no exception. To connect to a streaming dataset (or any other), select the “Get Data” button, select “More”, then select the Power BI tab. Finally, select the Power BI dataset option, then select “Connect”
Next, select the workspace that contains the real-time dataset, and select the dataset itself. Selecting Load will establish a connection between the report and the dataset,
Because the report uses a direct connection to the dataset in the service, there is not data transformation opportunity and Power Query cannot be used. Additionally, several DAX functions are not available. For example, most of the functions on the “Modeling” tab are unavailable. It is also not possible to create calculated columns, but calculated measures can be created. Using Power BI Desktop, some relatively complex visuals can be created.
Once the report is published to the service, the visuals can be pinned to a dashboard, and once pinned, they will update automatically in real time.
From time to time, it may be necessary to purge the data from the push dataset to reset the dashboard. To do this, the dataset can be temporarily changed from “push” mode to “streaming” mode. This will purge the stored data. Setting it back to “push” will start storing the data again.
To change the mode of the dataset, select the “Datasets” tab from the workspace menu, and then select the “edit” icon for the database that is to be changed.
The option that changes the mode it “Historic data analysis”. Switching it off changes it to a streaming dataset, and switching it on changes it to a push dataset.
At first, it may seem that visualizing real-time data in Power BI is quite limited due to the limited nature of tiles creating in dashboards. However, by using push datasets along with Power BI Desktop built reports allows for relatively complex visuals to be viewed in real time.