Power BI has been able to work with Excel files since it was first introduced. Indeed, it was born from the analytic capabilities in Excel. Users can connect directly to Excel files by using the Power BI service and nothing but a browser. However, depending on the content of the Excel file, and the method of connecting, the resulting products can be very different. In this post I will attempt to clarify this behavior. A subsequent post will detail the options available when working with Excel files in Power BI Desktop.
Excel is a multi-purpose tool. It contains all the building blocks of Power BI, and as such, it is an excellent Business Intelligence client. Excel files are also often used (much to my chagrin) as a data storage container, or as a data transport medium. Understanding how the file is structured and what you want to do with it is key to making the right choice when combining it with Power BI.
Originally, Excel files (workbooks) were collections of worksheets. Analysts could import data into those worksheets and then analyze it with the tools that Excel provided. Although Excel was never intended to be a database, its ease of use and familiarity led many people to begin using it that way, and “spreadmarts” (spreadsheet data marts) quickly became a problem. The problems arose because data became stale the instant it was extracted from a source, and because storing it in worksheets meant that it could be edited (changing history) and was subject to the data size limitations of a worksheet.
To take advantage of Excel’s analytic capabilities without being subject to the issues involved in worksheet data storage, the data model was introduced, initially through PowerPivot. The data model is a “miniaturized” version of the SQL Server Analysis Services tabular engine that runs in Excel. This data model is read only, refreshable, and highly compressed, which means that its only data limitation is the amount of memory available on the machine running it. Importantly, this engine is the same engine that is used by Power BI, the advantages of which we’ll explore shortly.
Excel of course still needs to be able to use worksheets and be Excel, so we can’t just remove the worksheet capability (which incidentally is effectively what Power BI Desktop is – Excel without worksheets). Therefore, today from a data perspective, Excel files can have data in the data model, worksheets or both. From the Power BI service perspective, the important thing is whether the file contains a data model, as it treats the two cases differently.
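Since so much depends on whether a workbook contains a data model, it can be handy to check a file before uploading it. The following is a minimal Python sketch, not an official method. It relies on an assumption that is not part of this post: that Excel 2013 and later store the embedded model inside the .xlsx package as a part named `xl/model/item.data`. The function name `has_data_model` is made up for illustration.

```python
import zipfile

# ASSUMPTION: Excel 2013+ stores the embedded data model in this package part.
# Verify this against your own workbooks before relying on it.
MODEL_PART = "xl/model/item.data"


def has_data_model(workbook) -> bool:
    """Return True if the .xlsx package appears to contain a data model.

    `workbook` can be a file path or a binary file-like object.
    """
    # An .xlsx file is a ZIP package, so we only need to list its parts.
    with zipfile.ZipFile(workbook) as package:
        return MODEL_PART in package.namelist()
```

Calling `has_data_model("sales.xlsx")` (a hypothetical file name) would then tell you which of the two cases described here applies to that workbook.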
Getting Excel Data
From the Power BI service, you click the Get Data button, and then the Get button in the Files tile. You are then presented with one of two dialogs depending on whether you are using a personal workspace, or an app workspace.
Connecting file-based data to a personal workspace
When importing into a personal workspace, there are 4 possible data sources.
A local file is one that is stored on a file system local to the machine being used. Selecting this option will allow you to work with the Excel file stored in that location, but if the file is being used as a data source (data is in the worksheets), then a Data Gateway will be required for any data refreshes. Power BI can also connect to a file stored in OneDrive, either Personal or Business (through Office 365). Finally, the service can work with files stored in any accessible SharePoint site (not simply Team sites as the name would indicate).
Connecting file-based data to an App workspace
When importing into an App workspace, there are 3 possible data sources. The Local File and SharePoint – Team Sites options are precisely the same as when importing into a personal workspace. The difference is that the OneDrive – Workspace name option replaces the two other OneDrive options. Choosing this option allows you to work with files stored in the “Group OneDrive”. Since every App workspace is backed by an Office 365 or “Modern” group, it also has access to the SharePoint site for that group. The “Group OneDrive” is the Documents library within that SharePoint site. Therefore, choosing SharePoint – Team Sites and navigating to the Documents library will give the same result in a few more mouse clicks, but will also give access to all other document libraries within that site.
Connect vs Import
Once you navigate to the Excel file that you want to work with, you select it and click Connect. You will then be presented with two options for the file, Import or Connect.
This choice dictates how the file is brought into the Power BI service. The structure of the file determines exactly what is brought in to the service in both cases.
Clicking the Connect button allows Power BI to connect to and work with the Excel file in place. The workbook is displayed as an Excel workbook in full fidelity in the Power BI interface using Excel Online. The file itself is shown in the Workbooks section in the Power BI interface, and it stands alone from other Power BI elements (except that regions of it can be pinned to a dashboard). Connecting to an Excel report will not create a Power BI Dataset, Report, or Dashboard. All operations, including refresh (see below) are controlled through the workbook.
At no point is the file moved, or “brought in” to the Power BI service. If the file is being stored in SharePoint, or OneDrive, anything done to the file in the Power BI service will be visible to anyone with access to the file itself, whether they are a Power BI user or not. This includes refresh, which will be discussed further below, but the important part to remember here is that if the data in the connected file is refreshed through the Power BI service, and it is being stored in SharePoint (or OneDrive), all users will be able to see updated data the next time that they open the file.
Connecting to an Excel file behaves the same way whether the file contains a data model or not, but the file must contain a data model in order to be refreshed by the Power BI service.
Connected Excel file within Power BI
Importing an Excel file behaves totally differently from connecting to it. When an Excel file is imported, it is treated as a data source to Power BI, and the assets within that file are brought into the Power BI service. Subsequent changes to the source file are not immediately reflected within the Power BI service, but are retrieved through the refresh process.
The way that the assets are brought into the service depends very much on the structure of the file, specifically whether it contains a data model or not. If the file does not contain a data model, then Power BI will use the data contained in the Excel worksheets to construct a new one. This is similar to what happens when a CSV file is imported into the service. If the file does contain a data model, then the worksheet data is imported, and that data model is brought into the service as-is. One important exception to this is if worksheet data uses the same query as an existing model, the worksheet data is ignored, and the data model is brought in as-is. This is important because Excel’s Power Pivot editor can be used to edit the model, creating calculated columns, calculated measures and relationships prior to import. The model that is automatically created when the file does not contain a model has no editing capabilities.
When an Excel file is imported, the data model (imported or created) is added to Datasets, and a link to the dataset is added to the default dashboard for the workspace. If no default dashboard exists, one will be created. A report can then be authored in the service. If the workbook contains any Power View reports, these will be converted to native Power BI reports and added to the service as well. Any embedded 3D maps are not brought in.
Imported Excel File showing calculated measures
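The import rules described above can be condensed into a short Python sketch. This is only a restatement of the behavior described in this post, not a call into any Power BI API. The `Workbook` class, its field names, and `import_result` are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Workbook:
    """Hypothetical description of an Excel file's structure."""
    has_data_model: bool
    has_worksheet_data: bool
    worksheet_shares_model_query: bool = False  # worksheet uses the model's query
    power_view_reports: int = 0
    has_3d_maps: bool = False


def import_result(wb: Workbook) -> dict:
    """Describe what the service brings in when the workbook is imported."""
    dataset = []
    if wb.has_data_model:
        # An existing model is brought in as-is.
        dataset.append("existing data model (as-is)")
        # Worksheet data comes along unless it uses the same query as the model.
        if wb.has_worksheet_data and not wb.worksheet_shares_model_query:
            dataset.append("worksheet data")
    elif wb.has_worksheet_data:
        # No model in the file: Power BI builds one from the worksheets.
        dataset.append("new model built from worksheet data")
    return {
        "dataset": dataset,
        "converted_reports": wb.power_view_reports,  # Power View -> Power BI
        "skipped": ["3D maps"] if wb.has_3d_maps else [],
    }
```

For example, `import_result(Workbook(has_data_model=True, has_worksheet_data=True, worksheet_shares_model_query=True))` reports only the existing model in the dataset, matching the exception noted above.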
Data refresh options and behavior depend on both the Get Data choice (connect or import) and the structure of the Excel file.
If the workbook is connected to the service, and it does not contain a data model, it cannot be refreshed. This is true even if the worksheets in the workbook contain data from Power Query queries. This is the only scenario that does not support refresh in any way.
If the workbook contains a data model, refresh is supported. The interesting part is that refresh will be triggered not only for the data model itself, but also for any worksheets that have Power Query queries as a data source. Therefore, a workaround for the lack of refresh support for a worksheet with no data model is to add a blank data model.
For refresh to work, the data source must be available to the Power BI service. This means that the source must be available in the cloud or registered on an available On-Premises Data Gateway.
The important thing to note about connected workbooks is that the refresh operations performed on them are permanent: refreshed data is stored with the workbook. This means that if the connected workbook is stored in SharePoint, or shared through OneDrive, updated data is available to all users with access regardless of whether they are Power BI users.
Refresh options for imported workbooks are slightly more complicated. As mentioned above, an import brings in worksheet data, an existing data model, or both.
If data was imported from worksheets, then the Excel file is the data source from the standpoint of Power BI. If the file is stored in SharePoint or OneDrive, it will automatically be refreshed every hour by default. This means that changes to the underlying Excel file will be reflected back in the Power BI service within an hour. This feature can be disabled, but it is not possible to change the hourly schedule, nor to control precisely when the refresh will occur.
Refresh options for workbooks in OneDrive/SharePoint
If the file is stored on a file system, it can be scheduled more granularly, but you will need to connect to it through an On-Premises Data Gateway.
If the file contained a data model that was imported into the service, then the original source of data for that data model (the query) is what the Power BI service will refresh from (NOT the Excel file itself). In this case, the refresh options are the same as with most other Power BI data sources – Excel is taken out of the picture completely, and any changes to the source Excel file will not be reflected into the service. The exception to this is if the file had both a data model, and worksheet data that was imported.
In the case of an Excel file with both a data model and worksheet data, both cases above will apply. The workbook is used as a data source for the table that was created by Power BI on import, and the original data model’s source is updated independently. This means that changes to the worksheet data are reflected in the Power BI service when refreshed, but any model changes to the original Excel file are not. Both the OneDrive and regular refresh schedules are used for imported files of this type.
Refresh options for a combined data source
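To make the imported-workbook cases easier to keep straight, here is a small Python sketch of which refresh mechanisms come into play. It only restates the rules described in this post. The function name and the schedule labels are invented for the example and do not correspond to any Power BI setting names.

```python
def import_refresh_schedules(has_data_model: bool,
                             has_worksheet_data: bool,
                             stored_in_onedrive_or_sharepoint: bool) -> set:
    """Return the refresh mechanisms that apply to an imported workbook.

    `has_worksheet_data` means worksheet data was actually imported.
    """
    schedules = set()
    if has_worksheet_data:
        if stored_in_onedrive_or_sharepoint:
            # The workbook is the source; the service syncs it roughly hourly.
            schedules.add("onedrive_hourly_sync")
        else:
            # A file on a local file system needs an On-Premises Data Gateway.
            schedules.add("scheduled_refresh_via_gateway")
    if has_data_model:
        # The model refreshes from its original query source, not the file.
        schedules.add("scheduled_refresh")
    return schedules
```

A workbook in OneDrive with both a model and imported worksheet data yields both `onedrive_hourly_sync` and `scheduled_refresh`, which is the combined case described above.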
The following table summarizes the refresh options available for file structure and connection type.
| File structure | Get Data option: Connect | Get Data option: Import |
|---|---|---|
| Worksheet data | None | Refresh from worksheet |
| Data model only | Refresh from model source | Refresh from model source |
| Data model plus worksheet data | Refresh from model source and worksheet source | Refresh from model source and worksheet |
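The same summary can be written as a simple lookup in Python, which may be useful if you are documenting or auditing a set of workbooks. This is just a sketch of the table, with the Connect and Import columns taken from the behavior described earlier in this post. The names `REFRESH_SOURCE` and `refresh_source` are hypothetical.

```python
# (Get Data option, file structure) -> what a refresh pulls from
REFRESH_SOURCE = {
    ("connect", "worksheet"): "none",
    ("connect", "model"): "model source",
    ("connect", "model+worksheet"): "model source and worksheet source",
    ("import", "worksheet"): "worksheet",
    ("import", "model"): "model source",
    ("import", "model+worksheet"): "model source and worksheet",
}


def refresh_source(option: str, has_data_model: bool,
                   has_worksheet_data: bool) -> str:
    """Look up the refresh behavior for a Get Data option and file structure."""
    if has_data_model and has_worksheet_data:
        structure = "model+worksheet"
    elif has_data_model:
        structure = "model"
    elif has_worksheet_data:
        structure = "worksheet"
    else:
        # An empty workbook is not one of the cases covered by the table.
        raise ValueError("workbook has neither a data model nor worksheet data")
    return REFRESH_SOURCE[(option.lower(), structure)]
```

For instance, `refresh_source("Connect", False, True)` returns `"none"`, the one scenario with no refresh support.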
Both Excel and Power BI are powerful tools in their own right, and the decision to use one does not preclude using the other. In fact, there are many good reasons for using them together, and bringing refreshability to Excel files stored in SharePoint is just one of them. It is, however, important to understand how it all works in order to get the maximum impact.