As I’ve explained many times before, querying SharePoint data directly is a bad idea. The SharePoint data storage mechanisms simply aren’t designed for querying of any scale, hence the lookup limitations that have been imposed upon it. The best approach to querying SharePoint list data is to first load it into a data warehouse or data mart of some sort. However, both Reporting Services (SSRS) and Power Query support direct access to SharePoint lists. While I try to strongly dissuade people from doing this with Reporting Services, properly used, Power Query is a totally viable means of querying SharePoint list data.
Why is this? With SSRS, every query goes back to the data source for retrieval. Power Query is different – it’s analogous to SQL Server Integration Services, which is an ETL management product. It loads source data into a repository, in this case, an embedded xVelocity, or Power Pivot model which can be considered a “personal data warehouse”. Queries against this mini data warehouse are fast, and don’t rely on SharePoint retrieval mechanisms, and can be used quite effectively in reports.
There are a couple of subtleties to querying SharePoint list items with Power Query, and I will briefly walk through the process below.
With Excel open, click the Power Query tab, select “From Other Sources” and the select “From SharePoint List”.
Next, enter the URL for the SharePoint site (or subsite) that contains the list you wish to query.
If it is the first time accessing this site, you will be prompted for credentials. If your site is Office365, be sure to enter organizational credentials. If it is on premise, use Windows credentials.
Once entered, you will be presented with a list of SharePoint lists in the Power Query Navigator window. Select the list that you wish to query, in our case, Announcements. When selected, click the edit button to edit the query.
The data, or a subset will load into the query editor window. You will see all of the list item fields expressed as columns, and for the most part, using the correct data type. At this point you can remove any columns that are unnecessary, or filter any undesired rows. There are a couple of SharePoint field types that bear special mention.
Lookup fields are a lookup into another SharePoint list. Internally, the SharePoint item stores this as an ID and display value, but Power Query gives you access to all of the properties of the related item as a one-to-one relationship. Essentially, what you can do is to flatten that relationship by incorporating the related item’s attributes.
If you scroll to a column of this particular type, you will see the value expressed as a hyperlink with the value “Record”. Clicking on it will drill down to one related record, but that’s not what we want to do. We want to expand the properties for all items in the list. The way that you do this is to click on the expand icon in the column header. In our case, we want to expand the “CreatedBy” field. CreatedBy is a standard list field, of the Person type. Person fields are actually a special case of a lookup field, so it exhibits this behaviour.
Here, we are interested in retrieving the user’s name and mobile phone, so we deselect all of the other fields. A new column will be created for every expanded field in the format sourcefieldname.attributename .
Attachments are another special case. There can be multiple attachments for a single list item, a one to many relationship. The hyperlink is therefore “Table”. Clicking on the column header expand for this column looks similar, but with an important difference. Options are available to either expand or aggregate the related items.
Selecting expand will create a new source record for each related item, and the only columns that will differ will be the items selected from the related table (Name in our case). Aggregate will not create any new records, but will summarize the related fields. For numeric fields, they can be totalled or averaged, and for text fields they can be counted.
Once ready, click “Close and Load” from the Query Editor ribbon, and the list data will load to either your model, or your workbook, depending on what your preferences are. Of course, I always recommend that you load to the model only.
Once loaded, any visualizations and queries will work against the model. The data can be refreshed at any point either manually, or automatically if using the Data Management Gateway. Keep in mind however that refreshes will operate against the source list.