A Simplified Method of Working with SharePoint Data in Power BI

Although I typically advise against it, there are valid reasons to report on SharePoint list data directly. Power BI Desktop makes this data quite easy to access – you can use the built in connectors for SharePoint or SharePoint Online, or, due to the fact that any SharePoint list is available via OData, you can also use the OData data connector. Microsoft has recently made improvements to both methods, but the SharePoint connectors bring some significant usability advantages. One of these advantages is what I’m calling “summary columns”.

The Problem

Consider the following SharePoint list with different field types:

Connecting to this list with Power BI desktop and editing the query returns all of the list fields regardless of their visibility in the user interface. Assuming that we want to work with the above field values for analysis purposes, we can discard the fields that we don’t need, and reorder the remaining, leaving us with only these fields.

However, you can immediately see that we’ll need to do some work in order to get our data in a usable format.

The title field is simple enough – it’s value is immediately available, no issues there, and no changes are necessary. From the Name field, several columns are returned. Two are the ID of the name in the site collection’s user list. It would be possible to connect to the root of the site collection, retrieve the user list, and establish a relationship between the tables, but clicking the expand icon will allow Power Query to do a lookup for each ID and return the desired attribute, in this case, Name.

The same is true for the lookup field type – the column can be expanded in order to include any attribute of the source item. The field using managed metadata works in a similar fashion; it can be expanded in order to retrieve the text value of the managed metadata item. Link fields work the same way – the can be split into two columns, the link itself and the description. However, the field containing multiple values is a little different. The great news here is that it’s possible – previous versions of Power Query couldn’t work with multi value fields, but now the SharePoint data source supports it.

Multiple value field values are returned as a list. With list items, the expand icon will duplicate the entire record for each value on the list. This may or may not be the desired behaviour, but remember, this is Power BI. Everything can be aggregated.

Before After

The conclusion to be drawn here is that in order to represent SharePoint list data that is using any sort of control more complex than text or number, we need to do some work. However, the good news is that someone (I’m not sure if it’s the Office Team of the Power BI team) has added a feature that makes this whole process much simpler.

The Solution

After connecting to your SharePoint list, edit the query. Instead of diving in and performing all of these manual transforms, select your multi value column(s) if you have any (this will make more sense momentarily). Select any rich text columns as well, and then scroll right to find a column named “FieldValuesAsText”. Select this column, then right click on it and select “Remove Other Columns”.

The FieldValuesAsText column is our magic bullet. It automatically converts most (rich text fields being the exception) of the more complex SharePoint data types to simple text that work well within Power BI. Simple click on its expand button, select the columns that you want to include in your analysis. I find it useful to deselect “Use original column name as prefix” as well. We are left with textual representations of our field data.

You will notice that the multiple value fields here have their values separated by commas. For multiple values, I tend to prefer the “raw” approach, which is why we retained the multiple value column above. We can still expand it and create a separate line for each value, and remove the column created by “FieldValuesAsText”.

Finally, you may have noted that the Rich Text field isn’t automatically converted. In order to extract useful text from it, we still need to use Power Queries transformation functions such as Replace Value, Trim, and Clean.

In a nutshell, if you’ve been frustrated by formatting or data type limitations when using SharePoint data in Power BI, have another look, and check out the FieldValuesAsText column. It will make everything a lot simpler.

9 thoughts on “A Simplified Method of Working with SharePoint Data in Power BI

  1. Karl G. Schneider (@schneika)

    The SharePoint-List-Connector is very helpful with this “Summary Columns”, because Managed Metadata Terms from SharePoint Taxonomy also be resolved in real plaintext Term-Labels! The Original-OData-Connector is not capable to resolve the Term-Labels in Plaintext..

  2. BrettvG

    Very useful post. But I don’t see any column coming back to Power BI Desktop from my SharePoint site called FieldValuesasText. Any thoughts? Lots of googling around and no explanation found though this blog post is quoted all over the place.

  3. BrettvG

    Figured out that FieldValuesAsText and some other columns only show up if you are using the default API level 15. In our case we find that certain lists don’t work with 15 and require setting to 14 in which case those features and columns are not available

    let
    Source = SharePoint.Tables(“https://xxx.com/sites/Home/”, [ApiVersion = 14]),
    RecruitmentTracker = Source{[Name=”RecruitmentTracker”]}[Content]
    in
    RecruitmentTracker

    Question now is why APIVersion 15 does not work on some lists.

  4. jpw@unlimitedviz.com Post author

    Thanks Brett – that’s very interesting. I was never surewhether this was a feature of Power Query itself, or the SharePoint API. This indicates that it’s the SharePoint API.

  5. Marc Anderson

    In the post, you say “Although I typically advise against it, there are valid reasons to report on SharePoint list data directly.” Can you remind me why you say that? How do you prefer to connect to SharePoint list data., especially in the instance that it’s actively changing?

    M.

  6. John White Post author

    The reason I say that is performance. I have a session on getting data out of SharePoint and into SQL, then surfacing it in SharePoint (Through SSRS normally). In it, I have a demo where I query the same 70,000 item list of data – one in SharePoint, the other in SQL server. SQL returns in about 8 seconds, and SharePoint takes about 70. That speaks to inefficiency and load on the server(s).

    Ideally, I like to do exactly that. Treat SharePoint as a data source and use SSIS to move it to a data warehouse. However, if the SP data load is light (under 1000 items), it’s likely not worth it, unless you have a boatload of instances of it…

    Hope that helps!

  7. Ehsan

    Hi John,

    Thanks for a good article. “FieldValuesAsText” work very well. I publish PowerBI report from Desktop, it is working fine.
    The problem I face is, When I set “Daily Schedule refresh” or “Refresh now” from dataset in web interface (as I don’t want to do a manual refresh everyday) It will make all the data blank. See more detail here
    http://community.powerbi.com/t5/Integrations-with-Files-and/Fetch-Live-data-from-Sharepoint-Online-List-and-publish-it-as-a/m-p/181955/highlight/false#M10385
    I have posted the same on your itunity article, as I am not sure which one are you checking.
    Any Idea?
    Thanks

  8. Vitali

    Thank you for the post! I got to the solution before finding your post, which cost me a few hours trying to clean rich text in PowerBI, so I wish I’ve found your blog earlier :). However, you shed light on many of the questions I had.
    Performance-wise, I noticed that FieldValuesAsText takes quite a while even in the Query Editor, much more than the normal time for adding or removing a column. This makes me wonder about the algorithm for this process. Have you noticed anything similar?
    Anyway, thank you again!

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.