This post is the third in a series exploring Power BI and complex data types in SharePoint. The first post explores working with multi-value columns, and the second post covers working with person fields. In this one, we’ll explore some of the nuances of working with managed metadata fields
Managed metadata fields in SharePoint are implemented as a special case of a lookup field. As managed metadata is used in SharePoint lists, a hidden list in the root of the site, TaxonomyHiddenList is populated with the values used. The Power BI SharePoint list connector is aware of this, and it provides helpers to make it relatively easy to get the value of these fields. It is also possible to get retrieve some of the extended properties of these fields, specifically all the language variants of the managed metadata values. We’ll examine several approaches to extracting managed metadata information.
Consider the following list that contains a managed metadata field named “City”:
This view displays the value of the managed metadata field in the language of the site. If multiple language values have been defined for the term, the appropriate one will be used. While not shown in the view, these multiple language values, along with other attributes of the term are available to Power BI as needed. Depending on requirements, there are several ways to access the terms values and attributes in Power BI. To do so, the report must be built using Power BI Desktop.
Loading the Data
We first launch Power BI Desktop, select “Get Data” and then choose SharePoint Online list (if connecting to SharePoint Online) or SharePoint List (if using SharePoint Server). We are then prompted for the URL of the SharePoint Site. The dialog is titled SharePoint lists, but the value is the URL of the site, NOT the list itself. Once this is entered we are prompted for credentials if we haven’t connected to this site before. After entering credentials, we can select the list that we want to report on. In our case, it’s named “Listings”. We select it, and then click on the Edit button.
Once the data loads in, one of the first things that you’ll notice is that there are a lot of columns to choose from, and it’s a good idea to remove the data that you don’t need. In this case, we can remove the ContentTypeId column and everything to the right of it, with two important exceptions We want to keep the “FieldValuesAsText” column (we’ll come back to that shortly). Our simplest requirement will be to extract the value of the metadata column in the default language of the site, the same way that it is represented in the SharePoint list view. We can accomplish this using the FieldValuesAsText column.
Extracting the Term Value
One thing that you will notice right away is that he more simple column types like “Title” show their value directly in the Query editor. In our case, our managed metadata column, “City” shows a clickable “Record” value for all rows. Clearly, “Record” is not what we are looking for, but this is how Power BI represents a one-to-one relationship. In this case, the relationship is between the source data row, and the row representing the managed metadata term. We will explore using this column in the next section, but if all we want is the value of the term, we can have Power Query (the Power BI Query editor is really a user interface for Power Query) automatically extract it for us using the “FieldValuesAsText” column.
We scroll right and select the expander icon for the “FieldValuesAsText” column, then deselect all available fields except the “City” column, as well as the “Use original column name as prefix” option.
We then select OK, and a new column “City.1” is substituted for the original column with the value of the term.
The reason that “.1” is added to the column name is that we already have a column named City. If the value is all that is needed, it can safely be removed, and this column renamed. However, if additional term data is required, a more complex approach is required.
Extracting the Managed Metadata Properties
As noted above, the “City” column links to the managed metadata term. It can be used to get at the underlying term properties by clicking on the column expander to the right of the column title and selecting the TermGuid column.
The Term column will display the id of the label of the managed metadata field, which is not what we need here. TermGuid will ultimately link to the term properties that we need, but when we click OK, we can see that the resulting values are simply a collection of GUIDs. Since we will use these values to link to the data we need, we must first set their data type. We do this by selecting the column, then the “Transform” tab in the query editor and selecting the “Text” data type.
At this point, we are ready to link to our terms list.
Linking to the Hidden Taxonomy List
As mentioned above, SharePoint adds items to a hidden list whenever managed metadata terms are used. This list, “TaxonomyHiddenList” is used for all lists in the site. We therefore need to add this list to our query. To do so, we repeat the procedure used above to get our listings data, substituting “TaxonomyHiddenList” for “Listings”.
As mentioned above, this list will contain values for all instances of managed metadata in the site, not just the field that we’re after. While not strictly necessary, it’s a good idea to filter out the values that are unnecessary, particularly if we want to use the term as a slicer. If we don’t all sorts of values could appear that make no sense to our report. In our example, we have listings with two cities, “Guelph” and “Elora” but there is also a value for “North America” that comes from another managed metadata instance. Examining the data we can see that the “IdForTermSet” column uniquely identifies the “City” column. Filtering on the GUID for our column will remove all extraneous data.
The Unique ID of the term in question is stored in the “IdForTerm” column. The term’s value is stored in the “Term” column, and the hierarchical path to the term is stored in the “Path” column. These columns will hold the value of the term in its default language. Scrolling right, we can see that there are many columns titled “Termxxxx” and “Pathxxxx”. These correspond to the term’s value in various languages, the language being identified by the xxxx value – it is the code of the language in question. For example, the language code of French is 1036, so if we want the French values for the term, we will need to use the columns Term1036 and Path1036.
If the term is not in a hierarchy, both the Term and the Path values will be identical. If values have not been defined for he additional languages, then the default language value will be returned. It is important to identify the data needed, and to remove all redundant column data at this point to avoid bloating the data model. In our case, we only need the default language, and there is no hierarchy, so we remove all columns with “Path” in the name, and all columns with “Termxxxx”. In addition, we can remove all the extended columns, and system columns. In the end, in our case we are left with the GUID for the term (named IdForTerm), and the term itself in the default language.
The final step is to change the data type of the “IdForTerm” and the “Term” columns (and any other columns necessary) to Text, using the procedure for “TermGuid” above. Given that “Term” is generic, and we have filtered ours down to the city terms, we also rename the column to “City”.
When ready, we select the “Close and Apply” button from the Query Editor Ribbon. At this point, we have two tables in the model, Listings and Taxonomy (renamed from TaxonomyHiiddenList). We then select the relationship editor tab. By default, Power BI may assume a relationship for us. In our case, we retained the id field from the taxonomy list, so it established a relationship between the id fields in the two tables, which is incorrect. We need to select the relationship by clicking on the line between the two tables, and then pressing delete. Once deleted, we need to establish a relationship between “TermGuid” in the Listings table, and “IdForTerm” in the Taxonomy table. We establish this relationship by dragging one column onto the other. Once established, we double click on the relationship line to set the value of “Cross filter direction” to “Both”.
We can now return to the design pane, add a table visual, and add columns from both tables. In this way, we can slice a listing report on the value of the managed metadata column, in any language that we have defined for it.
Clearly, it this case, if we simply want to get the default language value of the term, the FieldValuesAsText approach is far simpler, but linking to the hidden taxonomy list makes it possible to access all available attributes of the managed metadata term.