Web Data Connection

Power Query – OData, Web and VSTS

By Simon-Pierre Morin, eng.
(intermediate level)

As part of the Agile tour Québec 2023, and Eastern Township Power BI community, I presented a summary on solutions to be able to connect to Web data sources. In this blog, I will present the technical detail of those connectors and display advantages and warning points.

OData.Feed

As the name implies, this connector read a feed OData. This data feed is still quite popular for large data sets, like data logs and tooling instruments historical and real-time data. There are also several services using analytic cubes or similar technologies that also return the data as a feed instead of a JSON as HTTP-GET response.

One of the main advantages of the feeds is that the response is naturally paginated. The response only receives the first few lines to save on resources and whenever we want to next values, a mechanism send the next page. Be aware that if your system ask for all results at once, you will lose that behaviour. Internally, Power Query and Power BI keep in cache the results and will query the feed only to receive the next result page.

The call syntax is really simple

OData.Feed(queryURL, header, options)

Even if it is optional, many services require the implementation specification as this:

OData.Feed(queryURL, null, [Implementation="2.0"])

The URL structure of a feed is the following:

  • protocol
  • domain
  • OData and version path
  • feed
  • query

So for the following query

https://myserver.ca/_odata/v4.0-preview/Employee?$select=id,firstname,lastname,email
StructureURL Part
ProtocolHTTPS
Domainmyserver.ca
OData Path/_odata/v4.0-preview/
FeedEmployee
Query$select=id,firstname,lastname,email
OData Structure and URL Part Association

In this example, the OData path was simplified. Some servers also have a path between the domain and the actual OData service noted by the _odata in the URL. Note that OData exist in many versions and each service can propose one or many versions. In general, the version in used for the query is specified in the URL (but some exceptions do exist).

It’s important to notice that the format of the query part cannot be put in the Query option, even if they share the name. Also take in consideration that each feed will need a connexion authentication, but it’s per feed, not per query. Hence, if you have multiple queries from the same feed, Power Query will ask only once for your credentials. On the other hand, if you have multiple feeds, each of them will request for the credentials.

For more information on how to use, check the official documentation.

Advantages

  • Quick data acquisition
  • Seamless automated pagination

Warning Points

  • Tendency to multiply the number of connexions

Web.Contents

This generic connector take care of REST API. Versatile, it allows to invoke any GET or POST request. Also, with the “RelativePath” property, it allows to have a single connexion per base URL (the fix part of our queries).

Be aware that your base URL must use a single authority (including anonymous connexions). For example, if you try to connect to a Jira Cloud server and that the address follow that model https://myorg.com/jira/Project1, it is possible that your company rules limit the credentials not on the domain, but rather to a specific path. In that case, your base URL may not be https://myorg.com, but rather https://myorg.com/jira or even include the project name. Here are some usage examples using Web.Contents returning a search results for “Power Query” using a GET query type.

  1. Direct method
Web.Content("https://bing.com/search?q=Power+Query")
  1. Using only RelativePath
Web.Contents(
    "https://www.bing.com",
    [
        RelativePath = "search?q=Power+Query"
    ]
)
  1. With RelativePath and the Query option
Web.Contents(
    "https://www.bing.com",
    [
        RelativePath = "search",
        Query = [q = "Power Query"]
    ]
)

And here is an example using a POST query.

let
    headers = [#"Content-Type" = "application/json"],
    postData = Json.FromValue(
        Json.Document(
           "{
               ""queryInfo"": {
               ""problemType"": ""Polynomial"",
               ""stepTypes"": [
                  ""quadratic expression with fraction with 1 variables"",
                   ""1 variables""
               ],
               ""market"": ""en""
             }
           }"
      )
   ),
    relPath = "cameraexp/api/v1/generateCustomBingAnswers",
    response = Web.Contents(
        "https://mathsolver.microsoft.com/ ",
        [
            Headers = headers,
            Content = postData,
            RelativePath = relPath
        ]
    ),
    jsonResponse = Json.Document(response)
in
    jsonResponse

You may notice that GET or POST is never mentioned in the Web.Contents function. The Content property is the trigger that let the function knows we are performing a POST. You can also see that RelativePath is still usable in a POST. Even if that property is optional, it is highly recommended to use it if multiple queries are done to the same server to prevent multiple connexions to the same server.

For more information on how to use, check the official documentation.

It is good to know that it is also possible to invoke OData feeds by using the GET method. This technique allows to limit the connexion number to the same server but comes with a performance counterpart. It is also important to know that the responses will still be paginated, but would have to be managed manually using the “nextLink” field (that can be managed using List.Generate).

The technique above has to be used with caution by evaluating pros and cons. In some cases, OData.Feed is still a better option for performance matter. But if your data feeds a rather small data sets and that you don’t want to manage multiple connector credentials, here how to proceed.

Let’s use RelateivePath for the feed and query part of the URL. Take in consideration we want to perform this query:

https://monserveur.ca/path/_odata/v4.0-preview/Employee?$select=id,firstname,lastname,email

For this example, let’s consider this URL part as our base URL : https://monserveur.ca/path/_odata/v4.0-preview/ but we could also consider to stop right after path/. The important thing is to not include the feed part nor the query part. It’s also important to be sure that the concatenation of the base URL and the RelativePath give us back the full URL. Our feed is then Employee and the query is $select=id,firstname,lastname,email.

Since the OData query structure don’t allow us to use the Query option, we must put it all in RelativePath. Important to note though that if we receive a nextLink, the content of the field will be a complete URL. To avoid receiving dynamic connexion messages, we will also need to remove the base URL from that field to send it to the next RelativePath.

URL = " https://monserveur.ca/path/_odata/v4.0-preview/",
feed = "Employee",
query = "?$select=id,firstname,lastname,email ",
FeedContentFix = List.Generate(
    () => Web.Content(url, [RelativePath=feed&query]),
    each Record.HasFields(_, "value"),
    each (
        if (
            Record.HasFields(_, "@odata.nextLink")
        ) then (
            let
                nextstep = _[#"@odata.nextLink"],
                nextquery = Text.Range(nextstep,Text.PositionOf(nextstep,feed))
             in
                 Web.Content(url, [RelativePath=nextquery]),
        ) else (
             [context="EoF"]
        )
    ),
    each _[value]
),

Advantages

  • Allow the use of RelativePath
  • Can be use with most API, including OData.

Warning Points

  • Manual paging management (> 10 000)

VSTS.AccountContents

This is a Microsoft specific API connector. It’s basically the same as Web.Contents that add a “Version” property that is not supported by the latest. (that property is optional, but is present in the official Microsoft wrappers that allow us to connect to wiql resources in Azure DevOps.)

Important note. The whole VSTS namespace (including both AccountContents and Feed) will detect the organization (or subdomain) without requiring the base URL trick and will only ask one credential per organization.

Sadly, it is impossible to use VSTS in a data flow on Power BI Service yet. Nonetheless, it is usable directly in all other Power Query platform, either Excel or Power BI, and it will publish without any problems.

Something else that could be slightly irritating is that all VSTS connexion will be called VSTS without any further details. In the case you are connecting a report to two or more distinct VSTS sources, they will all have the same name without any possibility to distinguish them in the connexion management. Same goes with the workspace. If you have multiple reports that use these connectors, they will get harder to tell apart in the connexion and gateway manager.

Advantages

  • Same core advantages as Web.Contents
  • Single authentication for the whole ADO organization

Warning Points

  • Specific Microsoft services only
  • Single name for all connexions
  • Doesn’t work in data flow

VSTS.Feed

This is a Microsoft specific API connector. It’s basically the same as OData.Feed with the exact same specifications.

The same way as VSTS.AccountContents, this connector detects the Azure DevOps organization and will create a single connexion for the whole organization or subdomain (including those already existing for VSTS.AccountContents). It also has the same disadvantage as this latest.

Advantages

  • Same core advantages as OData.Feed
  • Single authentication for the whole ADO organization

Warning Points

  • Specific Microsoft services only
  • Single name for all connexions
  • Doesn’t work in data flow

Still curious?

For those who want to learn even more, follow the blog series called “Series – An agile project dashboard with Power BI” by Simon-Pierre Morin and Mathieu Boisvert.

Similar Posts