multiple source connection

Power Query - Data grouping

As part of theAgile tour Québec 2023I presented solutions to the problems associated with Excel reports as presented by Mathieu Boisvert.

In this presentation, I mentioned that one of the main problems was that information was not centralized. Indeed, in order to track projects properly, it is often necessary to connect to an ERP, project tracking software (Jira, Azure DevOps or other), pipelines, a version management system and so on. What's more, these different sources aren't always compatible in terms of formats (dates, times, numbers, text encoding, regional settings, languages), not to mention the fact that some fields are free text (that the user can freely enter whatever they want), which adds another level of error source. What's more, manual extraction management in Excel tends to use out-of-date data, and requires manual redoing every time the report is updated.

Acquisition with Power Query

The solution is to use a data model and acquire the dataset automatically and centrally using Power Query. Other posts will follow to discuss optimization with Power BI, but bear in mind that you can use Power Query in both Excel and Power BI. So it's possible to make the transition in more than one step if your goal is to eventually use Power BI.

Power Query uses the M language, a programming language that can be used to connect to virtually any data source. Over a hundred sources are already available, not counting generic connectors. Once the connection has been established, you can :

  • ask Power Query to perform operations;
  • format data so that a date is a date;
  • indicate how to handle errors;
  • relate data (which will be complemented by the semantic model's relationship tools)
  • Add local operations with DAX (Power BI) or VBA (Excel)

In the case of known connectors, format transformation is generally handled by the server before receiving the data, but it is always possible to modify what has been received.

Data relationship models make it possible to create relationships that would otherwise be impossible to manage for these disparate sources.

By putting your data into a dataset model and giving Power Query your data transformation logic (we could even incorporate some business intelligence), you'll no longer have to do these operations manually. What's more, with a simple click of the "Refresh" button, your data will be up to date, saving many hours of report management.

In the next few posts, I'll explain in more technical terms how to optimize connections to REST API and OData sources, how to use data feeds to improve overall report performance, as well as M-tricks for creating a connection wrapper to limit the number of authentication requests. I'll also give you the pros and cons of each method so you can make an informed decision about when to use one or other of these tricks.

Still curious?

For those who wish to continue and learn more, I invite them to follow the series entitled " Series - Creating an agile dashboard with Power BI "by Simon-Pierre Morin and Mathieu Boisvert.