Build tables with different headers from multiple books

Problem statement

We have several files (four in our example, any number in general) in a single folder named Reports. The files have the following properties:

  • The sheet with the data we need is always called Photos, but it can be in any position in the workbook.
  • Besides the Photos sheet, each workbook may contain other sheets.
  • The data tables have different numbers of rows and may start on different rows of the worksheet.
  • The same column may be named differently in different tables (for example, Quantity in one file and Qty in another).
  • The columns may appear in a different order in each table.

Task: collect the sales data from the Photos sheet of every file into one common table, so that a PivotTable or any other analysis can then be built on it.

Step 1. Preparing a lookup table of column names

The first thing to do is prepare a lookup table that lists every possible variant of each column name alongside its correct, standardized name.

Convert this list into a dynamic "smart" table with the Format as Table button on the Home tab (Home > Format as Table) or the keyboard shortcut Ctrl+T, then load it into Power Query with the command Data > From Table/Range. In recent versions of Excel this command has been renamed From Sheet.

In the Power Query editor window, delete the automatically added Changed Type step as usual, then add a new step in its place by clicking the fx button in the formula bar (if the formula bar is not visible, enable it on the View tab) and enter the following formula in M, Power Query's built-in language:

=Table.ToRows(Source)

This command converts the lookup table loaded in the previous step, Source, into a list of nested lists (List). Each nested list is a pair of values, old name and new name, taken from one row of the table.

We will need this data structure a little later, when we rename the headers of all the loaded tables in bulk.
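To make the shape of that result concrete, here is a minimal sketch in M. The inline two-column table stands in for the real lookup table loaded from the sheet; its column names and values are made up for illustration.

```powerquery
let
    // Stand-in for the lookup table (hypothetical sample values)
    Source = #table(
        {"Old", "New"},
        {
            {"Qty", "Quantity"},
            {"Item", "Product"}
        }
    ),
    // Each table row becomes a nested list: {old name, new name}
    Pairs = Table.ToRows(Source)
in
    Pairs
```

The result is a list of two-item lists, {{"Qty", "Quantity"}, {"Item", "Product"}}, which is the format Table.RenameColumns accepts for its renaming pairs.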

After completing the conversion, choose Home > Close & Load > Close & Load To... and select the import type Only Create Connection, then go back to Excel. Take note of the query's name, since the renaming formula used later refers to it; in this example the query is assumed to be called Headers.

Step 2. Loading everything from all files as is

Now let's load the contents of all our files from the folder, for now as is. Choose Data > Get Data > From File > From Folder and then select the folder that contains the source workbooks.

In the preview window, click Transform Data (called Edit in older versions).

Then expand the contents of all the loaded files (Binary) by clicking the double-arrow button in the header of the Content column.

Using the first file (Vostok.xlsx) as a sample, Power Query will ask which sheet to take from each workbook. Choose Photos and click OK.

Behind the scenes, several things then happen that are not obvious to the user. Their results are clearly visible in the queries panel on the left:

  1. Power Query takes the first file in the folder (Vostok.xlsx in our case; see the Sample File query) as a sample and imports its contents by creating the Transform Sample File query. This query has a few simple steps such as Source (file access), Navigation (sheet selection) and possibly promoting headers. It can load data from only one specific file, Vostok.xlsx.
  2. Based on this query, an associated function named Transform File is created (marked with the characteristic fx icon). In it the source file is no longer a constant but a variable, that is, a parameter. This function can therefore extract data from any workbook we pass to it as an argument.
  3. The function is applied in turn to each file (Binary) in the Content column. The Invoked Custom Function step in our query is responsible for this: it adds a Transform File column to the list of files, holding the import result for each workbook.
  4. Extra columns are removed.
  5. The contents of the nested tables are expanded (the Expanded Table Column step), and we see the final result of collecting data from all the workbooks.
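The steps above can be approximated in a single hand-written query. This is only a sketch of the idea, not the code Power Query actually generates: the folder path C:\Reports, the step names, the Data column name and the inline TransformFile function are all assumptions made for illustration.

```powerquery
let
    // Hypothetical folder path; replace with the real location of the Reports folder
    FolderPath = "C:\Reports",

    // Inline stand-in for the generated function: takes a file's binary content
    // and returns the raw contents of its Photos sheet
    TransformFile = (fileContent as binary) as table =>
        let
            Workbook = Excel.Workbook(fileContent, null, true),
            Sheet = Workbook{[Item = "Photos", Kind = "Sheet"]}[Data]
        in
            Sheet,

    // List every file in the folder, skipping hidden files
    Files = Folder.Files(FolderPath),
    Visible = Table.SelectRows(Files, each [Attributes]?[Hidden]? <> true),

    // Apply the function to each file's binary content
    Invoked = Table.AddColumn(Visible, "Data", each TransformFile([Content])),

    // Keep only the file name and the imported table
    Kept = Table.SelectColumns(Invoked, {"Name", "Data"}),

    // Expand the nested tables, using the first file's columns as the column list
    Expanded = Table.ExpandTableColumn(Kept, "Data", Table.ColumnNames(Kept{0}[Data]))
in
    Expanded
```

Because the headers have not been promoted yet, the expanded columns carry generic names such as Column1 and Column2, and the header rows of each file remain mixed in with the data.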

Step 3. Polishing

The result of the previous step shows that assembling the data "as is" produced a poor-quality table:

  • The columns are mixed up.
  • There are many extra rows (empty ones and others).
  • The table headers are not recognized as headers and are mixed in with the data.

All of these problems are easy to fix: just adjust the Transform Sample File query. Every change we make to it automatically carries over to the associated Transform File function, which means it will be applied when data is imported from each file.

Open the Transform Sample File query and add steps that filter out the unnecessary rows (for example, by the column Column2) and promote the headers with the Use First Row as Headers button. The table will look much better.
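After these adjustments, the sample query might look roughly like the sketch below. The parameter name Parameter1, the step names, and the filter condition (dropping rows where Column2 is empty) are assumptions; the right filter depends on what the unnecessary rows look like in your files.

```powerquery
let
    // Parameter1 is assumed to be the binary file parameter created by Power Query
    Source = Excel.Workbook(Parameter1, null, true),

    // Navigate to the Photos sheet wherever it sits in the workbook
    Photos_Sheet = Source{[Item = "Photos", Kind = "Sheet"]}[Data],

    // Assumed filter: drop rows that have nothing in the second column
    #"Filtered Rows" = Table.SelectRows(Photos_Sheet, each [Column2] <> null),

    // Turn the first remaining row into column headers
    #"Promoted Headers" = Table.PromoteHeaders(#"Filtered Rows", [PromoteAllScalars = true])
in
    #"Promoted Headers"
```

Filtering before promoting headers matters here: the tables start on different rows, so the blank or title rows above each table have to be removed before the first row can serve as the header.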

For columns from different files to line up under each other automatically, they must have identical names. This bulk renaming can be done with the lookup list created earlier and a single line of M code. Click the fx button in the formula bar again and add the renaming function:

= Table.RenameColumns(#"Promoted Headers", Headers, MissingField.Ignore)

This function takes the table from the previous step, Promoted Headers (use the actual name of your previous step if it differs), and renames all of its columns according to the nested lookup list Headers. The third argument, MissingField.Ignore, prevents an error for headers that are present in the lookup list but absent from the table.
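The effect of MissingField.Ignore can be tried on a tiny self-contained example. The lookup pairs and the sample table below are invented for illustration and are defined inline rather than taken from the Headers query.

```powerquery
let
    // Hypothetical lookup list: two variants map to Quantity, one to Product
    Headers = {
        {"Qty", "Quantity"},
        {"Cnt", "Quantity"},
        {"Item", "Product"}
    },

    // Hypothetical table from one file; it has no "Cnt" column
    Sample = #table({"Item", "Qty"}, {{"Apples", 10}}),

    // Pairs whose old name is not found in the table are skipped instead of raising an error
    Renamed = Table.RenameColumns(Sample, Headers, MissingField.Ignore)
in
    Renamed
```

The resulting table should have the columns Product and Quantity. Without the third argument, the missing Cnt column would cause the step to fail.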

That is really all there is to it.

Returning to the Reports query, we see a completely different picture, much cleaner than before.
