Splitting stuck-together text with the FILTERXML function

Recently, we looked at using the FILTERXML function to import XML data from the Internet, which is the main task this function is intended for. Along the way, however, another unexpected and elegant use of it surfaced: quickly splitting text that is stuck together in one cell into separate cells.

Let’s say we have a data column like this:

[Screenshot: the source column, with each address stuck together in a single cell]

Of course, for convenience, we would like to split it into separate columns: company name, city, street, house number. There are several ways to do this:

  • Use Text to Columns on the Data tab (Data > Text to Columns) and go through the three steps of the Convert Text to Columns Wizard. But if the data changes tomorrow, you will have to repeat the whole process.
  • Load the data into Power Query, split it there, and load the result back to the sheet. When the data changes, you only need to refresh the query (which is already easier).
  • If the result has to update on the fly, you can write some very complex formulas that find the commas and extract the text between them.

Or you can do it more elegantly and use the FILTERXML function. But what does that function have to do with splitting text?

The FILTERXML function takes XML code as its first argument (text marked up with special tags and attributes) and parses it into its components, extracting the data fragments we need. Its second argument is an XPath expression that specifies which elements to return. XML code usually looks something like this:

[Screenshot: sample XML code with manager, name and profit tags]

In XML, each data element must be enclosed in tags. A tag is some text (in the example above: manager, name, profit) enclosed in angle brackets. Tags always come in pairs: an opening tag and a closing tag (with a slash added at the beginning).

The FILTERXML function can easily extract the contents of all the tags we need, for example the names of all managers, and (most importantly) return them all at once as one list. So our task is to add tags to the source text, turning it into XML code that FILTERXML can then parse.
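To make the idea of "extract the contents of all tags with a given name" concrete, here is a minimal Python sketch of the same operation outside Excel. It is only an illustration: the XML document, the root tag managers and the sample values are made up, and only the tag names manager, name and profit come from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML; only the tag names manager, name and profit come from the article.
xml_text = """
<managers>
    <manager><name>Anna</name><profit>1200</profit></manager>
    <manager><name>Boris</name><profit>950</profit></manager>
    <manager><name>Clara</name><profit>1430</profit></manager>
</managers>
""".strip()

root = ET.fromstring(xml_text)

# ".//name" selects every <name> element at any depth,
# which yields all manager names at once as one list.
names = [element.text for element in root.findall(".//name")]
print(names)
```

The path ".//name" plays the role that the XPath argument plays in FILTERXML: it names the elements whose contents should be returned together.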

If we take the first address from our list as an example, then we will need to turn it into this construction:

[Screenshot: the first address marked up with t and s tags, one element per line]

I named the outer tag that opens and closes the whole text t, and the tags framing each element s, but you can use any other names; it does not matter.

If we remove the indents and line breaks from this code (they are entirely optional and were added only for clarity), the whole thing turns into a single line:

[Screenshot: the same XML written as a single line]

And this line can be obtained relatively easily from the source address: replace each comma with a closing and opening tag pair using the SUBSTITUTE function, and attach the outer opening and closing tags to the beginning and end with the & operator:

[Screenshot: the formula built from FILTERXML and SUBSTITUTE, and its result]
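The original formula is not reproduced here, but from the description it would presumably look something like =FILTERXML("<t><s>"&SUBSTITUTE(A2,",","</s><s>")&"</s></t>","//s"), where the cell A2 and the comma as argument separator are assumptions (some locales use a semicolon). The same logic can be sketched in Python to show what each step does. The function name split_via_xml and the sample address are hypothetical, and the escaping and trimming steps are additions of this sketch, not part of the formula described above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def split_via_xml(text, delimiter=","):
    """Split text by turning it into <t><s>..</s><s>..</s></t> and reading the <s> elements."""
    # Escape &, < and > so that the pieces themselves cannot break the markup.
    safe = escape(text)
    # Same idea as SUBSTITUTE: every delimiter becomes a closing + opening tag pair,
    # and the outer tags are glued on at the beginning and end.
    xml_text = "<t><s>" + safe.replace(delimiter, "</s><s>") + "</s></t>"
    root = ET.fromstring(xml_text)
    # Collect the text of every <s> element; strip() drops the space after each comma.
    return [(element.text or "").strip() for element in root.findall("s")]


address = "Acme Ltd, Springfield, Main Street, 12"  # made-up sample data
parts = split_via_xml(address)
print(parts)
```

The escaping step matters because a character such as & inside the source text would otherwise make the generated string invalid XML, and the parser would reject it.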

To lay the resulting range out horizontally, we wrap our formula in the standard TRANSPOSE function:

[Screenshot: the same formula wrapped in TRANSPOSE, with the result laid out in a row]

An important feature of this whole construction is that in the newer versions, Office 2021 and Office 365 with dynamic array support, no special steps are needed to enter it: just type the formula and press Enter, and it occupies as many cells as it needs. In earlier versions without dynamic arrays, you first need to select a sufficient number of empty cells (with a margin, if you like), and after typing the formula press Ctrl+Shift+Enter to enter it as an array formula.

A similar trick works for splitting text that is stuck together in one cell with line breaks:

[Screenshot: text separated by line breaks, split with the same formula]

The only difference from the previous example is that instead of a comma we replace the invisible line break character (entered with Alt+Enter), which can be specified in the formula using the CHAR function with code 10.
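The line-break variant can be sketched the same way in Python, where chr(10) produces the same character that CHAR(10) does in Excel. The cell contents below are made-up sample data, and the sketch assumes the text contains no characters such as & or < that would need escaping.

```python
import xml.etree.ElementTree as ET

LINE_BREAK = chr(10)  # character code 10, the line break that Alt+Enter inserts

# Made-up cell contents: three values separated by line breaks.
cell = "Acme Ltd" + LINE_BREAK + "Springfield" + LINE_BREAK + "Main Street"

# Replace each line break with a closing + opening tag pair, then add the outer tags.
xml_text = "<t><s>" + cell.replace(LINE_BREAK, "</s><s>") + "</s></t>"

# Read back the text of every <s> element.
lines = [element.text for element in ET.fromstring(xml_text).findall("s")]
print(lines)
```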

  • The subtleties of working with line breaks (Alt + Enter) in Excel
  • Divide text by columns in Excel
  • Replacing text with SUBSTITUTE
