Recently, a friend of mine approached me with a request to help with the generation of all possible phrases consisting of a set of given words. Problems of this kind can arise when compiling lists of keywords and phrases for online advertising and SEO promotion, when you need to go through all the possible permutations of words in a search query:
In mathematics, this operation is called Cartesian product. The official definition is as follows: the Cartesian product of sets A and B is the set of all pairs, the first component of which belongs to set A, and the second component belongs to set B. Moreover, the elements of sets can be both numbers and text.
Translated into human language, this means that if in set A we have, for example, the words “white” and “red”, and in set B “BMW” and “Mercedes”, then after the Cartesian product of these two sets we get on the output is the set of all possible variants of phrases, made up of the words of both lists:
- white bmw
- red bmw
- white Mercedes
- red mercedes
… i.e. just what we need. Let’s look at a couple of ways to solve this task in Excel.
Method 1. Formulas
Let’s start with formulas. Let’s assume that as initial data we have three lists of original words in columns A, B and C, respectively, and the number of elements in each list can vary:
First, let’s make three columns with indices, i.e. ordinal numbers of words from each list in all possible combinations. The first row of units (E2:G2) will be entered manually, and for the rest we will use the following formula:
The logic here is simple: if the index in the superior previous cell has already reached the end of the list, i.e. is equal to the number of elements in the list calculated by the function COUNT (COUNTA), then we restart the numbering. Otherwise, we increase the index by 1. Pay special attention to the clever fixing of the ranges with dollar signs ($) so that you can copy the formula down and to the right.
Now that we have the ordinal numbers of the words we need from each list, we can extract the words themselves using the function INDEX (INDEX) into three separate columns:
If you have not come across this function in your work before, then I strongly advise you to study it at least diagonally – it helps out in many situations and is useful no less (and even more!) VPR (VLOOKUP).
Well, after that, it remains only to glue the resulting fragments line by line using the concatenation symbol (&):
… or (if you have the latest version of Excel) with the handy function COMBINE (TEXTJOIN), which can glue the entire contents of the specified cells through a given separator character (space):
Method 2. Through Power Query
Power Query is a powerful add-in for Microsoft Excel that performs two main tasks: 1. loading data into Excel from almost any external source, and 2. all kinds of transformations of loaded tables. Power Query is already built into Excel 2016-2019, and for Excel 2010-2013 it is installed as a separate add-in (you can download it from the official Microsoft website for free). If you have not yet started using Power Query in your work, then it’s time to think about it, because transformations like the ones described above are done there easily and naturally, in just a couple of movements.
First, let’s load the source lists as separate queries in Power Query. To do this, for each table, perform the following steps:
- Let’s turn tables into “smart” ones with a button Format as a table tab Home (Home — Format as Table) or keyboard shortcut Ctrl+T. Each table will automatically be given a name Table1,2,3…, which, however, can be changed if desired on the tab Constructor (Design).
- Having set the active cell in the table, press the button From the table (From Table) tab Data (Date) or on the tab Power Query (if you have it installed as a separate add-in for Excel 2010-2013).
- In the query editor window that opens, select the command Home — Close and Load — Close and Load in… (Home — Close&Load — Close&Load to..) and then the option Just create a connection (Create only connection). This will leave the loaded table in memory and allow it to be accessed in the future.
If you do everything correctly, then the output in the right panel should be three requests in the mode Connection only with our table names:
Now right-click on the first query and select the command Link (Reference)to make an updatable copy of it, and then add an additional column to the data via the command Adding a column ž – Custom column (Add Column - Custom Column). In the formula input window, enter the name of the new column (for example, Fragment2) and an extremely simple expression as a formula:
=Table2
… i.e., in other words, the name of the second query:
After clicking on OK we will see a new column, in each cell of which there will be a nested table with phrases from the second table (you can see the contents of these tables if you click in the background of the cell next to the word Table):
It remains to expand all the contents of these nested tables using the button with double arrows in the header of the resulting column and unchecking Use original column name as prefix (Use original column name as prefix):
… and we get all possible combinations of elements from the first two sets:
Further, everything is similar. Add another calculated column with the formula:
=Table3
…, and then expand the nested tables again – and now we already have all the possible options for permuting words from the three sets, respectively:
It remains to select all three columns from left to right, holding Ctrl, and concatenate their contents separated by spaces using the command Merge columns (Merge Columns) from the tab Transformation (Transform):
The resulting results can be unloaded back onto the sheet with the already familiar command Home — Close and Load — Close and Load in… (Home — Close&Load — Close&Load to..):
If in the future something changes in our source tables with fragments, then it will be enough just to update the generated query by right-clicking on the resulting table and choosing the command Update & Save (Refresh) or by pressing the keyboard shortcut Ctrl+Alt+F5.
- What is Power Query, Power Pivot, Power Map and Power BI and why do they need an Excel user
- Creating a Gantt Chart in Power Query
- 5 ways to use the INDEX function