In this article, you will learn how to quickly merge data from two Excel tables when the key columns contain no exact matches, for example, when the unique identifier in the first table is the first five characters of the identifier in the second table. I have tested all the solutions in this article in Excel 2013, 2010 and 2007.
Suppose you have two Excel sheets that need to be combined for further analysis. One table contains the prices (column Price) and descriptions (column Beer) of the goods you sell, and the second contains stock availability (column In stock). If you or your colleagues compiled both tables, each must contain at least one key column with unique product identifiers. The description or price of a product may change, but its unique identifier always stays the same.
Difficulties begin when you receive tables from the manufacturer or from other departments of the company. Things get more complicated if a new identifier format is suddenly introduced, or if the stock keeping unit (SKU) codes change slightly, and you are left with the task of combining new and old tables in Excel. Either way, you end up with only a partial match between the key columns, for example, “12345” and “12345-new_suffix”. You understand that this is the same SKU, but the computer is not so quick-witted! Because the match is not exact, ordinary Excel formulas cannot be used directly to combine the data from the two tables.
Worse still, the correspondences can be completely fuzzy: “Some company” in one table can become “CJSC “Some Company”” in another, and “New Company (formerly Some Company)” and “Old Company” may also turn out to be records about the same company. You know this, but how do you explain it to Excel?
There is always a way out. Read on to find the solution!
Note: The solutions described in this article are universal. You can adapt them for use with any standard lookup functions such as VLOOKUP, MATCH, HLOOKUP and so on.
The article covers three cases, each in its own section below: a key column with additional characters, a key split across several columns, and key columns that do not match at all.
A key column in one of the tables contains additional characters
Consider two tables. The columns of the first table contain the item number (SKU), beer name (Beer) and its price (Price). The second table contains the SKU and the number of bottles in stock (In stock). Instead of beer, there can be any product, and the number of columns in real life can be much larger.
In the table whose key contains the additional characters, create an auxiliary column. You can add it at the end of the table, but it is best to insert it immediately to the right of the key column so that it stays visible.
The key in our example is column A with the SKU data, and you need to extract the first 5 characters from it. Let’s add an auxiliary column and name it SKU helper:
- Hover the mouse pointer over the heading of column B; the pointer should change to a downward arrow:
- Right-click the heading and select Insert from the context menu:
- Name the column SKU helper.
- To extract the first 5 characters from the SKU column, enter the following formula in cell B2:
=LEFT(A2,5)
Here A2 is the address of the cell from which we extract characters, and 5 is the number of characters to extract.
- Copy this formula to all cells of the new column.
Done! Now we have key columns with exactly matching values: the SKU helper column in the main table and the SKU column in the table being searched.
Now, using the VLOOKUP function, we get the desired result:
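As a rough sketch of that lookup, assume (hypothetically, since the sheet layout is not spelled out here) that the stock table sits on a sheet named Stock with SKU in column A and In stock in column B, and that both tables have headers in row 1. The first formula below, entered in a free column of the main table, would pull the stock quantity for the helper value in B2. LEFT always returns text, so if the SKUs in the Stock sheet are stored as numbers the exact-match lookup returns #N/A; the second formula converts the helper value to a number first.

```excel
=VLOOKUP(B2,Stock!$A:$B,2,FALSE)
=VLOOKUP(VALUE(B2),Stock!$A:$B,2,FALSE)
```

The last argument, FALSE, requests an exact match, which is what a key column requires.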
Other formulas
- Extract the last X characters: for example, the last 6 characters of the entry “DSFH-164900”. The formula looks like this:
=RIGHT(A2,6)
- Skip the first X characters and extract the next Y characters: for example, you need to extract “0123” from the entry “PREFIX_0123_SUFF”. Here we skip the first 7 characters and extract the next 4, so the extraction starts at position 8. The formula looks like this:
=MID(A2,8,4)
- Extract all characters up to a separator, where the length of the result may vary: for example, you want to extract “123456” and “0123” from the entries “123456-suffix” and “0123-suffix”, respectively. The formula looks like this:
=LEFT(A2,FIND("-",A2)-1)
In short, you can use Excel functions such as LEFT, RIGHT, MID and FIND to extract any part of a composite identifier. If you have any difficulties with this, please contact us, and we will do our best to help you.
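FIND returns a #VALUE! error when the separator is absent, so the separator-based formula fails on entries that have no suffix. Here is a minimal sketch of two variations, assuming the entry is in A2 and the separator is a hyphen. The first returns the whole entry when no hyphen is found, and the second extracts everything after the first hyphen instead.

```excel
=IFERROR(LEFT(A2,FIND("-",A2)-1),A2)
=MID(A2,FIND("-",A2)+1,LEN(A2))
```

IFERROR is available in Excel 2007 and later.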
Data from a key column in the first table is split into two or more columns in the second table
Suppose the table being searched contains a column with identifiers of the form XXXX-YYYY, where XXXX is the code of a product group (mobile phones, televisions, video cameras, cameras) and YYYY is the product code within the group. The main table has two columns: one contains the product group codes (Group), the other contains the product codes (ID). We cannot simply discard the group codes, because the same product code can be repeated in different groups.
Add an auxiliary column to the main table and name it Full ID (column C); inserting a column is described earlier in this article.
In cell C2, enter the following formula:
=CONCATENATE(A2,"-",B2)
Here A2 is the address of the cell containing the group code, "-" is the delimiter, and B2 is the address of the cell containing the product code. Copy the formula to the remaining rows.
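The & operator gives the same result as CONCATENATE in a shorter form. If the codes are stored as numbers, leading zeros that appear only through cell formatting are lost when the parts are joined (a code displayed as 0123 is joined as 123). The second formula below is a sketch that pads both parts with TEXT, assuming every code is exactly four digits long as in the XXXX-YYYY pattern.

```excel
=A2&"-"&B2
=TEXT(A2,"0000")&"-"&TEXT(B2,"0000")
```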
Now it is easy to combine the data from the two tables. We match the Full ID column of the first table against the ID column of the second table. When a match is found, the entries from the Description and Price columns of the second table are added to the first table.
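Here is a sketch of that step, assuming (hypothetically) that the second table is on a sheet named Lookup with ID, Description and Price in columns A, B and C, and that Full ID is in column C of the main table. The first formula returns the description and the second the price. The third is a variant that builds the key inside the lookup, so no helper column is needed.

```excel
=VLOOKUP(C2,Lookup!$A:$C,2,FALSE)
=VLOOKUP(C2,Lookup!$A:$C,3,FALSE)
=VLOOKUP(A2&"-"&B2,Lookup!$A:$C,3,FALSE)
```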
Data in key columns does not match
Here is an example: you own a small store and receive goods from one or more suppliers. Each supplier has its own nomenclature, which differs from yours. As a result, your entry “Case-Ip4S-01” may correspond to the entry “SPK-A1403” in the Excel file received from the supplier. Such discrepancies are arbitrary, and there is no general rule for automatically converting “SPK-A1403” to “Case-Ip4S-01”.
Bad news: The data contained in these two Excel spreadsheets will have to be processed manually in order to be able to merge them later.
Good news: This will only have to be done once, and the resulting auxiliary table can be saved for future use. Then you can merge these tables automatically and save a lot of time in this way 🙂
1. Create an auxiliary table for searching.
Create a new Excel sheet and name it SKU converter. Copy the entire Our.SKU column from the Store sheet to the new sheet and remove duplicates so that only unique values remain.
Add a Supp.SKU column next to it and manually look for matches between the values of Our.SKU and Supp.SKU (the descriptions in the Description column will help with this). It is tedious work, but take comfort in the thought that you only have to do it once 🙂
As a result, we have the following table:
2. Update the main table with the data from the lookup table.
Insert a new column named Supp.SKU into the main table (Store sheet).
Next, use the VLOOKUP function to compare the Store and SKU converter sheets, matching on the Our.SKU column and returning values from the Supp.SKU column.
The Supp.SKU column is now filled in with the original manufacturer’s codes.
Note: If empty cells appear in the Supp.SKU column, take all the SKU codes corresponding to those empty cells, add them to the SKU converter table and find the corresponding codes in the supplier’s table. After that, repeat step 2.
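Step 2 might look like this, assuming Our.SKU is in column A of the Store sheet and the SKU converter sheet holds Our.SKU in column A and Supp.SKU in column B (these column positions are an assumption). Wrapping the lookup in IFERROR leaves the cell empty when a code is missing from the converter, which makes the gaps described in the note easy to filter.

```excel
=IFERROR(VLOOKUP(A2,'SKU converter'!$A:$B,2,FALSE),"")
```

The single quotes around the sheet name are required because the name contains a space.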
3. Transfer data from the lookup table to the main table
Our main table has a key column that exactly matches the elements of the lookup table, so now this task will not be difficult 🙂
Using the VLOOKUP function, combine the data from the Store sheet with the data from the Wholesale Supplier 1 sheet, matching on the Supp.SKU column.
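As a sketch, assume the supplier sheet lists its SKU in column A and the wholesale price in column C, and that Supp.SKU is in column B of the Store sheet (all three positions are hypothetical). The first formula uses VLOOKUP. The second is an INDEX and MATCH equivalent; that pattern also works when the price column sits to the left of the SKU column, which VLOOKUP cannot handle.

```excel
=VLOOKUP(B2,'Wholesale Supplier 1'!$A:$C,3,FALSE)
=INDEX('Wholesale Supplier 1'!$C:$C,MATCH(B2,'Wholesale Supplier 1'!$A:$A,0))
```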
Here is an example of the updated data in the Wholesale Price column:
It’s simple, isn’t it? Ask your questions in the comments to the article, I will try to answer as soon as possible.