Using the VLOOKUP function in Excel: Fuzzy Match

We recently dedicated an article to one of the most useful Excel functions, VLOOKUP, and showed how it can be used to pull the required information from a database into a worksheet cell. We also mentioned that VLOOKUP has two use cases and that only one of them deals with database queries. In this article, you will learn the other, lesser-known way to use VLOOKUP in Excel.

If you have not done so yet, be sure to read the previous article about VLOOKUP, because everything below assumes that you are already familiar with the principles described there.

When VLOOKUP works with a database, it is passed a unique identifier that pinpoints the record we want to find (for example, a product code or a customer identification number). This identifier must be present in the database; otherwise VLOOKUP reports an error. In this article, we will look at the other way of using VLOOKUP, where the identifier does not exist in the database at all. In that case VLOOKUP switches to approximate mode and itself chooses which row to return. In certain circumstances, this is exactly what is needed.

A real-life example: setting up the task

Let’s illustrate this article with a real-life example: calculating commissions based on a wide range of sales figures. We will start with a very simple version and then gradually complicate it until the only rational solution is to use VLOOKUP. The initial scenario for our fictitious task is as follows: if a salesperson makes more than $30000 in sales in a year, the commission is 30%. Otherwise, the commission is only 20%. Let’s put this in the form of a table.

The seller enters their sales data in cell B1, and the formula in cell B2 determines the correct commission rate that the seller can expect. In turn, the resulting rate is used in cell B3 to calculate the total commission the seller should receive (simply multiplying cells B1 and B2).

The most interesting part of the table is cell B2, which contains the formula for determining the commission rate. This formula uses an Excel function called IF. For readers who are not familiar with this function, here is how it works:

IF(condition, value if true, value if false)

The condition is a function argument that evaluates to either TRUE or FALSE. In the example above, the expression B1<B5 is the condition. It can be read as a question:

Is it true that B1 is less than B5?

Or you can say it differently:

Is it true that the total amount of sales for the year is less than the threshold value?

If the answer to this question is YES (TRUE), the function returns "value if true". In our case, this is the value of cell B6, i.e. the commission rate when total sales are below the threshold. If the answer is NO (FALSE), the function returns "value if false". In our case, this is the value of cell B7, i.e. the commission rate when total sales are at or above the threshold.

As you can see, if we enter total sales of $20000, we get a 20% commission rate in cell B2. If we enter $40000, the commission rate changes to 30%.

This is how our table works.
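The logic of the table can also be sketched outside Excel. The following Python snippet is only an illustration of what the IF formula does; the function names and default values are hypothetical and simply mirror cells B5 (threshold), B6 (lower rate) and B7 (higher rate) as described above.

```python
def commission_rate(sales, threshold=30_000, low_rate=0.20, high_rate=0.30):
    """Mirror of IF(B1<B5, B6, B7): low rate below the threshold, high rate otherwise."""
    return low_rate if sales < threshold else high_rate


def commission(sales):
    """Mirror of cell B3: total sales multiplied by the commission rate."""
    return sales * commission_rate(sales)


if __name__ == "__main__":
    for amount in (20_000, 40_000):
        print(amount, commission_rate(amount), commission(amount))
```

Note that with the condition written as "less than", a seller who hits the threshold exactly already gets the higher rate.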

Complicating the task

Let’s make things a little more difficult. Let’s set another threshold: if the seller earns more than $40000, then the commission rate increases to 40%:

Everything seems simple and clear, but the formula in cell B2 becomes noticeably more complicated. If you look closely at it, you will see that the third argument of the IF function has turned into another full-fledged IF function. This construction is called nesting functions inside each other. Excel happily allows such constructs, and they do work, but they are much harder to read and understand.

We will not delve into the technical details of why and how this works, nor into the nuances of writing nested functions. After all, this article is about VLOOKUP; it is not a complete guide to Excel.

In any case, the formula keeps getting more complicated. What if we introduce a commission rate of 50% for sellers who make more than $50000 in sales? And what if anyone who sells more than $60000 gets a 60% commission?

Now the formula in cell B2, even if it was written without errors, has become completely unreadable. I think that there are few who want to use formulas with 4 levels of nesting in their projects. There must be an easier way?!
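To get a feel for how quickly nesting hurts readability, here is a Python sketch of the same four-level structure. It is an analogy, not the Excel formula itself: the function name is hypothetical, and treating each threshold as "at or above" is an assumption carried over from the B1<B5 condition used earlier.

```python
def commission_rate_nested(sales):
    """Four nested conditionals, one inside the 'else' branch of the previous one,
    mirroring an IF whose third argument is another IF."""
    return (0.20 if sales < 30_000 else
            (0.30 if sales < 40_000 else
             (0.40 if sales < 50_000 else
              (0.50 if sales < 60_000 else 0.60))))


if __name__ == "__main__":
    for amount in (20_000, 34_988, 45_000, 65_000):
        print(amount, commission_rate_nested(amount))
```

Every new tier adds another level, and the thresholds and rates end up buried inside the expression instead of living in a table.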

And there is such a way! The VLOOKUP function will help us.

Applying the VLOOKUP function to solve the problem

Let’s change the design of our table a bit. We will keep all the same fields and data, but arrange them in a new, more compact way:

Take a moment to make sure that the new table, Rate Table, contains the same data as the previous threshold table.

The main idea is to use VLOOKUP to determine the commission rate from Rate Table depending on the sales volume. Note that a seller can sell goods for an amount that does not equal any of the five thresholds in the table. For example, he could sell $34988 worth of goods, but there is no such amount in the table. Let’s see how VLOOKUP deals with such a situation.

Inserting a VLOOKUP function

Select cell B2 (where we want to insert our formula) and find VLOOKUP in the Excel function library: Formulas > Function Library > Lookup & Reference.

The Function Arguments dialog box appears. We fill in the arguments one by one, starting with Lookup_value. In this example, this is the total amount of sales from cell B1. Put the cursor in the Lookup_value field and select cell B1.

Next, we need to tell VLOOKUP where to look for the data. In our example, this is Rate Table. Put the cursor in the Table_array field and select the entire Rate Table except for the headers.

Next, we need to specify which column the formula should take its result from. We are interested in the commission rate, which is in the second column of the table. Therefore, enter 2 for the Col_index_num argument.

Finally, we enter the last argument, Range_lookup.

Important: this argument is what distinguishes the two ways of using VLOOKUP. When working with databases, Range_lookup must always be FALSE so that VLOOKUP searches for an exact match. In our use of VLOOKUP, we must either leave this field blank or enter TRUE. It is extremely important to choose this option correctly.

To make things clearer, we will enter TRUE in the Range_lookup field. Leaving the field blank would not be an error either, since TRUE is the default value.

We have filled in all the arguments. Now we press OK, and Excel creates a formula with the VLOOKUP function for us.

If we experiment with several different values for the total sales amount, we can make sure that the formula works correctly.

Conclusion

When VLOOKUP works with databases, the Range_lookup argument must be FALSE, and the value passed as Lookup_value must exist in the database. In other words, the function looks for an exact match.

In the example we have looked at in this article, an exact match is not needed. This is the case where VLOOKUP must switch to approximate mode to return the desired result.

For example, we want to determine which rate to use in the commission calculation for a salesperson with a sales volume of $34988. VLOOKUP returns 30%, which is absolutely correct. But why did the formula select the row containing 30% and not 20% or 40%? What is meant by an approximate search? Let’s make this clear.

When the Range_lookup argument is TRUE or omitted, VLOOKUP searches the first column, selects the largest value that does not exceed the lookup value, and returns the value from the requested column of that row.

Important point: For this scheme to work, the first column of the table must be sorted in ascending order.
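The rule above can be sketched in a few lines of Python. This is a simplified model of the behaviour, not Excel's actual implementation: the function name `lookup` and the contents of `RATE_TABLE` are assumptions (in particular, the first row with a threshold of 0 and a rate of 20% is inferred from the scenario, since the original table is not reproduced here).

```python
from bisect import bisect_right

# Assumed contents of Rate Table: (sales threshold, commission rate),
# sorted in ascending order by the first column.
RATE_TABLE = [
    (0, 0.20),
    (30_000, 0.30),
    (40_000, 0.40),
    (50_000, 0.50),
    (60_000, 0.60),
]


def lookup(lookup_value, table, col_index=2, range_lookup=True):
    """Simplified model of VLOOKUP; col_index is 1-based, as in Excel."""
    keys = [row[0] for row in table]
    if range_lookup:
        # Approximate mode: find the largest key that does not exceed lookup_value.
        # The binary search is only valid if the keys are sorted in ascending order.
        pos = bisect_right(keys, lookup_value)
        if pos == 0:
            raise LookupError("#N/A: lookup value is below the first key")
        return table[pos - 1][col_index - 1]
    # Exact mode: the key must be present in the first column.
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    raise LookupError("#N/A: no exact match")


if __name__ == "__main__":
    print(lookup(34_988, RATE_TABLE))  # falls into the 30000 row
```

With $34988, the largest threshold that does not exceed the lookup value is 30000, so the 30% row is chosen. The same call with `range_lookup=False` raises an error, because 34988 is not in the first column. The binary search also shows why the sort order matters: on an unsorted column it can stop at the wrong row without any warning.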
