Finding duplicates in Excel can be a daunting task, but if you’re armed with some basic knowledge, you’ll find a few ways to tackle it. When I first thought about this problem, I quickly came up with a couple of ways to find duplicates, and after thinking a little, I discovered a few more ways. So, let’s look at a couple of simple ones first, and then move on to more complex methods.
The first step is to bring the data into a format that is easy to work with. Putting headings in the top row and all the data beneath them organizes the data as a list. In short, the data becomes a database that can be sorted and manipulated in various ways.
Finding Duplicates Using Excel’s Built-in Filters
Once the data is organized as a list, you can apply various filters to it. Depending on your dataset, you can filter the list by one or more columns. Since I’m using Office 2010, all I have to do is select the top row, which contains the headings, then go to the Data tab and click Filter. Drop-down arrows (small triangles pointing down) will appear next to each heading, as in the figure below.
Clicking one of these arrows opens a filter drop-down menu that lists all the values in that column. Select any item from this list and Excel will display the data according to your choice. This is a quick way to summarize the data or to see how many rows match a selection. You can uncheck the Select All box and then tick one or more of the items you want. Excel will show only the rows that contain the items you selected, which makes it much easier to spot duplicates, if there are any.
After setting up the filter, you can remove duplicate rows, calculate subtotals, or filter the data further by another column. You can edit the data in the table any way you want. In the example below, I have selected the items XP and XP Pro.
As a result of the filter, Excel displays only the rows that contain the items I selected (i.e., people whose computers run XP or XP Pro). You can choose any other combination of data and, if necessary, even set up filters on several columns at once.
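For readers who like to see the logic spelled out, the filter above can be sketched in a few lines of Python. This is only an illustration of what ticking items in the drop-down does, not how Excel implements it; the sample rows, the column names User and OS, and the helper filter_rows are all hypothetical.

```python
# Hypothetical sample data: headings become dictionary keys, each row is one record.
rows = [
    {"User": "Ann", "OS": "XP"},
    {"User": "Bob", "OS": "Vista"},
    {"User": "Cid", "OS": "XP Pro"},
    {"User": "Dee", "OS": "XP"},
]


def filter_rows(rows, column, selected):
    """Keep only the rows whose value in `column` is one of the ticked items."""
    selected = set(selected)
    return [row for row in rows if row[column] in selected]


# Equivalent of unchecking Select All and ticking XP and XP Pro.
visible = filter_rows(rows, "OS", ["XP", "XP Pro"])
for row in visible:
    print(row["User"], row["OS"])
```

Rows that share a value (here Ann and Dee, both on XP) end up next to each other in the filtered view, which is what makes repeated entries easy to spot.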
Advanced filter to find duplicates in Excel
On the Data tab, to the right of the Filter command, there is a button for advanced filter settings called Advanced. This tool is a little harder to use and needs some setup before it can be used. Your data must be organized as described earlier, i.e. like a database.
Before you can use an advanced filter, you must set a criterion for it. Look at the figure below: it shows a list of data, with the criterion specified to the right, in column L. I have written the column heading and, directly beneath it, the criterion. The figure shows a table of football matches, and I want it to show only the home matches. That’s why I copied the heading of the column I want to filter on and put the criterion I want to use (H) below it.
Now that the criterion is set, select any cell in the data and click Advanced. Excel will select the entire list of data and open the following dialog box:
As you can see, Excel has selected the entire table and is waiting for us to specify the range that holds the criterion. Click in the Criteria range field of the dialog box, then select cells L1 and L2 with the mouse (or whichever cells contain your criterion) and click OK. The table will show only the rows where the Home / Visitor column contains the value H and hide the rest. This is how we found the repeated values in a single column, showing only the home matches:
This is a fairly simple way to find duplicates that can save time and get you the information you need quickly. Remember that the criterion must be placed in cells separate from the data list so that you can find it and use it. You can change the filter by changing the criterion (mine is in cell L2). You can also turn off the filter by clicking the Clear button in the Sort & Filter group on the Data tab.
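The idea of a criteria range (a heading with the wanted value beneath it) can be sketched in Python as follows. The match data, the Opponent column, and the helper advanced_filter are hypothetical, and the comparison uses exact equality, which is a simplification of how Excel compares text criteria.

```python
# Hypothetical list of matches; "Home / Visitor" is the column used in the example above.
matches = [
    {"Opponent": "Lions", "Home / Visitor": "H"},
    {"Opponent": "Bears", "Home / Visitor": "V"},
    {"Opponent": "Hawks", "Home / Visitor": "H"},
]

# Stand-in for the criteria range: the heading (cell L1) mapped to the criterion (cell L2).
criteria = {"Home / Visitor": "H"}


def advanced_filter(rows, criteria):
    """Return the rows that satisfy every heading/criterion pair (exact match)."""
    return [
        row
        for row in rows
        if all(row.get(heading) == wanted for heading, wanted in criteria.items())
    ]


home_games = advanced_filter(matches, criteria)
for game in home_games:
    print(game["Opponent"], game["Home / Visitor"])
```

Changing the value stored in criteria plays the same role as editing cell L2, and passing an empty dictionary returns every row, much like clearing the filter.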
Built-in tool to remove duplicates in Excel
Excel has a built-in tool called Remove Duplicates. You can select a column of data and use this command to remove all duplicates, leaving only unique values. You will find the Remove Duplicates button on the Data tab.
Don’t forget to choose the columns in which you want to keep only unique values. If the data has no headings, the dialog box will show Column A, Column B and so on, so headings are much more convenient to work with.
When you’re done with the settings, click OK. Excel will show a message box with the result (an example is in the figure below), where you also need to click OK. Excel automatically removes the rows with duplicate values, leaving only the unique values in the columns you chose. By the way, this tool is available in Excel 2007 and later versions.
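What the tool does can be approximated in Python like this. It is a minimal sketch under a few assumptions: the rows and column names are hypothetical, the first occurrence of each combination is the one kept, and values are compared exactly as typed. The helper remove_duplicates is my own illustration, not an Excel API.

```python
def remove_duplicates(rows, columns):
    """Keep the first row for each distinct combination of values in `columns`.

    Returns the unique rows and the number of rows that were dropped.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[column] for column in columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique, len(rows) - len(unique)


# Hypothetical data with one fully repeated row (Ann / XP).
rows = [
    {"User": "Ann", "OS": "XP"},
    {"User": "Bob", "OS": "Vista"},
    {"User": "Ann", "OS": "XP"},
    {"User": "Ann", "OS": "XP Pro"},
]

# Ticking both columns in the dialog: only identical User/OS pairs count as duplicates.
unique, removed = remove_duplicates(rows, ["User", "OS"])
print(f"{removed} duplicate rows removed; {len(unique)} unique rows remain.")
```

The choice of columns matters: calling remove_duplicates(rows, ["User"]) treats every row for Ann as a duplicate of the first one, so more rows are dropped than when both columns are compared.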
Finding duplicates using the Find command
If you need to find a small number of duplicate values in Excel, you can do so with a search. Go to the Home tab, click Find & Select, and choose Find. A dialog box will open where you can enter any value to look up in your table. To avoid typos, you can copy the value directly from the data list.
When there is a lot of data and you need to speed up the search, first select the row or column you want to search in, and only then start the search. Otherwise, Excel will search through all the available data and may return results you don’t need.
If you do want to search all the available data, the Find All button will probably be more useful to you.
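As a rough sketch of what Find All reports, the Python below lists every cell that contains the search text. The data and the helper find_all are hypothetical, and the matching (case-insensitive, anywhere inside the cell) is my assumption about a typical search with no extra options set; adjust it if your search options differ.

```python
def find_all(table, needle):
    """Return (row_number, column_name) for every cell that contains `needle`."""
    needle = str(needle).lower()
    hits = []
    # Row 1 is assumed to hold the headings, so the data starts at row 2.
    for row_number, row in enumerate(table, start=2):
        for column, value in row.items():
            if needle in str(value).lower():
                hits.append((row_number, column))
    return hits


# Hypothetical data in which the name Ann appears twice.
rows = [
    {"User": "Ann", "OS": "XP"},
    {"User": "Bob", "OS": "Vista"},
    {"User": "Ann", "OS": "XP Pro"},
]

hits = find_all(rows, "Ann")
for row_number, column in hits:
    print(f"row {row_number}, column {column}")
```

If a value that should be unique comes back with more than one hit, the extra hits are the duplicates.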
In conclusion
All three methods are easy to use and will help you find duplicates:
- Filter – ideal when there are several categories in the data that you may need to split, sum or remove. Creating subsections is the best use for an advanced filter.
- Removing duplicates reduces the data to its unique values. I use this method when I need a list of all the unique values in a column, which I later use for lookups with the VLOOKUP function.
- I use the Find command only when I need to find a small number of values, and the Find and Replace tool when I find mistakes and want to fix them all at once.
This is by no means an exhaustive list of methods for finding duplicates in Excel. There are many ways, and these are just a few that I use regularly in my daily work.