When working with Microsoft Excel, you often need to remove duplicate rows. Doing this by hand can be tedious, monotonous and time-consuming, but there are several ways to simplify the task. Today we are going to look at some handy methods for finding and removing duplicate rows in Excel. Let’s take the following data table as an example:
Option 1: Remove Duplicates Command in Excel
Microsoft Excel has a built-in tool that allows you to find and remove duplicate rows. Let’s start by looking for duplicate rows. To do this, select any cell in the table, and then select the entire table by pressing Ctrl + A.
Open the Data tab and click the Remove Duplicates command, as shown below.
A small Remove Duplicates dialog box will appear. You may notice that the first row is deselected automatically. The reason is that the My data has headers checkbox is ticked.
In our example, there are no headers: the data starts in row 1. So let’s uncheck the box. Once you do, the entire table is selected again, and the Columns list changes from "duplicate" to Column A, Column B and Column C.
Now that the entire table is selected, click OK to remove the duplicates. In our case, all rows with duplicate data will be deleted except for one. A pop-up dialog box then reports what was deleted.
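The logic behind this command can be sketched in a few lines of Python. This is only an illustration, not how Excel implements it: the function name remove_duplicates, its parameters and the sample table are hypothetical, rows are plain Python lists, and cells are compared by exact equality.

```python
def remove_duplicates(rows, columns=None, has_headers=False):
    """Keep the first occurrence of each row (hypothetical helper).

    rows        -- list of rows, each row a list of hashable cell values
    columns     -- zero-based column indexes to compare (None = all columns)
    has_headers -- if True, the first row is kept and never compared
    """
    header = rows[:1] if has_headers else []
    body = rows[1:] if has_headers else rows
    seen = set()
    kept = []
    for row in body:
        # Build the comparison key from the chosen columns only.
        key = tuple(row) if columns is None else tuple(row[i] for i in columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    removed = len(body) - len(kept)
    return header + kept, removed


# Made-up sample data with no header row.
table = [
    ["apple", "red", 1],
    ["pear", "green", 2],
    ["apple", "red", 1],
    ["plum", "purple", 3],
    ["pear", "green", 2],
]

unique_rows, removed = remove_duplicates(table)
print(removed)      # 2
print(unique_rows)  # the three distinct rows, in their original order
```

Passing has_headers=True mirrors leaving the My data has headers box checked: the first row is set aside and is not compared with the rest.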
Option 2: Advanced Filter
The second Excel tool you can use to find and remove duplicates is the Advanced Filter. This method also works in Excel 2003. To apply the filter, select the entire table, as before, using the keyboard shortcut Ctrl + A.
Then go to the Data tab and, in the Sort & Filter group, click the Advanced command, as shown below. If you are using Excel 2003, open the Data drop-down menu, select Filter and then Advanced Filter.
Now check the Unique records only box.
After you click OK, all duplicates in the document are removed except for one entry. In our example, two copies remain, because the first duplicate is in row 1. This method automatically treats the first row of the table as a header, so that row is not compared with the others. If you want to remove it, you will have to delete it manually. When row 1 contains headers rather than duplicate data, only one copy of each repeated row remains.
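The following Python sketch shows why two copies survive when the first row is itself a duplicate. It assumes a filter that always sets the first row aside as a header; the function name unique_records_only and the sample data are made up for illustration.

```python
def unique_records_only(rows):
    """Mimic a unique-records filter that always treats row 1 as a header."""
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    seen = set()
    result = [header]  # the header row is kept without being compared
    for row in body:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Made-up sample: row 1 holds data, not headers.
table = [
    ["duplicate", "duplicate", "duplicate"],  # taken as the header
    ["unique", "a", "b"],
    ["duplicate", "duplicate", "duplicate"],
    ["duplicate", "duplicate", "duplicate"],
]

filtered = unique_records_only(table)
copies = filtered.count(["duplicate", "duplicate", "duplicate"])
print(copies)  # 2

# Manual cleanup: drop the first row so that a single copy remains.
cleaned = filtered[1:]
print(cleaned.count(["duplicate", "duplicate", "duplicate"]))  # 1
```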
Option 3: Find and Replace
This method is useful when you need to find duplicate rows in small tables. We will use the Find and Replace tool, which is built into all Microsoft Office products. First, open the Excel spreadsheet you plan to work with.
With the table open, select the cell whose contents you want to find and replace, and copy it. To do this, select the desired cell and press the keyboard shortcut Ctrl + C.
After copying the word you want to find, press Ctrl + H to bring up the Find and Replace dialog box. Paste the copied word into the Find what field by pressing Ctrl + V.
Click the Options button to open an additional list of options. Check the box next to Match entire cell contents. This is necessary because in some cells the search word appears together with other words. If you do not select this option, you may inadvertently delete cells that you want to keep. Make sure all other settings match those shown in the figure below.
Now enter a value in the Replace with field. In this example, we will use the number 1. After entering the desired value, click Replace All.
You can see that all the duplicate values in the table cells have been replaced by 1. We used the value 1 because it is short and stands out in the text. Now you can visually identify the rows that have duplicate values.
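The effect of the Match entire cell contents option can be illustrated with a small Python sketch. The replace_all function, its parameters and the sample table are hypothetical, all cells are assumed to be strings, and the comparison here is case-sensitive, which may differ from Excel's own search settings.

```python
def replace_all(rows, find_what, replace_with, match_entire_cell=True):
    """Return a copy of rows with matches of find_what replaced."""
    result = []
    for row in rows:
        new_row = []
        for cell in row:
            if match_entire_cell:
                # Only a cell whose whole content equals the search text changes.
                new_row.append(replace_with if cell == find_what else cell)
            else:
                # Any cell that merely contains the search text is altered.
                new_row.append(cell.replace(find_what, replace_with))
        result.append(new_row)
    return result


# Made-up sample data.
table = [
    ["duplicate", "north"],
    ["not a duplicate", "south"],
    ["duplicate", "east"],
]

whole = replace_all(table, "duplicate", "1")
partial = replace_all(table, "duplicate", "1", match_entire_cell=False)

# Zero-based indexes of the rows that now contain the marker value.
flagged = [i for i, row in enumerate(whole) if "1" in row]

print(whole[1][0])    # not a duplicate
print(partial[1][0])  # not a 1
print(flagged)        # [0, 2]
```

Without whole-cell matching, the second row would be altered even though it is not a duplicate, which is the kind of accidental change the option is meant to prevent.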
To keep one of the duplicates, just paste the original text back into the row that was replaced. In our case, we will restore the values in row 1 of the table.
Once you have identified the rows with repeating content, select them one by one while holding down the Ctrl key.
After selecting all the rows you want to delete, right-click the header of any of the selected rows and click Delete in the context menu. Don’t press the Delete key on the keyboard, since that deletes only the contents of the cells and not the entire row.
Once this is done, you will see that all the remaining rows have unique values.
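The difference between clearing cell contents and deleting whole rows can be shown with a short Python sketch. The two function names and the sample data are hypothetical, and row indexes are zero-based.

```python
def clear_contents(rows, indexes):
    """Like pressing the Delete key: cells are emptied, the rows stay."""
    targets = set(indexes)
    return [[None] * len(row) if i in targets else row
            for i, row in enumerate(rows)]


def delete_rows(rows, indexes):
    """Like the Delete command in the context menu: the rows are removed."""
    targets = set(indexes)
    return [row for i, row in enumerate(rows) if i not in targets]


# Made-up sample: row 0 has been restored, rows 2 and 3 are still flagged with 1.
table = [
    ["duplicate", "x"],
    ["unique", "y"],
    ["1", "x"],
    ["1", "x"],
]
flagged = [2, 3]

cleared = clear_contents(table, flagged)
deleted = delete_rows(table, flagged)

print(len(cleared))  # 4 (two of the rows are now blank)
print(len(deleted))  # 2
print(deleted)       # [['duplicate', 'x'], ['unique', 'y']]
```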