Delete duplicates but keep first record

Author: Steven Neiland
Published:

Last week I posted about getting the most recent entry in each group of records. This week I want to expand on that by showing how you can use the same technique to remove duplicate entries but saving the first entry.

Sample Data

Again our sample data:

Id Name Favorite_Color Date_Entered
1JohnRed2019-08-01
2JohnRed2019-07-01
3JohnRed2019-09-01
3JohnBlue2019-09-01
4PeterRed2020-07-01
5PeterRed2020-08-01

Desired Result

This time instead of returning the most recent entry for each group as a resultset we want to actually delete from the table any records that are duplicated (in this case have columns the same name and color entries) but keeping the oldest entry so that our database table becomes this:

Id Name Favorite_Color Date_Entered
2JohnRed2019-07-01
3JohnBlue2019-09-01
4PeterRed2020-07-01

Solution

To get the desired result we need to identify all records that are duplicated which in this case is all duplicate combinations of the name and favorite_color columns. We partition and number the rows sorting by date_entered to place the oldest one first and then use the CTE to delete those rows numbered higher than 1.

We could also use the Id column the same way since in this case it is an identity column but the Id could just as easily be a UUID so I wanted to show using an actual date column.

WITH CTE AS (
      SELECT
            *,
            ROW_NUMBER() OVER(PARTITION BY name, favorite_color ORDER BY Date_Entered ASC) AS DuplicateCount
      FROM
            myTable
)
DELETE FROM CTE WHERE DuplicateCount > 1

What Do You Think?

Reader Comments

Mark Gregory's Gravatar
Mark Gregory
Tuesday, August 11, 2020 at 4:22:28 PM EDT

Just had one of those "ahhhh" moments, wondering why I have been doing this any of several ways that are more convoluted. Your solution is the obvious and clean one. Better than the 'count all and group, stuff in to temp table, delete from join temp table where count >1' kind of approach.

Comment Etiquette:

  • Please keep comments on-topic.
  • Please do not post unrelated questions or large chunks of code.
  • Please do not engage in flaming/abusive behaviour.
  • Comments that contain or appear to be advertisments, will not be published.
  • Comments that appear to be created for the purpose of linkbuilding to commercial sites will be removed.
*
*



Archives Blog Listing