Wednesday, July 10, 2013

Finding duplicate rows of data in a table

Many time, as we develop Stored Procedures or Inserting, Updating or Deleting data from a table, we some how come across this question.  Does my Merge Statement inserting duplicating row (you never now if one of column is NULL) based on your condition (it has happened few times to me!!!)

You will come across this SQL Statement which will give you a duplicate row in a table if it exists.

SELECT Column_Name, COUNT(*) TotalCountofRow
FROM Table_Name
GROUP BY Column_Name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;

Let's go little in depth on this statement.

SELECT COUNT(*) TotalCount
FROM Table_Name


What this statement return is number of total row in a table. Let's say we have some identity column in this table (ID) and few other unique columns.

 Now what if we are inserting data in this table using a stored procedure where we have set INSERT_IDENTITY  Table_Name ON and INSERT_IDENTITY Table_Name OFF.

And we are not aware of this situation.

In this situation duplicate data may enter our given table. How do we find out?

Let's assume that we have ID, and JobID as distinct values. And we want to see if there is any data have these two column with duplicate row of data.

SELECT JobID, COUNT(*) TotalCountofRow
FROM Table_Name
GROUP BY JobID
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;

If duplicate row exists, TotalCountofRow will be not 0.




No comments:

Post a Comment