Monday, July 29, 2013

MERGE SQL Statement

 MERGE MyTargetTable  As target
USING ( SELECT Col1, Col2, Col3
FROM MySourceTable
WHERE MySourceTable.Col1 = SomeCondition and MySourceTable.Col2= SomeCondition
GROUP BY MySourceTable.Col1)Source
ON
(
target.Col1 = Source.Col1
)
WHEN MATCHED BY target OR Col2 <> Source.Col2 OR (Col2 IS NULL AND Source.Col2 IS NULL)
THEN UPDATE
Set Col2= Source.Col2,
Col3 = Source.Col3
WHEN NOT MATCHED BY target AND Col1 = Source.Col1

THEN INSERT

(
Col1,
Col2,
Col3
)
VALUES
(
Source.Col1,
Source.Col2,
Source.Col2,
)
WHEN NOT MATCHED BY SOURCE
THEN DELETE ;  

Wednesday, July 24, 2013

SSIS: How to count number of files in a folder and then perform task to insert into database.

Question: I want to count number of files in my source folder. If there are files in my source folder, I want to insert data in my table, if not I want to send an email alert that no file was found (This is just the beginning to complicate this requirement but let's begin with this simple task and we will go to make this more complicated later)

Solution:

Step 1: Let's create some variables and use them for this task.

varSourceFolder string C:\tmp
FileCount int 0


Step 2: Open a script task in Control Flow and rename it to "Check for Files" and copy and paste this script.

'****************************************************************************************************************
' Microsoft SQL Server Integration Services Script Task
' Write scripts using Microsoft Visual Basic 2008.
' The ScriptMain is the entry point class of the script.

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
'Add these to system to your script
Imports System.IO
Imports System.IO.Directory

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase


    'Declare FileCount as Global Variable where we will store value of number of files in our source folder
    Dim FileCount As Int32
Enum ScriptResults
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
End Enum


' The execution engine calls this method when the task executes.
' To access the object model, use the Dts property. Connections, variables, events,
' and logging features are available as members of the Dts property as shown in the following examples.
'
' To reference a variable, call Dts.Variables("MyCaseSensitiveVariableName").Value
' To post a log entry, call Dts.Log("This is my log text", 999, Nothing)
' To fire an event, call Dts.Events.FireInformation(99, "test", "hit the help message", "", 0, True)
'
' To use the connections collection use something like the following:
' ConnectionManager cm = Dts.Connections.Add("OLEDB")
' cm.ConnectionString = "Data Source=localhost;Initial Catalog=AdventureWorks;Provider=SQLNCLI10;Integrated Security=SSPI;Auto Translate=False;"
'
' Before returning from this method, set the value of Dts.TaskResult to indicate success or failure.
'
' To open Help, press F1.

Public Sub Main()
'
' Add your code here
        '
        'foldername is declared as string variable and it look in varSourceFolder place to find all our files.
        Dim foldername As String = Dts.Variables("User::varSourceFolder").Value
        ' dirs get all the files based on .csv file type in our varSourceFolder location.
        Dim dirs As String() = System.IO.Directory.GetFiles(foldername, "*.csv")
        'This count all the file and save it fileCount variable which we declared at global label.
        FileCount = dirs.Length
        'MessageBox.Show(FileCount) show number of file in our folder. This is done during development. Once you are done with this, you can comment out or remove for your script.
        MessageBox.Show(FileCount)
        ' Finally we write the FileCount to our variable which we can access outside this script task.
        Dts.Variables("User::FileCount").Value = FileCount

Dts.TaskResult = ScriptResults.Success
End Sub

End Class
'****************************************************************************************************************

Step 3: Send a mail if there is no file in our source folder.

Drag a send mail task and connect to scrip task.
Double click connector to edit it.
Under Constraint option,  choose expression as Evalution operation.
Then under Expression write
@FileCount ==0 and test the connection. Remember this is the out variable from script task where we are saving file count.
Configure your send mail according to your smtp server and email from and to and message body.


Step 4: Drag and drop Foreach Loop container to your control flow. Configure your Foreach Loop. ( make sure you use expression for directory and file spec!!! it save life later on)

Step 5. Drag and Drop data flow task inside Foreach Loop and configure it according to your need.



Happy Hunting in SSIS.

Wednesday, July 10, 2013

Finding duplicate rows of data in a table

Many time, as we develop Stored Procedures or Inserting, Updating or Deleting data from a table, we some how come across this question.  Does my Merge Statement inserting duplicating row (you never now if one of column is NULL) based on your condition (it has happened few times to me!!!)

You will come across this SQL Statement which will give you a duplicate row in a table if it exists.

SELECT Column_Name, COUNT(*) TotalCountofRow
FROM Table_Name
GROUP BY Column_Name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;

Let's go little in depth on this statement.

SELECT COUNT(*) TotalCount
FROM Table_Name


What this statement return is number of total row in a table. Let's say we have some identity column in this table (ID) and few other unique columns.

 Now what if we are inserting data in this table using a stored procedure where we have set INSERT_IDENTITY  Table_Name ON and INSERT_IDENTITY Table_Name OFF.

And we are not aware of this situation.

In this situation duplicate data may enter our given table. How do we find out?

Let's assume that we have ID, and JobID as distinct values. And we want to see if there is any data have these two column with duplicate row of data.

SELECT JobID, COUNT(*) TotalCountofRow
FROM Table_Name
GROUP BY JobID
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;

If duplicate row exists, TotalCountofRow will be not 0.