Reintroducing SQL in easy way: 2013

Thursday, December 19, 2013

SSIS package validation while deploying to TEST or UAT or Production Server

Let's say you have created a great package in development environment and you have configured it and deployed on Test Server or UAT server. Now before you want Tester to know that package is out, You want to make sure that package would run as it was intended to do.

So How do we validate a SSIS package on a Test Server before it is being executed.

Let's dive in:

Step 1. Log on to your Test Integration Server and go to the package you want to validate.

Step 2. Right Click on the package and click Run Package.

Step 3. Go to Execution Options and check "Validate package without executing" option.

Step 4. Now Go to Command Line and click "Edit the command line Manually". In the command right at the end after /REPORTING V add " > results.txt". Add your 32-bit execution path of package at the beginning.

C:\PROGRA~2\MICROS~1\100\DTS\Binn\dtexec.exe

So it will look like something this

C:\PROGRA~2\MICROS~1\100\DTS\Binn\dtexec.exe /DTS "\DEV\mypackagename" /SERVER SERVERNAME /CHECKPOINTING OFF /REPORTING V > results.txt

Step 5. Go to your command prompt where your server is install and paste the above text. And hit enter

Step 6. Type following " notepad results.txt" and hit enter

As soon as the validation is done, a notepad will open with the validation result.

At the end you will value 0 or 1. If it is 0, it mean that you package will execute without any metadata failure.
Remember that this is just validation. It won't check data in your databuffer or any other thing. It will just validate the package.

If you need more help on this one. Just leave a comment.

SQL To remove first few character from query

Suppose you have to write a query to remove first few character in a column on a SQL Server table.

Col A Col B
1 CM NewData
2 CM Old Data
3 CM My Data

And we want to remove "CM " cm with space from our result set or use this query to update the column itself.

There are four ways to do this.

Method 1
Substring function: We need three information column name , starting position and ending position. In our example we know column name, starting positing (4th character here) and I can put any ending length would be column length

SELECT SUBSTRING(ColB, 4,1000)
FROM [dbo].[MyTable]
WHERE ColB LIKE 'cm %'

Method 2

Use of RIGHT FUNTION: Here we need column name, total length of column minus what length we need to remove on the right side.

SELECT RIGHT(colB, LEN(colB) - 4)
FROM [dbo].[MyTable]
WHERE ColB LIKE 'cm %'

Method 3

Replace Method. In Replace function, we take column name, what we are looking for and what we need to replace with it. Make sure that the pattern you want to replace does not repeat in same string. Otherwise it would replace them too.

SELECT REPLACE(ColB, 'CM ','')
FROM [dbo].[MyTable]
WHERE ColB LIKE 'cm %'

Method 4

STUFF Method. Stuff method is another way to replace character in a column.

SELECT Stuff(ColB, 1, 4, '')
FROM [dbo].[MyTable]

WHERE ColB LIKE 'cm %'

All of the above method has same execution result and cost factor. Depending on your need, you can use which ever fits your bill.

:)

Monday, December 9, 2013

Moving Large Table to Different File Group

--Moving Large Table to Different File Group

--Step 1. Find the space occupied by tabe which needs to be moved to file group.

sp_spaceused 'ACCident' -- This will list table name with rows, reserved data, index size and unused space

--Step 2. Create a new filegroup by altering database.

ALTER Database Databasename add filegroup [SecondaryData] -- This will add a new filegroup

--Step 3. Create a file and point to this new filegroup.

ALTER DATABASE Databasename
ADD FILE
( NAME = XFILENAME,
FILENAME = 'new path\SALES.ndf',
SIZE = 1MB,
MAXSIZE = 10MB,
FILEGROWTH = 1MB)
TO FILEGROUP [SECONDERYDATA]
GO

--Step 4. Alter the table to change index on the table

ALTER TABLE [BIGTABLE]
DROP CONSTRAINT [BIGTABLE_PK] WITH (MOVE TO SECONDERYDATA)

--Step 5. Recreate the index

ALTER TABLE [BIGTABLE]
ADD CONSTRAINT [BIGTABLEE_PK] PRIMARY KEY CLUSTERED
( [column name] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [SECONDERYDATA]

Tuesday, November 26, 2013

Slowly Changing Dimension and SSIS

Slowly Changing Dimension and SSIS

Concept: Slowly changing Dimension (SCD) is a concept in which data (columns) values changes over time due to change in business scenirio. Let's say we have a product with a price tag of $10 as of today date. Let's assume that the price of this product is increased by 10 percent tomorrow. So the new price will be $11. This is an example where data has changed and its called Slowly Changing Dimension.

Types of SCD: There are three type of SCD. Type 1, Type 2, and Type 3.

Type 1. In our example, we talked about price increase from $10 to $11. Let's see how we can make these changes in our database table to handle Type1 change.

Let's say we have a table called product.

ProductID ProductName ProductDescr Price DateAdded
1 iPhone Apple iPhone 1 10 01/01/2000

Now as the price of iPhone is increased from $10 to $11, all we have to do is update the price column. This is Type 1 SCD.

ProductID ProductName ProductDescr Price DateAdded
1 iPhone Apple iPhone 1 11 01/01/2000

To accomplish this, all you have to do is update the column and set with new price.

Let's take a look sql statement to do this.

Update product
Set Price = 11
Where ProductID = 1

Another way to do this:

Merger INTO TargetTable As Target
USING Sourcetable AS Source
ON Target.ProductID = Source.ProductID
When Matched THEN
Update
Set Price = Source.Price
When Not Matched THEN
INSERT
(ProductID, ProductName, ProductDescr, Price, DateAdd)
Values
(Source.ProductID, Source.ProductName, Source.ProductDescr, Source.DateAdd)

Type 2. Type 2 SCD is where we keep old record (row in this case) and add a new row. But before we do that, let's think for a moment, when we display, both rows will appear in our result. Inorder to accomplish type 2 SCD, we will make some changes to our table. We will add a new column called IsActive and RetiredDate. Also remember that when we build our product table, we kept ProductID as primary key. So when we add another row with the same productID, it's a primary key voliation. So we have to change our index by including ProductID and IsActive column as a clustered index.

With this we can add another row of data.

ProductID ProductName ProductDescr Price DateAdded RetiredDate IsActive
1 iPhone Apple iPhone 1 10 01/01/2000 01/01/2013 0
1 iPhone Apple iPhone 1 11 01/02/2013 1

In this way, we can see how and when values has changed over time.

Type 3. In Type 3 SCD, instead of add a new row, we add a new column OLD PRICE and set this column to price and update price column to new value

ProductID ProductName ProductDescr Price OLD PRICE DateAdded
1 iPhone Apple iPhone 1 11 10 01/01/2000

However, as you can see that in Type 3 SCD, if we increase our price to $12 and update our data, it will look like this.

ProductID ProductName ProductDescr Price OLD PRICE DateAdded
1 iPhone Apple iPhone 1 12 11 01/01/2000

Now the old price is $11 which is actually last price and so the $10 price data is lost forever.

In real world, most company implement Type2 so that it can be used to reporting and analysis purpose.

This is work in Progress, I will be adding more information to this blog on SCD.

Thursday, November 21, 2013

Shrinking Log file of Database in Production Environment

Let's take a look at how we shrink our log files in sql server environment.

Any database on sql server has two type of files.
1. .mdf files
2. ldf files

MDF files are transitional data files. There is one primary mdf files and it can have many secondary files. Similarly ldf are log transitional files with one primary and many secondary).

Step 1. Lets check all the log space for all the database on a server. To do this run

DBCC SQLPERF (logspace);

What this return is DatabaseName, LogSize ( this is combined logsize if you have multiple log file for a particular database), LogSpace used, and status.

Step 2. Find the database whose log file you want to shrink. let's say I want to shrink tempdb datbase log file (not the best idea but for explanation purpose, its good enough).

So we run

SP_HELPDB DatabaseName;

SP_HELPDB tempdb

So this will return you all the mdf, ldf and ndf file information, where it is located, whats the size (in KB), etc. You want to make sure to check the size of log files in GB or MB. To do so run Select "size of the file/1024" to give in MB and "size of the file/1024/1024" in GB

--Convert KB to GB to check size of log files
SELECT 111111111/1024/1024

Step 3.

To your database from Object Explorer, select the database where you want to shrink log file, Right click Task, then shrink and then files.

Select file type as log and select "Released unused space".

You can click Ok or generate a script.

USE [tempdb]
GO
DBCC SHRINKFILE (N'Templog' , 0, TRUNCATEONLY)
GO

This is how you shrink log files

Wednesday, November 20, 2013

Passing WildCard inside a parameter in Stored Procedure

While writing a stored procedure, sometime we are asked to return all the row if we do not provide a parameter.

For example.

Let's say we have a table called [State] which hold values like this.

ID StateFullName StateCode
1 Alaska AK
2. Alabama AL
.
.
.
.
.
and like

So we want to write a stored procedure where sometime we are providing code to get state full name and sometime not.

In that case our SP will be something like this.

Create Proc as usp_getStateCode
(@statecode varchar(10) = NULL)
AS
Select
StateFullName, StateCode
From dbo.[State]
Where stateCode = ISNULL(@statecode,StateCode)

Now let's say we have a situation where we want to find a name of database and we do not know fullname of database.

Let's say we have a table which list all my database name and instance name.

DatabaseInstance table with following structure
(instancename varchar(128)
databasename varchar(128)

Now we want to write a stored procedure which should give me all the values if I am not passing any value or if I am passing a values which is not complete, it should compare with wildcard match and return me those matching values.

Create Proc as usp_getDatebaseName
(@DatabaseName varchar(128) = NULL
AS
Select
InstanceName, DatabaseName
From DatabaseInstance
Where DatabaseName Like ISNULL('%' + @databaseName + '%',DatabaseName)

This stored procedure has 70% cost as it is doing full scan on table for database name as we are passing wildcard

This stored procedure will return all the values if none is provided and it will return those matching values if provided.

Another way to write this is:

Create PROCEDURE [dbo].[usp_getDatebaseName]
@databaseName VARCHAR(128) = NULL
AS
IF @databasename <> ''
SET @databaseName = '%' + @databaseName + '%'

IF @databaseName IS NULL
SET @databasename = '%'
Select
InstanceName, DatabaseName
From DatabaseInstance
Where DatabaseName Like ISNULL(@databaseName,DatabaseName)

This stored procedure take 30% scan as it is doing seek scan.

Tuesday, November 5, 2013

SQL Server: How to modify existing table structure

Suppose you have a table in a production database and you need to modify table structures.

Let's say we have a table with following columns and properties

TableA

Columns:
ID int
FName varchar(50)
LName varchar(50)
version nvarchar(10)
file ntext
City char(10)

say we have primary key on ID

Now someone asked to change version nvarchar data type to varchar datatype and file ntext to text datatype. Also remember that we have data in this table.

So how would we modify the structure of table without change any values.

Here's How:

--Step 1. Drop any primary key if there is one.
--Step 2. Drop any index if it exists (may not have to do if step cover it).
--Step 3. Alter table--change column, insert or remove.
--step 4. Recreate primary key
--step 5. Recreate index.

--Step 1. Drop any primary key if there is one.

ALTER TABLE [dbo].[TableA]

DROP CONSTRAINT constraintname_PK

;

--Step 3. Alter table--change column, insert or remove.

ALTER TABLE [dbo].[TableA]

ALTER COLUMN [version ] [varchar](10) ;

when you try to convert ntext to text using above statement, it will give an error. To do that first convert to varchar(max) and then to text data type.

ALTER TABLE TableA

ALTER COLUMN file VARCHAR(MAX)

ALTER TABLE TableA

ALTER COLUMN file text

ALTER TABLE [dbo].[TableA] ADD CONSTRAINT [constraintname_PK] PRIMARY KEY NONCLUSTERED

(

[ID ] ASC,

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

Tuesday, October 29, 2013

SSIS: Script task to copy files from one location to another location

Script task to copy files from one folder to another folder.

In SSIS, we can use file system task to copy and move files from one folder to another folder.

However you can accomplish same task using script task. Here's how you can do.

/*
Microsoft SQL Server Integration Services Script Task
Write scripts using Microsoft Visual C# 2008.
The ScriptMain is the entry point class of the script.
*/

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.IO;

namespace ST_fbfcdb1278d0459f9a8c35e0b0ab16de.csproj
{
[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{

#region VSTA generated code
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion

// This is where my source files are located.
string sourceDir = @"C:\Data\";
// This is where my target files will be copied
string target = @"C:\Archive\";

public void Main()
{
string[] sDirFiles = Directory.GetFiles(sourceDir);
if (sDirFiles.Length > 0)
{
foreach (string file in sDirFiles)
{
//string[] splitFile = file.Split('\\');
string copyFile = Path.GetFileName(file);
string source = sourceDir + "\\" + copyFile;
Copy(source, target + "\\" + copyFile);
}
}
Dts.TaskResult = (int)ScriptResults.Success;
}
public void Copy(string source, string target)
{
try
{
//If file exists at the target location, then delete it
if (File.Exists(target))
{
File.Delete(target);
}
//Otherwise copy it from source to target
File.Copy(source, target);
}
catch (Exception ee)
{

}
}
}
}

Monday, October 21, 2013

Restoring Database from one environment to another environment in SQL Server

As a DBA, we are required to restored database from one environment to another environment all the time.

The following steps are involved in restoring database.

Step 1. Find the mdf and ldf files to restore
Step 2. Do Actual Restore of the database
Step 3 Find orphan object for user login and fix them. ( This is important as different environment have different users --Production version test or dev environment)

Step 1.

Find the files to restore .

USE MASTER
GO

DECLARE @Path VARCHAR(128)
, @Backup VARCHAR(128)
, @Filename VARCHAR(128)
, @dbname VARCHAR(128)

-- Set Variables
SET @Path = 'C\BackupFiles' + '\' ---Path of the file..Replace with your path
SET @Backup = 'Name of the database backup files' + '.bak' -- File Name --- Replace with your
SET @filename = @path + @Backup
SET @dbname = 'Development1' ---Name of Database ---replace with your

-- Restore FileList
RESTORE FILELISTONLY
FROM DISK = @Filename

Step 2

In real environment, you may have replication going on the server where you want to restore. So we have to make sure that we kill replication.

:connect SQLTST
:setvar dbname "Development1"
:setvar delay "00:00:00.003"
:setvar max_retries "5"
:r "C\Scripts\UtilityScripts\Recovery\KillDB.sql"

USE MASTER
GO

DECLARE @Path VARCHAR(128)
, @Backup VARCHAR(128)
, @Filename VARCHAR(128)
, @dbname VARCHAR(128)

-- Set Variables
SET @Path = 'C\Backup' + '\' ---Path of the file..Replace with your path
SET @Backup = 'Name of the database backup files' + '.bak' -- File Name --- Replace with your
SET @filename = @path + @Backup
SET @dbname = 'Development1' --Which Database to restore

RESTORE DATABASE @dbname
FROM DISK = @Filename
WITH MOVE 'Development1' TO 'C:\SQL\Development1_data.mdf' ---Replace with Phyiscal Name value which was given in first step. If there are multiple mdf file
, MOVE 'Development1_log TO 'C:\SQL\Development1_log.ldf'
, STATS = 5
, replace -- May be needed if replacing existing database.

Step 3:

Fix orphan object

EXECUTE sp_change_users_login 'Report'

This will give a list of username where the objects are orphan. For example if a user 'Home' exist

Then run this

EXECUTE sys.sp_change_users_login 'Auto_Fix', 'Home'

Friday, October 11, 2013

SSIS: script task to write to file and send as email attachment

Problem:

Create a package which read a query and save the result in file and send that file as an attachment.

Solution:

The problem says that we do a select query on some database, save the result to a file and send them as attachment. Or you asked to write to file and leave it there. This file can be picked by any other process. In this case, I am going to send the file as an attachment through email.

Designing the package:

This package will consists of three task:
1. Execute SQL Task
2. Script Task
3. Send Mail Task

Variables:

Lets create all the variables needed for this package.

1. varResultSet: object type. This variable will hold the result of our execute SQL task.
2. varNewFileDelimiter: This variable is used as file delimiter which will be used in script task.
3. varUploadDirectoryPath: This variable is used to save my file created and hold values where I will be saving my file.
4.varMyResultFileName: This variable hold file name. Set Expression value to true and under expression add following.
@[User::varUploadDirectoryPath]+ "\\"+"MyResult" + "_" + Right("0" + (DT_WSTR,2)MONTH(GETDATE()), 2) + Right("0" + (DT_WSTR,2)DAY(GETDATE()),2) + (DT_WSTR,4)YEAR(GETDATE()) + ".csv"

Here I am saving my file as csv file with date added to file name.

5. varExceptionCount: This variable hold any exception counts.

Most of these variable will be used in script task and send mail task.

So lets go and add Execute SQL task to our package.

1. Execute SQL Task

Added connection manager which will be used. Right down your query in your sql statement and set ResultSet to Full Result Set.

Under Result set add the variable
Result Name = 0
Variable =varResultSet

This is all we have to do in the first task. Now if you execute this task, the result of your query will be held in this variable.

Let's add script task to consume this result and write to file.

2. Script Task

In our script task, let's chose C# language and Under Read Only add following variables.

User::varResultSet,User::varMyResultFileName,User::varNewFileDelimiter

And In ReadWriteVariable, add User::varExceptionCount

Open your script task and add following code. Make sure that if you are using different variable name, change them at your end.

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Data.OleDb;
using System.IO;
using System.Reflection;

namespace ST_7cdb67cf97304adab75a69c0b1d4688e.csproj
{
[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{

#region VSTA generated code
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion

public void Main()
{
try
{
OleDbDataAdapter A = new OleDbDataAdapter();
System.Data.DataTable dt = new System.Data.DataTable();
A.Fill(dt, Dts.Variables["User::varResultSet"].Value);
string filepath = Dts.Variables["User::varMyResultFileName"].Value.ToString();

int i = 0;

Dts.Variables["User::varExceptionCount"].Value = dt.Rows.Count;

if (dt.Rows.Count > 0)

{
StreamWriter sw = null;

sw = new StreamWriter(filepath, false);

for (i = 0; i <dt.Columns.Count; i++)
{

if (i == dt.Columns.Count -1)
{
sw.Write(dt.Columns[i].ToString());
}
else
{
sw.Write(dt.Columns[i].ToString()+Dts.Variables["User::varNewFileDelimiter"].Value.ToString());
}
}
sw.WriteLine();

foreach (DataRow row in dt.Rows)
{
object[] array = row.ItemArray;

for (i = 0; i < array.Length; i++)
{
if (i == array.Length - 1)
{
sw.Write(array[i].ToString());
}
else
{
sw.Write(array[i].ToString() + Dts.Variables["User::varNewFileDelimiter"].Value.ToString());
}
}
sw.WriteLine();
}
sw.Close();

}
}
catch (Exception ex)
{
}
}
}
}

Save this. Now if you run your package, you find your file in the folder.

3. Send Mail Task

Add a send mail task, add your connection manage, add email id where you want to send. In expression, select FileAttachment and set it your variable (in this case @[User::vvarMyResultFileName]

You are all set to send this to as attachment.

Thursday, September 12, 2013

SSIS: File System Task to rename File

SSIS: File System Task to rename File

Assignment: You are give a task where you have some files--> *.csv, *.txt, or any other file. You are asked to rename the files and move to same folder or different folder. How would you do this.

Lets look at what we are asked to do.

1. We have a Source Folder where we have files of a particular type. So we need to know the location of this folder. Lets create a variable called SourceFolder of string type and save location into this folder. let's assume that this is C:\temp\SourceFolder

2. We need to rename these files and save it in same folder location or different folder location and change name of these files. So let's create another variable and call them DestinationFolder of string type and save the location in value.

3. Now we need to pull each file in SourceFolder and put in them in Destination folder. Lets create a variable called FileName of string type and leave value blank. Also create two more variables called FullSourceFolderFileName and FullDestinationFolderFileName. FullSourceFolderName value would be @SourceFolder + @FileName and FullDestinationFolderFileName value would be @DestinationFolder + @FileName. The reason for creating these two variable is to be used at run time to dynamically use them in Foreach Loop Container.

So far we have following variable in our package

VariableName type Value
1.FullName string No value --- leave this blank
2.SourceFolder string C:\temp\SourceFolder
3.DestinationFolder string C:\temp\SourceFolder or C:\temp\DestinationFolder (you can dump renamed file in same source folder or dump into destination folder)
4. FullSourceFolderName string set expression to true and in expression put @SourceFolder + @FileName
5. FullDestinationFolderFileName string @DestinationFolder + @FileName same for this as you did for variable 4.

Now let's drop Foreach Loop in our package and configure it.

Go to variable Mapping and add FileName

Now drop file system task and configure as show in figure below.

Now you are good to run the package.

Wednesday, August 28, 2013

SSIS: Insert and Update data in Oracle table from SQL Server

Recently I was asked to develop a package which would pull data from SQL Server Database and insert and update data in Oracle database.

To begin with this, we have to keep few things in mind:

First make sure that you know what data type are in your source and destination table. For example say EmpID is a integer in SQL Server database but chances are that it is NUMBER in your Oracle table to make sure to do correct data conversion task.

Secondly, in SQL Server, commit is auto, so we don't have to issue any statement (chances are that database is setup like this) however in oracle database you have to issue commit statement after you make any change to data.

For my SSIS packages, I designed my task into following steps:

1. I created a temp table in oracle database to dump all my data from sql server (using right data conversion and data type.
2. I created a Merge Stored Procedure on Oracle side which I would execute inside my SSIS package to update and insert data.

And everytime, I use DML statement, I use SQL excute task to issue commit statement on Oracle Database.

Also make sure that you use ADO.Net connection Manager to communicate with your oracle database.

To call stored procedure inside Execute SQL task use something like this.

CALL MY_Stored_Procedure ()

If you get this error message,

Warning: Multiple-step OLE DB operation generated errors.
Check each OLE DB status value, if available.
No work was done.

It means that you have missed commit statement.

Monday, August 12, 2013

How to upload documents in sql server table

How to upload documents in sql server table

Not the best solution but you still want to do it??

Lets do it.

Step 1: Create a table called documents with following columns: docID int IDENTITY, DocumentName varchar(100), Document VARBINARY(MAX), DateCreated Datetime default (getDate()));

Create table documents
(DocID int IDENTITY(1,1),
DocumentName varchar(100) NOT NULL,
Document VARBINARY(MAX),
DateCreated Datetime default (getDate())
);

Step 2:

Lets say we have a document tittle " A sql guide" and we want to store it into sql server so that incase we want to use it later for some reason.

Say this is located in C:\documents\sql\ A sql guide.doc

let's save this document in sql table:

Declare @doc As VARBINARY(MAX)

Select @doc = CAST(bulkcolumn AS VARBINARY(MAX)) FROM OPENROWSET
(BULK 'C:\documents\sql\A sql guide.doc', Single_BLOB) AS x

INSERT INTO documents (DocumentName, Document) Values ('A sql guide', @doc)

And You are done!!

Monday, July 29, 2013

MERGE SQL Statement

MERGE MyTargetTable As target
USING ( SELECT Col1, Col2, Col3
FROM MySourceTable
WHERE MySourceTable.Col1 = SomeCondition and MySourceTable.Col2= SomeCondition
GROUP BY MySourceTable.Col1)Source
ON
(
target.Col1 = Source.Col1
)
WHEN MATCHED BY target OR Col2 <> Source.Col2 OR (Col2 IS NULL AND Source.Col2 IS NULL)
THEN UPDATE
Set Col2= Source.Col2,
Col3 = Source.Col3
WHEN NOT MATCHED BY target AND Col1 = Source.Col1

THEN INSERT

(
Col1,
Col2,
Col3
)
VALUES
(
Source.Col1,
Source.Col2,
Source.Col2,
)
WHEN NOT MATCHED BY SOURCE
THEN DELETE ;

Wednesday, July 24, 2013

SSIS: How to count number of files in a folder and then perform task to insert into database.

Question: I want to count number of files in my source folder. If there are files in my source folder, I want to insert data in my table, if not I want to send an email alert that no file was found (This is just the beginning to complicate this requirement but let's begin with this simple task and we will go to make this more complicated later)

Solution:

Step 1: Let's create some variables and use them for this task.

varSourceFolder string C:\tmp
FileCount int 0

Step 2: Open a script task in Control Flow and rename it to "Check for Files" and copy and paste this script.

'****************************************************************************************************************
' Microsoft SQL Server Integration Services Script Task
' Write scripts using Microsoft Visual Basic 2008.
' The ScriptMain is the entry point class of the script.

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
'Add these to system to your script
Imports System.IO
Imports System.IO.Directory

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

'Declare FileCount as Global Variable where we will store value of number of files in our source folder
Dim FileCount As Int32
Enum ScriptResults
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
End Enum

' The execution engine calls this method when the task executes.
' To access the object model, use the Dts property. Connections, variables, events,
' and logging features are available as members of the Dts property as shown in the following examples.
'
' To reference a variable, call Dts.Variables("MyCaseSensitiveVariableName").Value
' To post a log entry, call Dts.Log("This is my log text", 999, Nothing)
' To fire an event, call Dts.Events.FireInformation(99, "test", "hit the help message", "", 0, True)
'
' To use the connections collection use something like the following:
' ConnectionManager cm = Dts.Connections.Add("OLEDB")
' cm.ConnectionString = "Data Source=localhost;Initial Catalog=AdventureWorks;Provider=SQLNCLI10;Integrated Security=SSPI;Auto Translate=False;"
'
' Before returning from this method, set the value of Dts.TaskResult to indicate success or failure.
'
' To open Help, press F1.

Public Sub Main()
'
' Add your code here
'
'foldername is declared as string variable and it look in varSourceFolder place to find all our files.
Dim foldername As String = Dts.Variables("User::varSourceFolder").Value
' dirs get all the files based on .csv file type in our varSourceFolder location.
Dim dirs As String() = System.IO.Directory.GetFiles(foldername, "*.csv")
'This count all the file and save it fileCount variable which we declared at global label.
FileCount = dirs.Length
'MessageBox.Show(FileCount) show number of file in our folder. This is done during development. Once you are done with this, you can comment out or remove for your script.
MessageBox.Show(FileCount)
' Finally we write the FileCount to our variable which we can access outside this script task.
Dts.Variables("User::FileCount").Value = FileCount

Dts.TaskResult = ScriptResults.Success
End Sub

End Class
'****************************************************************************************************************

Step 3: Send a mail if there is no file in our source folder.

Drag a send mail task and connect to scrip task.
Double click connector to edit it.
Under Constraint option, choose expression as Evalution operation.
Then under Expression write
@FileCount ==0 and test the connection. Remember this is the out variable from script task where we are saving file count.
Configure your send mail according to your smtp server and email from and to and message body.

Step 4: Drag and drop Foreach Loop container to your control flow. Configure your Foreach Loop. ( make sure you use expression for directory and file spec!!! it save life later on)

Step 5. Drag and Drop data flow task inside Foreach Loop and configure it according to your need.

Happy Hunting in SSIS.

Wednesday, July 10, 2013

Finding duplicate rows of data in a table

Many time, as we develop Stored Procedures or Inserting, Updating or Deleting data from a table, we some how come across this question. Does my Merge Statement inserting duplicating row (you never now if one of column is NULL) based on your condition (it has happened few times to me!!!)

You will come across this SQL Statement which will give you a duplicate row in a table if it exists.

SELECT Column_Name, COUNT(*) TotalCountofRow
FROM Table_Name
GROUP BY Column_Name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;

Let's go little in depth on this statement.

SELECT COUNT(*) TotalCount
FROM Table_Name

What this statement return is number of total row in a table. Let's say we have some identity column in this table (ID) and few other unique columns.

Now what if we are inserting data in this table using a stored procedure where we have set INSERT_IDENTITY Table_Name ON and INSERT_IDENTITY Table_Name OFF.

And we are not aware of this situation.

In this situation duplicate data may enter our given table. How do we find out?

Let's assume that we have ID, and JobID as distinct values. And we want to see if there is any data have these two column with duplicate row of data.

SELECT JobID, COUNT(*) TotalCountofRow
FROM Table_Name
GROUP BY JobID
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;

If duplicate row exists, TotalCountofRow will be not 0.

Wednesday, June 26, 2013

How to find Columns name in a sql table

SELECT
c.name 'Column Name',
t.Name 'Data type',
c.max_length 'Max Length',
c.precision ,
c.scale ,
c.is_nullable,
ISNULL(i.is_primary_key, 0) 'Primary Key'
FROM
sys.columns c
INNER JOIN
sys.types t ON c.system_type_id = t.system_type_id
LEFT OUTER JOIN
sys.index_columns ic ON ic.object_id = c.object_id AND ic.column_id = c.column_id
LEFT OUTER JOIN
sys.indexes i ON ic.object_id = i.object_id AND ic.index_id = i.index_id
WHERE
c.object_id = OBJECT_ID('TableName')

Note: If you are using nvarchar as datatype, it will give you another column with same name but datatype of sysname. Also all the Max length for nvarchar datatype will be shown as double of original value.

Example:

Let's create a table and see how it work with nvarchar and varchar datatype to show it.

CREATE Table ContactAddress
(ContactID int IDENTITY(1,1) Primary Key,
FullName varchar(50),
Address1 varchar(40),
Address2 varchar(40),
City varchar(20),
Zip varchar(5),
[State] Char(2),
Country Varchar(30)
)

--And Create another table with similar structure but this time change zip data type from varchar to nvarchar

CREATE Table ContactAddress1
(ContactID int IDENTITY(1,1) Primary Key,
FullName varchar(50),
Address1 varchar(40),
Address2 varchar(40),
City varchar(20),
Zip nvarchar(5),
[State] Char(2),
Country Varchar(30)
)

And here's the result after you run the query:

You will also notice that their is another column "Zip" with sysname datatype.

Thursday, May 23, 2013

COALESCE Function in TSQL

If you are new to TSQL or SQL in general, it’s always good idea to read about functions which we rarely use. Someday it might come handy (Who know it’s a tricky question and people want to trip you down memory lane.)

So let’s try to understand COALESCE function.

If you are not a big fan of msdn site, I don't blame you!!!! However it will give you a great starting point to do your research.

According to MSDN (http://msdn.microsoft.com/en-us/library/ms190349.aspx)

COALESCE says that it will "Returns the first nonnull expression among its arguments."

Let’s go through some example:

Suppose we have a table called Address with three columns (Home Number, Work Number and Cell Number) along with all other columns. Also suppose that none of these column (phone number columns) are NOT NULL (meaning it’s up to user to enter information if he or she likes it). So we can have some users who entered phone number under Home and some under Work and still few under cell number and same never bothered to enter any number.

Let’s assume that we want to see this information in our select output along with other information. Our preference choice is Home number first, Work number second and cell number third or it can be cell number first, home number second and work number third. Also we just want to show only 1 number at any given time. So how do we do this in our TSQL.

Here’s come-à COALESCE

Let’s go with our first choice (home then work then cell)

Select COALESCE (Home Number, Work Number, Cell Number) AS Phone Number from Address

This statement will look at home number first, if there is home number, it will return that number. If home number is missing, it will jump to work number. If work number is present it will return as output otherwise it will jump to cell number. So our result will have one number from these three columns.

COALESCE (444-444-4444, 555-555-5555, 666-666-6666) will give you-à 444-444-4444

COALESCE (NULL, 555-555-5555, 666-666-6666) will give you à 555-555-5555

COALESCE (NULL, NULL, 666-666-6666) will give you à 666-666-6666

More generic example:

COALESCE(1, 2, 3, 4, 5, NULL)à 1

COALESCE(NULL, NULL, NULL, 1, 2, 3) à 1

COALESCE(9, 8, 7, 6, 5, NULL)à 9

COALESCE(NULL, NULL, NULL, 4, 5, NULL) à 4

What happen if none of the three columns have any number? In that case it will return a NULL value.

COALESCE(NULL, NULL, NULL, NULL) à NULL

Friday, May 10, 2013

Creating Constraint on Table to check for data integrity

Creating Constraint on Table to check for duplicate values

Sometime we are asked to make sure that there is no duplicate data in our table based on certain columns combination. There are many ways to do this.

The first thing that will come to our mind is TRIGGER.

Well Trigger are good but its not always best solution, particularly in our case here.

So let see how we can do it.

Create a table and insert some data.

CREATE TABLE [dbo].[Product](
[ProductID] [int],
[ProductName] varchar(100) NOT NULL,
[IsActive] Bit NOT NULL,
[RetiredDate] Datetime2,
[MinPrice] Float,
[MaxPrice] Float
)

Just to show how constraint works, I am not adding any Primary Key or Foreign Key relation with other tables.

Let's insert some data in this table.

Insert INTO Product ([ProductID],[ProductName] ,[IsActive],[MinPrice] ,[MaxPrice])
VALUES (101, 'Mountain-Bike', 1, 100.00, 500.00), (102, 'Mountain-rugged',1,299.99,999.99), (103, 'Mountain', 0, 199.99,399.99)

Check to make sure you have data in table you just inserted.

SELECT * FROM Product

Now let's say that our ProductID + ProductName+ IsActive has to be unique row in our table. So in this case based on this, there can be only two combination (0 or 1 for active and inactive values)

Suppose we do not want user to enter a value of above combination because we are considering this to be a unique values.

"101--Mountain-Bike--1"

Right now with no constraint, we will be able to enter this value.

Insert INTO Product ([ProductID],[ProductName] ,[IsActive],[MinPrice] ,[MaxPrice])
VALUES (101, 'Mountain-Bike', 1, 59.99, 199.00)

Here, I have just changed MinPrice and MaxPrice but productID, productName and IsActive are same, which should not be allowed if we want to maintain data consistency.

To prevent these type of data, we can create trigger based on unique combination of three columns where we want to maintain data integrity.

To do so, we have to remove last inserted row from our table. Let's go and remove that row.

Method 1:

ALTER TABLE dbo.Product
ADD CONSTRAINT unique_PID_PNAME_Active UNIQUE([ProductID],[ProductName] ,[IsActive])

Method 2:

CREATE UNIQUE INDEX uq_product
ON dbo.Product([ProductID],[ProductName] ,[IsActive]);

Method 3: which is trigger (not recommended but you can still do it)

CREATE TRIGGER dbo.BlockDuplicatesproduct ON dbo.product

INSTEAD OF INSERT

BEGIN

SET NOCOUNT ON;

IF NOT EXISTS (

SELECT 1

FROM inserted AS i

INNER JOIN dbo.product AS t ON i.productID = t.productID

AND i.[ProductName] = t.[ProductName]

)

BEGIN

INSERT dbo.product (

[ProductID]

,[ProductName]

,[IsActive]

,[RetiredDate]

,[MinPrice]

,[MaxPrice]

)

SELECT [ProductID]

,[ProductName]

,[IsActive]

,[RetiredDate]

,[MinPrice]

,[MaxPrice]

FROM inserted;

END

ELSE

BEGIN

PRINT N'';

END

GO;

Now test and see it works for you or not.

Tuesday, May 7, 2013

Inserting Data in temp table from Another table

Sometime when we update certain table, its always a good idea to save the entire data into another temporary table which will hold unless we delete that temp table.

Let's say we are working on a Employees table and we need to manually update certain rows or columns for this table.

To do so, let's back them up somewhere for our reference (incase we mess up whole table..I have done few times in beginning!!)

Select * INTO tmpEmployees
FROM Employees

This statement will create a table "tmpEmployees" which will have same structure as Employees table.

Even we when we close current session, this table will remain there in our database.

However, if you do not want to retain your temp table after you are done with you update, you can create #tmp table

Select * INTO #tmpEmployees
FROM EMPLOYEES

If table already exist, than we have to use this

INSERT INTO #tmpEmployees
Select * FROM Employees

:Kumar

Friday, April 19, 2013

How to pass a value or no value inside a T-SQL or stored Procedure

How to pass a value or no value inside a T-SQL or stored Procedure

Let's say we have a table called state which hold three column (StateId, StateCode and StateName). If you don't have this table in your test database, go ahead and make one.

CREATE TABLE [dbo].[State]
(
StateCodeID int IDENTITY(1,1),
StateCode varchar(2) NOT NULL,
[State] varchar(25) NOT NULL
);

Let's insert some data in this table. You can copy these script and insert into your table.

Insert INTO State VALUES
('AK', 'Alaska')
,('AL', 'Alabama')
,('AR', 'Arkansas')
,('AZ', 'Arizona')
,('CA', 'California')
,('CO', 'Colorado')
,('CT', 'Connecticut')
,('DC', 'Dist. of Columbia')
,('DE', 'Delaware')
,('FL', 'Florida')
,('GA', 'Georgia')
,('HI', 'Hawaii')
,('IA', 'Iowa')
,('ID', 'Idaho')
,('IL', 'Illinois')
,('IN', 'Indiana')
,('KS', 'Kansas')
,('KY', 'Kentucky')
,('LA', 'Louisiana')
,('MA', 'Massachusetts')
,('MD', 'Maryland')
,('ME', 'Maine')
,('MI', 'Michigan')
,('MN', 'Minnesota')
,('MO', 'Missouri')
,('MS', 'Mississippi')
,('MT', 'Montana')
,('NC', 'North Carolina')
,('ND', 'North Dakota')
,('NE', 'Nebraska')
,('NH', 'New Hampshire')
,('NJ', 'New Jersey')
,('NM', 'New Mexico')
,('NV', 'Nevada')
,('NY', 'New York')
,('OH', 'Ohio')
,('OK', 'Oklahoma')
,('OR', 'Oregon')
,('PA', 'Pennsylvania')
,('RI', 'Rhode Island')
,('SC', 'South Carolina')
,('SD', 'South Dakota')
,('TN', 'Tennessee')
,('TX', 'Texas')
,('UT', 'Utah')
,('VA', 'Virginia')
,('VT', 'Vermont')
,('WA', 'Washington')
,('WI', 'Wisconsin')
,('WV', 'West Virginia')
,('WY', 'Wyoming');

So this script should enter 51 rows in our table.

Coming to the point, lets say someone ask you write a stored procedure where he need stateCodeID and State value if user enter statecode, if not he want to see all the values.

Let's go ahead write this script

CREATE PROCEDURE [dbo].[usp_GetStateInformation]
(
@Statecode VARCHAR(2) = NULL
)
AS
Begin
SET NOCOUNT ON
SET ANSI_WARNINGS OFF

SELECT STATECODE
,[STATE]
FROM [STATE]
WHERE STATECODE = ISNULL(@Statecode, StateCode)
ORDER BY StateCode
END;

"ISNULL(@Statecode, StateCode)" this is important function

What ISNULL do is that if there is no value in our variable @StateCode, it will pass null value in select query and it will return all the rows.

Now try executing this stored procedure as

Exec usp_GetStateInformation 'TX'

In this you will get statecode and state value only for 'TX'

And like this

Exec usp_GetStateInformation

In this you will get all the statecode and state value for all rows (51)

SQL: Intersect, Except and Union

Let's say we have Table A with ID Column and its values are (1,2,3,4,5) and Table B with its ID Column and values (3,4,5,6,7).

Let's say we want to write different queries giving us something like this:
Find me all the value which common in both table? (3,4,5)

Find me all unique value in both table? (1,2,3,4,5,6,7)

Find me all values which are unique in table A and that are also not in Table B (1,2)

Find me all values which are unique in both Table (1,2,6,7)

Fire your SQL Server and let's create these table and insert data in them and see how it work.

CREATE Table TableA
(ID int);
GO
INSERT INTO TableA VALUES (1),(2),(3),(4),(5);
GO
CREATE Table TableB
(ID int);
GO
INSERT INTO TableB VALUES (3),(4),(5),(6),(7);

Find me all the value which common in both table? (3,4,5)

To find values in both table, take a look at the picture above. The question ask: what are the values which are common to both table? What word comes to your mind? Intersect? right.. there is keyword "INTERSECT" in sql language.

Try this

SELECT * FROM TableA
INTERSECT
SELECT * FROM TableB;

Result

Find me all unique value in both table? (1,2,3,4,5,6,7)

To find value which are unique in both table, take a look at picture above again. think about it. programming and database is all about logic.

Select * from Table A

Union

Select * from Table B

Result

If you do UNION ALL, we will get repeat value of common number in both table like this.

Find me all values which are unique in table A and that are also not in Table B (1,2)

In this case we have to use EXCEPT keyword.

SELECT * FROM TableA

EXCEPT

SELECT * FROM TableB;

Result

Find me all values which are unique in both Table (1,2,6,7)

For this query we have to really think hard.. think about UNION and EXCEPT combining somehow?

SELECT * FROM TableA

Union

Select * FROM TableB

EXCEPT

SELECT * FROM TableA

INTERSECT

SELECT * FROM TableB;

Result

Hope this help you.

Few things to remember about these queries are:

The basic rules for combining the result sets of two queries that use EXCEPT or INTERSECT are the following:

1. The number and the order of the columns must be the same in all queries.

2. The data types must be compatible.

IF you don't follow this rule, will will get error message.

Cheers!!!

Thursday, April 18, 2013

SubQuery and JOIN: A beginner concept

Microsoft define subquery as "A subquery is a query that is nested inside a SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery. A subquery can be used anywhere an expression is allowed. "

Let's look at an example of this subquery and how can we avoid subquery all together by using JOIN functionality.

For this example I am going to use AdventureWork database.

We are going to look at some sales table and show you how you can avoid subquery by using JOIN.

Lets say we want to look at SalesOrderID, SalesOrderDate and Maximum UnitPrice from our sales table
(SalesOrderDetail and SalesOrderHeader ).

So first write down what we want.

--Select SalesOrderID, SalesOrderDate and Maximum UnitPrice from Table SalesOrderDetail and Table SalesOrderHeader

So let's go ahead and write on query using subquery

SELECT Ord.SalesOrderID
,Ord.OrderDate
,(
SELECT max(OrdDet.UnitPrice)
FROM AdventureWorks.Sales.SalesOrderDetail OrdDet
WHERE Ord.SalesOrderID = OrdDet.SalesOrderID
) AS MaxUnitPrice
FROM AdventureWorks.Sales.SalesOrderHeader Ord;

Now let's try writing down similar query which will give the same result.

SELECT Ord.SalesOrderID
,Ord.OrderDate
,max(OrdDet.UnitPrice) AS MaxUnitPrice
FROM AdventureWorks.Sales.SalesOrderHeader Ord
JOIN AdventureWorks.Sales.SalesOrderDetail OrdDet ON Ord.SalesOrderID = OrdDet.SalesOrderID
GROUP BY Ord.SalesOrderID
,Ord.OrderDate;

If possible try avoiding use of subquery in your T-SQL.

Here is execution plan for the 2 queries.

Monday, April 15, 2013

SQL Error while Inserting Image...one of those error

When we try to insert image into a sql table, sometime we get error like this.

Msg 491, Level 16, State 1, Line 4
A correlation name must be specified for the bulk rowset in the from clause.

So let's look at a simple SQL statement.

Suppose we have a simple table consisting of ImageID, ImageName, and Image itself column.

Let's create a simple table and we will go through it.

Create Table ImageTable
(ImageID int IDENTITY(1,1),
ImageName varchar(100),
Imagefile IMAGE
)

Let's make sure that we have right table.

SELECT * FROM ImageTable

Now that we have table ready to store or save image, let go ahead and write a sql statement

Insert INTO dbo.ImageTable
(ImageName,Imagefile)
SELECT 'MyPhoto',* FROM OPENROWSET(BULK N'C:\Data\HeaderImage\Header_A.png', SINGLE_BLOB)

You were expecting that this should have insert a row in our table with the image..!!! Right but you got the error.

So before we look into this, let's give a alias name to our sql statement.

Insert INTO dbo.ImageTable
(ImageName,Imagefile)
SELECT 'MyPhoto',* FROM OPENROWSET(BULK N'C:\Data\HeaderImage\Header_A.png', SINGLE_BLOB) A;

Now try to run this script.

It worked!!! Right.

Well lets now try to understand why we got the error in the first place.

Openrowset is looking for an alias in our first sql statement and it cannot find one, so it is throwing an error message "A correlation name must be specified for the bulk rowset in the from clause."

In our case, generally speaking its a good practice to give alias to all our tables. Like

Select * from Customer C;
Select C.CustomerID, C.CustomerFirstName, etc From Customer C.

So here all we have to do is give an alias to openrowset 'A'

:)

Kumar