Wednesday, January 6, 2016

What is a Heap Table in SQL?


Definition of a heap

 "an untidy collection of things piled up haphazardly"

So when we hear the word heap table at an interview, generally we say "A heap is a table without a clustered index. One or more nonclustered indexes can be created on tables stored as a heap. Data is stored in the heap without specifying an order."

To understand above statement, we have to fully understand concept on nonclustered index and how it work with a clustered index and without a clustered index against a table.

Some of you might ask then what is point of creating a nonclustered index on a table without a clustered index?

To understand this, let's go through what Microsoft says about Clustered and NonClustered index

  • Clustered
    • Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.
    • The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.
  • Nonclustered
    • Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and each key value entry has a pointer to the data row that contains the key value.
    • The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key. 
Source: Microsoft MSDN link 

These are important point to remember from above reference


  • For a clustered table, the row locator is the clustered index key
  • The pointer from an index row in a nonclustered index to a data row is called a row locator
  • For a heap, a row locator is a pointer to the row

So the question is how many row will a query scan before finding the relevant record(s)? These index contains keys built from one or more columns in the table or view. These keys are stored in a structure (B-tree) that enables SQL Server to find the row or rows associated with the key values quickly and efficiently.

Coming back to heap table, we can define a table is a heap which does not have any order of data stored in it.


How to get first and last date of current year with day name

SELECT
StartOfYear = Convert(Varchar,DATEADD(yy, DATEDIFF(yy,0,getdate()), 0),101)
,CalendarFirstDayName = datename(dw,DATEADD(yy, DATEDIFF(yy,0,getdate()), 0))
,LastDayOfYear = Convert(Varchar,DATEADD(yy, DATEDIFF(yy,0,getdate()) + 1, -1),101)
,CalendarLastDayName = datename(DW,DATEADD(yy, DATEDIFF(yy,0,getdate()) + 1, -1))



Creating a Perfect Calendar table in SQL



 ;WITH CTE_DatesTable
AS
(
  SELECT CAST('20000101' as date) AS [date] --- Change this to your beginning time
  UNION ALL
  SELECT   DATEADD(dd, 1, [date])
  FROM CTE_DatesTable
  WHERE DATEADD(dd, 1, [date]) <= '20041231' --- Change this to your end time
)
Select
DateKey = substring(Cast([date] as varchar(10)),1,4) +substring(Cast([date] as varchar(10)),6,2)+substring(Cast([date] as varchar(10)),9,2)
,CalendarDate = [Date]
,[date] AS CalendarYearMonthDate
,CalendarYearYYYY = substring(Cast([date] as varchar(10)),1,4)
,CalendarYearMonthYYYYMM = substring(Cast([date] as varchar(10)),1,4) +substring(Cast([date] as varchar(10)),6,2)
,CalendarYearDDMMYYYY =substring(Cast([date] as varchar(10)),9,2)+'/'+substring(Cast([date] as varchar(10)),6,2)+'/'+substring(Cast([date] as varchar(10)),1,4)
,CalendarYearMMDDYYY =substring(Cast([date] as varchar(10)),6,2)+'/'+substring(Cast([date] as varchar(10)),9,2)+'/'+substring(Cast([date] as varchar(10)),1,4)
,CalendarDayCode = Upper(substring(datename(dw,[Date]),1,3))
,CalendarDayName = datename(dw,[Date])
,CalendarYearWeekCode =Datename(yy,[Date])+ RIGHT(DateName(wk,[Date]),2)
,CalendarYearWeekNumber = RIGHT(DateName(wk,[Date]),2)
,CalendarYearMonthName = Datename(MM, [date])
,CalendarYearMonthCode = substring(cast(Datename(MM, [date]) as Varchar(10)),1,3)
,CalendarMonthYearCode = substring(cast(Datename(MM, [date]) as Varchar(10)),1,3)+'-'+substring(Cast([date] as varchar(10)),3,2)
,[CalendarQuarter]='Q'+ Cast(DATENAME(quarter, [date]) AS Varchar(1))
,CalendarQuarterYearCode = 'Q'+ Cast(DATENAME(quarter, [date]) AS Varchar(1))+'-'+substring(Cast([date] as varchar(10)),3,2)
,CalendarYearQuarterCode = substring(Cast([date] as varchar(10)),1,4)+'Q'+ Cast(DATENAME(quarter, [date]) AS Varchar(1))
,CalendarYearIsLeapYear = CASE WHEN (cast(year([date]) as int) % 4 = 0 AND cast(year([date]) as int) % 100 <> 0) OR cast(year([date]) as int) % 400 = 0 THEN 'Y' ELSE 'N' END
,FiscalYear = cast(case when month([date]) > 10 then year([date]) else year([date])-1 end as varchar(4))--Depending when your fiscal year begin- in this it begin on 11/1/YYYY




from CTE_DatesTable
OPTION (MAXRECURSION 0);