Definition of a heap
"an untidy collection of things piled up haphazardly"
So when the term "heap table" comes up in an interview, we generally say: "A heap is a table without a clustered index. One or more nonclustered indexes can be created on tables stored as a heap. Data is stored in the heap without specifying an order."
To understand the above statement, we have to fully understand the concept of a nonclustered index and how it works against a table with a clustered index and against a table without one.
Some of you might ask: then what is the point of creating a nonclustered index on a table without a clustered index?
To understand this, let's go through what Microsoft says about clustered and nonclustered indexes:
- Clustered
  - Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.
  - The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.
- Nonclustered
  - Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and each key value entry has a pointer to the data row that contains the key value.
  - The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key.
These are the important points to remember from the reference above:
- The pointer from an index row in a nonclustered index to a data row is called a row locator
- For a heap, a row locator is a pointer to the row
- For a clustered table, the row locator is the clustered index key
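To make the row locator idea concrete, here is a small Python sketch of both cases. It is only a toy model, not how SQL Server is implemented: the table contents, the page size, and every name in it (`heap_pages`, `heap_index`, `lookup_in_clustered`, and so on) are made up for illustration, and a plain dictionary stands in for the nonclustered index.

```python
from bisect import bisect_left

# Toy rows: (emp_id, name). All values are invented for this illustration.
ROWS = [(30, "Cara"), (10, "Abe"), (20, "Bea"), (40, "Dan")]

# --- Heap: rows sit on pages in arrival order, with no ordering guarantee.
ROWS_PER_PAGE = 2  # assumed tiny page size, just to get more than one page
heap_pages = [ROWS[i:i + ROWS_PER_PAGE] for i in range(0, len(ROWS), ROWS_PER_PAGE)]

# Nonclustered index on name over the heap: name -> (page, slot) row locator.
heap_index = {
    name: (page_no, slot_no)
    for page_no, page in enumerate(heap_pages)
    for slot_no, (_emp_id, name) in enumerate(page)
}

def lookup_in_heap(name):
    page_no, slot_no = heap_index[name]      # locator points straight at the row
    return heap_pages[page_no][slot_no]

# --- Clustered table: rows are kept sorted by the clustered key (emp_id).
clustered_rows = sorted(ROWS)
clustered_keys = [emp_id for emp_id, _name in clustered_rows]

# Nonclustered index on name over the clustered table: name -> clustered key.
clustered_index = {name: emp_id for emp_id, name in clustered_rows}

def lookup_in_clustered(name):
    key = clustered_index[name]              # locator is the clustered index key
    pos = bisect_left(clustered_keys, key)   # second lookup, by clustered key
    return clustered_rows[pos]

print("heap locator:     ", heap_index["Bea"])
print("heap row:         ", lookup_in_heap("Bea"))
print("clustered locator:", clustered_index["Bea"])
print("clustered row:    ", lookup_in_clustered("Bea"))
```

Both lookups return the same row. The difference is what the nonclustered index hands back: a physical position in the heap case, and a key that has to be looked up again in the clustered case.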
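So the question is: how many rows will a query scan before finding the relevant record(s)? Without an index, SQL Server has to scan every row in the heap to find a match, and that is the point of creating a nonclustered index on a table without a clustered index. An index contains keys built from one or more columns in the table or view. These keys are stored in a structure (a B-tree) that enables SQL Server to find the row or rows associated with the key values quickly and efficiently.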
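To get a feel for the difference in rows examined, the sketch below counts the work for a full scan of unordered rows versus a search over sorted keys. It is a rough illustration under stated assumptions: binary search over a sorted list stands in for a B-tree seek (a real B-tree has a much higher fan-out, but the logarithmic idea is the same), and the function names, row count, and target value are all hypothetical.

```python
import random

def scan_heap(rows, target):
    """Check rows one by one; return (row or None, rows examined)."""
    examined = 0
    for row in rows:
        examined += 1
        if row[0] == target:
            return row, examined
    return None, examined

def seek_sorted(keys, target):
    """Binary search over sorted keys; return (position or None, keys examined)."""
    lo, hi, examined = 0, len(keys) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        examined += 1
        if keys[mid] == target:
            return mid, examined
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, examined

N = 100_000
sorted_keys = list(range(N))                 # stands in for ordered index keys

heap_rows = [(k, f"row-{k}") for k in sorted_keys]
random.Random(0).shuffle(heap_rows)          # heap: no guaranteed order

target = 73_421                              # arbitrary key to look for
_, scanned = scan_heap(heap_rows, target)
_, probed = seek_sorted(sorted_keys, target)

print(f"heap scan examined {scanned} rows")
print(f"sorted seek examined {probed} keys")
```

With 100,000 keys the sorted search never needs more than 17 comparisons, while the scan can touch anywhere up to all 100,000 rows depending on where the target happens to sit.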
Coming back to the heap table: we can define a heap as a table with no clustered index, so its data is not stored in any particular order.