A clustered index determines the order in which the rows of a table are stored on disk. If a table has a clustered index, then the rows of that table will be stored on disk in the same exact order as the clustered index. An example will help clarify what we mean by that.
An example of a clustered index
Suppose we have a table called Employee which has a column called EmployeeID. Let’s say we create a clustered index on the EmployeeID column. Then, what happens is that the rows inside the Employee table will be physically sorted (on the disk) by the EmployeeID values. What does this accomplish? Well, it means that whenever a lookup/search for a specific EmployeeID is done using that clustered index, then the lookup will go directly to that row in the table – that is the . This is because the rows in the table are sorted in the exact same order as the clustered index, and more importantly, the actual table data is stored in the leaf nodes of the clustered index. Remember that an index is usually a tree data structure – and leaf nodes are the nodes that are at the very bottom of that tree. In other words, a clustered index basically contains the actual table level data in the index itself – this is very different from most other types of indexes as you can read about below.
A comparison of a non-clustered index with a clustered index with an example
As an example of a non-clustered index, let’s say that we have a non-clustered index on the EmployeeID column. A non-clustered index will store both the value of the EmployeeID AND a pointer to the row in the Employee table where that value is actually stored. But a clustered index, on the other hand, will actually store the row data for a particular EmployeeID – so if you are running a query that looks for an EmployeeID of 15, the data from other columns in the table like EmployeeName, EmployeeAddress, etc. will all actually be stored in the leaf node of the clustered index itself.
This means that with a non-clustered index extra work is required to follow that pointer to the row in the table to retrieve any other desired values, as opposed to a clustered index which can just access the row directly since it is being stored in the same order as the clustered index itself. So, reading from a clustered index is generally faster than reading from a non-clustered index.
A table can have multiple non-clustered indexes
A table can have multiple non-clustered indexes because they don’t affect the order in which the rows are stored on disk like clustered indexes.
Why can a table have only one clustered index?
Because a clustered index determines the order in which the rows will be stored on disk, having more than one clustered index on one table is impossible. Imagine if we have two clustered indexes on a single table – which index would determine the order in which the rows will be stored? Since the rows of a table can only be sorted to follow just one index, having more than one clustered index is not allowed.
Summary of the differences between clustered and non-clustered indexes
Clustered index will physically order the data based on the values in the index.
A non clustered index is a special type of index that stores the records in logical order rather than physical order. This means, a table can have many non clustered indexes. More non clustered indexes per table means more time it takes to write new records.
A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index. The leaf nodes of a clustered index contain the data pages.
A non clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a non clustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.
Clustered index is physically stored a table can have 1 clustered index.
Non clustered index is logically stored a table can have multiple non clustered index.
Non clustered indexes store both a value and a pointer to the actual row that holds that value. Clustered indexes don’t need to store a pointer to the actual row because of the fact that the rows in the table are stored on disk in the same exact order as the clustered index – and the non-clustered index actually stores the row-level data in it’s leaf nodes.
No comments:
Post a Comment