MS SQL Server: Online Indexing in SQL Server 2005

Online Indexing is a new feature available in SQL Server 2005. In SQL Server 2005, DBAs can create, rebuild, or drop indexes online. The index operations on the underlying table can be performed concurrently with update or query operations. This was not possible in previous versions of SQL Server. In the past, indexing operations (reorganizing or rebuilding) were usually performed as a part of other maintenance tasks running during off-peak hours in SQL Server 2000 or 7.0. During these offline operations, the indexing operations hold exclusive locks on the underlying table and associated indexes. During online index operations, SQL Server 2005 eliminates the need of exclusive locks (This is explained in the "How Online Index Operations work" section, found later in this article.)

The Online indexing feature is very helpful for environments that run 24 hours a day, seven days a week. The Online Indexing feature is available only in the Enterprise Edition of SQL Server 2005.

How Online Index Operations Work?

Now let's have a look at how indexing operations work in SQL Server 2005. During online index operations, several different structures are used. They are the source, preexisting indexes, target, and temporary mapping indexes.

The source refers to the underlying table or clustered index data. Preexisting indexes are nonclustered indexes associated with the table or the clustered index. Preexisting indexes are available for users for concurrent DML operations. The target structure is the new index that is created or rebuilt. The temporary mapping index comes into the picture during the creation of the clustered index. This nonclustered index is created in the same step as the new clustered index (or heap) and does not require a separate sort operation. Concurrent transactions also maintain the temporary mapping index in all INSERT, UPDATE, and DELETE operations.

The online index operation can be divided into three phases:

Preparation
Build
Final

The Build phase is a longest phase of all. It is in this phase where the creation, dropping, or rebuilding of indexes take place. The duration of the Build phase depends on the size of the data and the speed of the hardware. Exclusive locks are not held in this phase, so concurrent DML operations can be performed during this phase. The Preparation and Final phases are for shorter durations. They are independent of the size factor of the data. During these two short phases, the table or the indexed data is not available for concurrent DML operations. Now let's discuss these three phases in detail.

Preparation Phase

During the preparation phase, the following activities take place:

A snapshot of the source table is acquired. When taking a snapshot of the table, a shared lock is obtained on the table. This is done for transactional level read consistency. An Intent share lock is also obtained. This lock is maintained until the final phase. In addition, a resource lock INDEX_BUILD_INTERNAL_RESOURCE is acquired. This prevents the execution of concurrent DDL operations on the source and preexisting structures while the index operation is in progress. For example, this lock prevents the concurrent rebuild of two indexes on the same table at the same time.
Metadata is created. The logical metadata consists of Index ID, Index name, keys, and attributes. This is held in memory until the final phase. When an index is rebuilt online, two versions of the same index with the same name and ID exist during the rebuild operation—the original index and the new "in-build" index. The new index remains in an "in-build" state until the final phase of the online index operation. Concurrent DML operations are not allowed during this phase.

Build Phase

During the build phase:

The new index structure created in the preparation phase is populated with the sorted data from existing data source.
The shared lock acquired in the preparation phase is released. Concurrent DML operations can be performed during the Build phase.
If the clustered index is created or rebuilt online, a temporary nonclustered index, called the mapping index, is created in the build phase.
The mapping index is used by concurrent DML transactions to determine which records to delete in the new indexes that are being built when rows in the underlying table are updated or deleted. The mapping index is discarded when the index build process commits. The mapping index is not created if the clustered index is rebuilt or created offline. The mapping index also does not come into picture if a nonclustered index is being rebuilt or created.

When a clustered index is created or dropped and non-clustered indexes exist for that table, the preparation and build phases are repeated twice; once for the new clustered index and again for the nonclustered indexes, which are all rebuilt in a single operation to incorporate the change in the definition of the base structure of the clustered index or heap.

Final Phase

The Final phase is used to inform the system that the index is ready. These are the following activities that take place during this phase:

The IS lock is upgraded to a Schema Modify (Sch-M) or Shared(S) lock depending on the type of index operation. A shared lock is acquired when a new nonclustered index is created. If a Sch-M lock is obtained, conflicting index operations are prevented from starting by using the special index-operation lock resource INDEX_BUILD_INTERNAL_RESOURCE.
If a clustered index is created, then all the nonclustered indexes are rebuilt. Due to the locks held on the table, concurrent DML operations are not allowed on the table at this time.
The Sch-M or S lock is released once the index is in the ready state.

Transact SQL

Online indexing can be performed through the following T-SQL statements.

Create Index

The syntax of Create Index is:
CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name
ON
( column [ ASC | DESC ] [ ,...n ] )
[ INCLUDE ( column_name [ ,...n ] ) ]
[ WITH (
[ ,...n ] ) ]
[ ON { partition_scheme_name ( column_name )
| filegroup_name
| default
}
]
[ ; ]
::=
{
[ database_name. [ schema_name ] . | schema_name. ]
table_or_view_name
}

::=
{
PAD_INDEX = { ON | OFF }
| FILLFACTOR = fillfactor
| SORT_IN_TEMPDB = { ON | OFF }
| IGNORE_DUP_KEY = { ON | OFF }
| STATISTICS_NORECOMPUTE = { ON | OFF }
| DROP_EXISTING = { ON | OFF }
| ONLINE = { ON | OFF }
| ALLOW_ROW_LOCKS = { ON | OFF }
| ALLOW_PAGE_LOCKS = { ON | OFF }
| MAXDOP = max_degree_of_parallelism
}

When we specify Online = ON, the clustered index is created online.

Example:

USE [AdventureWorks]
GO
CREATE CLUSTERED INDEX [PK_Employee_EmployeeID] ON [HumanResources].[Employee] ([EmployeeID] ASC)
WITH (PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = ON) ON [PRIMARY]
GO
Alter Index

Syntax of Alter Index

ALTER INDEX { index_name | ALL }
ON

MS SQL Server

Subscribe

Powered By

Monday, September 29, 2008

Online Indexing in SQL Server 2005

How Online Index Operations Work?

Preparation Phase

Build Phase

Final Phase

Transact SQL

Disk Space Considerations

Performance Considerations

Best Practices

Conclusion

1 comment:

Recent Posts

Archives

Followers

Recent Comments