Subscribe

RSS Feed (xml)

Powered By

Skin Design:
Free Blogger Skins

Powered by Blogger


Monday, September 29, 2008

Online Indexing in SQL Server 2005

Online Indexing is a new feature available in SQL Server 2005. In SQL Server 2005, DBAs can create, rebuild, or drop indexes online. The index operations on the underlying table can be performed concurrently with update or query operations. This was not possible in previous versions of SQL Server. In the past, indexing operations (reorganizing or rebuilding) were usually performed as a part of other maintenance tasks running during off-peak hours in SQL Server 2000 or 7.0. During these offline operations, the indexing operations hold exclusive locks on the underlying table and associated indexes. During online index operations, SQL Server 2005 eliminates the need of exclusive locks (This is explained in the "How Online Index Operations work" section, found later in this article.)

The Online indexing feature is very helpful for environments that run 24 hours a day, seven days a week. The Online Indexing feature is available only in the Enterprise Edition of SQL Server 2005.



How Online Index Operations Work?

Now let's have a look at how indexing operations work in SQL Server 2005. During online index operations, several different structures are used. They are the source, preexisting indexes, target, and temporary mapping indexes.

The source refers to the underlying table or clustered index data. Preexisting indexes are nonclustered indexes associated with the table or the clustered index. Preexisting indexes are available for users for concurrent DML operations. The target structure is the new index that is created or rebuilt. The temporary mapping index comes into the picture during the creation of the clustered index. This nonclustered index is created in the same step as the new clustered index (or heap) and does not require a separate sort operation. Concurrent transactions also maintain the temporary mapping index in all INSERT, UPDATE, and DELETE operations.

The online index operation can be divided into three phases:

  1. Preparation
  2. Build
  3. Final

The Build phase is a longest phase of all. It is in this phase where the creation, dropping, or rebuilding of indexes take place. The duration of the Build phase depends on the size of the data and the speed of the hardware. Exclusive locks are not held in this phase, so concurrent DML operations can be performed during this phase. The Preparation and Final phases are for shorter durations. They are independent of the size factor of the data. During these two short phases, the table or the indexed data is not available for concurrent DML operations. Now let's discuss these three phases in detail.



Preparation Phase

During the preparation phase, the following activities take place:

  1. A snapshot of the source table is acquired. When taking a snapshot of the table, a shared lock is obtained on the table. This is done for transactional level read consistency. An Intent share lock is also obtained. This lock is maintained until the final phase. In addition, a resource lock INDEX_BUILD_INTERNAL_RESOURCE is acquired. This prevents the execution of concurrent DDL operations on the source and preexisting structures while the index operation is in progress. For example, this lock prevents the concurrent rebuild of two indexes on the same table at the same time.
  2. Metadata is created. The logical metadata consists of Index ID, Index name, keys, and attributes. This is held in memory until the final phase. When an index is rebuilt online, two versions of the same index with the same name and ID exist during the rebuild operation—the original index and the new "in-build" index. The new index remains in an "in-build" state until the final phase of the online index operation. Concurrent DML operations are not allowed during this phase.


Build Phase

During the build phase:

  1. The new index structure created in the preparation phase is populated with the sorted data from existing data source.
  2. The shared lock acquired in the preparation phase is released. Concurrent DML operations can be performed during the Build phase.
  3. If the clustered index is created or rebuilt online, a temporary nonclustered index, called the mapping index, is created in the build phase.
  4. The mapping index is used by concurrent DML transactions to determine which records to delete in the new indexes that are being built when rows in the underlying table are updated or deleted. The mapping index is discarded when the index build process commits. The mapping index is not created if the clustered index is rebuilt or created offline. The mapping index also does not come into picture if a nonclustered index is being rebuilt or created.

When a clustered index is created or dropped and non-clustered indexes exist for that table, the preparation and build phases are repeated twice; once for the new clustered index and again for the nonclustered indexes, which are all rebuilt in a single operation to incorporate the change in the definition of the base structure of the clustered index or heap.



Final Phase

The Final phase is used to inform the system that the index is ready. These are the following activities that take place during this phase:

  1. The IS lock is upgraded to a Schema Modify (Sch-M) or Shared(S) lock depending on the type of index operation. A shared lock is acquired when a new nonclustered index is created. If a Sch-M lock is obtained, conflicting index operations are prevented from starting by using the special index-operation lock resource INDEX_BUILD_INTERNAL_RESOURCE.
  2. If a clustered index is created, then all the nonclustered indexes are rebuilt. Due to the locks held on the table, concurrent DML operations are not allowed on the table at this time.
  3. The Sch-M or S lock is released once the index is in the ready state.



Transact SQL

Online indexing can be performed through the following T-SQL statements.

  1. Create Index

    The syntax of Create Index is:

    CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name
    ON ( column [ ASC | DESC ] [ ,...n ] )
    [ INCLUDE ( column_name [ ,...n ] ) ]
    [ WITH ( [ ,...n ] ) ]
    [ ON { partition_scheme_name ( column_name )
    | filegroup_name
    | default
    }
    ]
    [ ; ]

    ::=
    {
    [ database_name. [ schema_name ] . | schema_name. ]
    table_or_view_name
    }

    ::=
    {
    PAD_INDEX = { ON | OFF }
    | FILLFACTOR = fillfactor
    | SORT_IN_TEMPDB = { ON | OFF }
    | IGNORE_DUP_KEY = { ON | OFF }
    | STATISTICS_NORECOMPUTE = { ON | OFF }
    | DROP_EXISTING = { ON | OFF }
    | ONLINE = { ON | OFF }
    | ALLOW_ROW_LOCKS = { ON | OFF }
    | ALLOW_PAGE_LOCKS = { ON | OFF }
    | MAXDOP = max_degree_of_parallelism
    }

    When we specify Online = ON, the clustered index is created online.

    Example:

    USE [AdventureWorks]
    GO
    CREATE CLUSTERED INDEX [PK_Employee_EmployeeID] ON [HumanResources].[Employee] ([EmployeeID] ASC)
    WITH (PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = ON) ON [PRIMARY]
    GO
  2. Alter Index

    Syntax of Alter Index

    ALTER INDEX { index_name | ALL }
    ON
    { REBUILD
    [ [ WITH ( [ ,...n ] ) ]
    | [ PARTITION = partition_number
    [ WITH (
    [ ,...n ] )
    ]
    ]
    ]
    | DISABLE
    | REORGANIZE
    [ PARTITION = partition_number ]
    [ WITH ( LOB_COMPACTION = { ON | OFF } ) ]
    | SET ( [ ,...n ] )
    }
    [ ; ]

    ::=
    {
    [ database_name. [ schema_name ] . | schema_name. ]
    table_or_view_name
    }

    ::=
    {
    PAD_INDEX = { ON | OFF }
    | FILLFACTOR = fillfactor
    | SORT_IN_TEMPDB = { ON | OFF }
    | IGNORE_DUP_KEY = { ON | OFF }
    | STATISTICS_NORECOMPUTE = { ON | OFF }
    | ONLINE = { ON | OFF }
    | ALLOW_ROW_LOCKS = { ON | OFF }
    | ALLOW_PAGE_LOCKS = { ON | OFF }
    | MAXDOP = max_degree_of_parallelism
    }

    ::=
    {
    SORT_IN_TEMPDB = { ON | OFF }
    | MAXDOP = max_degree_of_parallelism
    }

    ::=
    {
    ALLOW_ROW_LOCKS= { ON | OFF }
    | ALLOW_PAGE_LOCKS = { ON | OFF }
    | IGNORE_DUP_KEY = { ON | OFF }

    | STATISTICS_NORECOMPUTE = { ON | OFF }
    }

    The DBA can rebuild or reorganize the indexes of the underlying table depending on the fragmentation level. If the fragmentation level is between 5% and 30%, it is enough to reorganize the indexes. This operation is similar to DBCC INDEXDEFRAG statement. Reorganizing an index is always an online operation.Example:

    Now let's find the fragmentation level of the indexes of the table "HumanResources.Employee" in the AdventureWorks Database.

    SELECT * FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'),Object_ID('HumanResources.Employee'),NULL,NULL,NULL



    To defragment the indexes, we will rebuild all the indexes.

    ALTER INDEX ALL on HumanResources.Employee REBUILD WITH (ONLINE=ON)
  3. Drop Index

    DROP INDEX
    { [ ,...n ]
    }

    ::=
    index_name ON
    [ WITH ( [ ,...n ] ) ]

    ::=
    {
    [ database_name. [ schema_name ] . | schema_name. ]
    table_or_view_name
    }

    ::=
    {
    MAXDOP = max_degree_of_parallelism
    | ONLINE = { ON | OFF }
    | MOVE TO { partition_scheme_name ( column_name )
    | filegroup_name
    | "default"
    }
    }

    Example:

    In the below code, the clustered index is dropped online.

    DROP INDEX PK_Employee_EmployeeID on [HumanResources.Employee] WITH (ONLINE=ON)

    Thus, we have seen how online index operations can be performed by using T-SQL statements.



    Disk Space Considerations

    The disk space required for performing offline index operations and online index operations varies. Additional disk space is required for sorting operations and storing row versions. A temporary mapping index is additionally created in case of clustered index. This temporary index consumes additional disk space when the user creates, drops, or rebuilds the clustered index. In case of creation of nonclustered index, a temporary mapping index is not created. Sorting operation is performed when a clustered or non-clustered index is created. The sorting operation can be done in TEMPDB database. This can be done by setting the SORT_IN_TEMPDB=ON. This option increases the amount of temporary disk space required, but the time required for creation of new index online can be reduced by having the TEMPDB and the user database on different physical disks.


    If the SORT_IN_TEMPDB is set to ON, then TEMPDB must have enough space to accommodate the index creation or rebuild operation.



    Performance Considerations

    The online index operations will take longer if concurrent update activity is very heavy. Typically, online index operations will be slower than equivalent offline index operations regardless of the concurrent update activity level.

    The source and the target structures have to be maintained during the online operations. So the resource usage increases for the DML operations. This can cause performance degradation.



    Best Practices

    1. Backup the transaction log and truncate it before running large-scale index operations online. This will prevent the transaction log from becoming full.
    2. It is recommended to have the SORT_IN_TEMPDB option set to ON. This will separate the index operations and the user transactions. Performance can be improved by having the TempDB and user databases in different physical disks. By separating the user transactions and index operations, the DBA can truncate the user database in midst of the index operations, if required.
    3. Have the recovery model set to SIMPLE or bulk logged so that minimal logging of index transactions takes place.
    4. Do not run the online index operation in an explicit transaction. The log of the user database cannot be truncated until the explicit transaction ends.


    Conclusion

    Online indexing is a powerful feature in SQL Server 2005, but should be used judiciously. Online Indexing should be considered in 24-7 environments where it is difficult to perform standard index maintenance operations.

    1 comment:

    1. very nice source of information.
      Like it. thanks

      ReplyDelete

    Recent Posts

    Archives