在SQL Server中,Index是BTree(balance tree)结构,每个Page之间都有双向指针链接在一起。Index是在table结构之外,独立存在的存储结构。Index能使查询性能带来飞跃的主要原因是:Index 结构更小,能够更快加载到内存;Index ey物理顺序和逻辑一致,数据的预读取能够提高数据的加载速度,SQL Server 每次读取操作都会将物理物理相邻的多个Page一起加载到内存。
BTree结构决定 Index 的叶子节点,从左到右使依次增大,如图是Index的叶子节点,左边的Index Key最小,右边的Index Key最大:
如果更新数据导致index key变化,例如,将index key 由4变更为9,那么必须将9放置在8之后,10之前,如果8所在的Page有空间容纳9,那么SQL Server只需要将9移动到8之后,原来的4被删除,这会降低原page中数据存储的密度,造成一个碎片(fragment),即:3和5之间存在空闲空间,但是物理顺序和逻辑顺序还是一致的。
如果8和10所在的page不能容纳9,那么 SQL Server 选择最节省,最有效的方式:拆分Page。试想,如果不拆分page,那么,5,6,7,8 这几个数据行都要向前移动,为9腾挪空间。在SQL Server中,数据移动是十分浪费IO,内存和CPU资源的,IO必须在CPU的调控下进行。
拆分Page是指分配一个新的Page,将8所在的Page上的数据的一半(后一半,或前一半)移动在新的Page上,如图,将page中的后一半移动在新的page上,通过指针连接在一起,保持逻辑顺序的一致性,但是物理顺序已经不连续了。
对于Index Key移动之后,其物理顺序和逻辑顺序仍然保持一致,这会导致索引出现碎片,数据存储的密度降低;而拆分page,不仅将page存储数据的密度降低一半,而且数据的物理顺序和逻辑顺序,导致SQL Server的预读取操作效果下降。针对Index的这两种情况,根据Index的碎片率,对Index 进行重组(Reorganize)或重建(Rebuild)。
1,Reorganize 和 Rebuild 的区别
Rebuild 是重新创建,将Index之前占用的空间释放,重新申请空间来创建index。Reorganize 是重新组织,将index的叶子节点进行重新组织。
Rebuilding an index means that a whole new set of pages is allocated for it。
Reorganizing an index compacts the leaf-level pages back to their original specified fillfactor ant then rearrages the pages at the leaf level pages to correct the logical fragmentation, using the same pages that the index originally occupied.No new pages are allocated.
ALTER INDEX { index_name | ALL } ON <object> REBUILD | REORGANIZE
2, 重建索引
在重建index时,将使用Index的定义(index key,index type,唯一属性和排序方向),重建索引是能够将一个被disabled的索引启用,重建一个聚集索引时,不会将与之关联的nonclustered index一起重建,除非指定all关键字,all关键字是指将object上的所有index重建。
Specifies the index will be rebuilt using the same columns, index type, uniqueness attribute, and sort order. This clause is equivalent to DBCC DBREINDEX. REBUILD enables a disabled index. Rebuilding a clustered index does not rebuild associated nonclustered indexes unless the keyword ALL is specified.
If ALL is specified and the underlying table is a heap, the rebuild operation has no effect on the table. Any nonclustered indexes associated with the table are rebuilt.
The rebuild operation can be minimally logged if the database recovery model is set to either bulk-logged or simple.
使用Rebuild 重新创建index时,如果没有指定Index Option,Rebuild使用默认的索引选项来重建index。
If index options are not specified, the existing index option values stored in sys.indexes are applied. For any index option whose value is not stored in sys.indexes, the default indicated in the argument definition of the option applies.
在sys.indexes 视图中,共存储5个index option,分别是 ignore_dup_key,fill_factor,is_padded,allow_row_locks,allow_page_locks,其他5个Index Option的Default value都是“否定的”,分别是
SORT_IN_TEMPDB : Default OFF STATISTICS_NORECOMPUTE : Default OFF DROP_EXISTING: Default OFF ONLINE: Default OFF DATA_COMPRESSION : Default NONE MAXDOP: 0
查看 sys.indexes 存储的Index option
select i.object_id,i.name as IndexName,i.index_id,i.type,i.type_desc, i.data_space_id,i.is_disabled, --Unique Property i.is_unique, --Constraint i.is_primary_key, i.is_unique_constraint, --Filter Index i.has_filter, i.filter_definition, --Index Options i.ignore_dup_key, i.fill_factor, i.is_padded, i.allow_row_locks, i.allow_page_locks from sys.indexes i
3,重组索引
REORGANIZE
Specifies the index leaf level will be reorganized. ALTER INDEX REORGANIZE statement is always performed online. This means long-term blocking table locks are not held and queries or updates to the underlying table can continue during the ALTER INDEX REORGANIZE transaction. REORGANIZE cannot be specified for a disabled index or an index with ALLOW_PAGE_LOCKS set to OFF.
Appendix
ALTER INDEX cannot be used to repartition an index or move it to a different filegroup. This statement cannot be used to modify the index definition, such as adding or deleting columns or changing the column order. Use CREATE INDEX with the DROP_EXISTING clause to perform these operations.
When an option is not explicitly specified, the current setting is applied. For example, if a FILLFACTOR setting is not specified in the REBUILD clause, the fill factor value stored in the system catalog will be used during the rebuild process. To view the current index option settings, use sys.indexes.
On multiprocessor computers, just like other queries do, ALTER INDEX REBUILD automatically uses more processors to perform the scan and sort operations that are associated with modifying the index. When you run ALTER INDEX REORGANIZE, with or without LOB_COMPACTION, the max degree of parallelism value is a single threaded operation. For more information, see Configure Parallel Index Operations.
An index cannot be reorganized or rebuilt if the filegroup in which it is located is offline or set to read-only. When the keyword ALL is specified and one or more indexes are in an offline or read-only filegroup, the statement fails.
Rebuilding Indexes
Rebuilding an index drops and re-creates the index. This removes fragmentation, reclaims disk space by compacting the pages based on the specified or existing fill factor setting, and reorders the index rows in contiguous pages. When ALL is specified, all indexes on the table are dropped and rebuilt in a single transaction. FOREIGN KEY constraints do not have to be dropped in advance. When indexes with 128 extents or more are rebuilt, the Database Engine defers the actual page deallocations, and their associated locks, until after the transaction commits.
Rebuilding or reorganizing small indexes often does not reduce fragmentation. The pages of small indexes are stored on mixed extents. Mixed extents are shared by up to eight objects, so the fragmentation in a small index might not be reduced after reorganizing or rebuilding it.
In SQL Server 2012, statistics are not created by scanning all the rows in the table when a partitioned index is created or rebuilt. Instead, the query optimizer uses the default sampling algorithm to generate statistics. To obtain statistics on partitioned indexes by scanning all the rows in the table, use CREATE STATISTICS or UPDATE STATISTICS with the FULLSCAN clause.
In earlier versions of SQL Server, you could sometimes rebuild a nonclustered index to correct inconsistencies caused by hardware failures. In SQL Server 2008 and later, you may still be able to repair such inconsistencies between the index and the clustered index by rebuilding a nonclustered index offline. However, you cannot repair nonclustered index inconsistencies by rebuilding the index online, because the online rebuild mechanism will use the existing nonclustered index as the basis for the rebuild and thus persist the inconsistency. Rebuilding the index offline, by contrast, will force a scan of the clustered index (or heap) and so remove the inconsistency. As with earlier versions, we recommend recovering from inconsistencies by restoring the affected data from a backup; however, you may be able to repair the index inconsistencies by rebuilding the nonclustered index offline. For more information, see DBCC CHECKDB (Transact-SQL).
Reorganizing Indexes
Reorganizing an index uses minimal system resources. It defragments the leaf level of clustered and nonclustered indexes on tables and views by physically reordering the leaf-level pages to match the logical, left to right, order of the leaf nodes. Reorganizing also compacts the index pages. Compaction is based on the existing fill factor value. To view the fill factor setting, use sys.indexes.
When ALL is specified, relational indexes, both clustered and nonclustered, and XML indexes on the table are reorganized. Some restrictions apply when specifying ALL, see the definition for ALL in the Arguments section.
参考文档:
Reorganize and Rebuild Indexes